Distance based similarity measures of fuzzy sets

Johanyák, Z. C., Kovács S.: Dstance based smlarty measures of fuzzy sets, SAMI 2005, 3 rd Slovakan-Hungaran Jont Symposum on Appled Machne Intellgence, Herl'any, Slovaka, January 2-22 2005, ISBN 963 75 35 3, pp. 265-276. http://johanyak.hu Dstance based smlarty measures of fuzzy sets Zsolt Csaba Johanyák Department of Informaton Technology, Kecskemét College, GAMF Faculty Kecskemét, H-600 Pf. 9, Hungary johanyak.csaba@gamf.kefo.hu Szlveszter Kovács Department of Informaton Technology, Unversty of Mskolc, Mskolc-Egyetemváros, Mskolc, H-355, Hungary szkovacs@t.un-mskolc.hu Abstract: In case of fuzzy reasonng n sparse fuzzy rule bases, the queston of selectng the sutable fuzzy smlarty measure s essental. The rule antecedents of the sparse fuzzy rule bases are not fully coverng the nput unverse therefore fuzzy reasonng methods appled for sparse fuzzy rule bases requres smlarty measures able to dstngush the smlarty of non-overlappng fuzzy sets too. The goal of ths paper s enumeratng some of these dstance based smlarty measures and brefly ntroducng them. Keywords: smlarty measure, dstance of fuzzy sets, vague envronment Dstance based smlarty measure The most obvous way of calculatng smlarty of fuzzy sets s based on ther dstance. There are more approaches on how the relaton between the two notons n form of a functon can be expressed. Two of them are presented below. The frst functon s the followng [7]: ( A,B) = + DM A,B SM, () ( ) where SM s the smlarty measure, DM s the dstance measure of two fuzzy sets, and A respectve B are the examned fuzzy sets.

Another way of dstance based smlarty assessment s proposed by Wllams and Steele n []. The suggested formula (2) contans an exponental expresson. - DM( A,B) ( ) = e SM A,B where s a steepness measure. The value =7 was found sutable for the practce n case of a one dmensonal unverse of dscourse. (2),20,00 Smlarty 0,80 0,60 0,0 SM SM2 0,20 0,00 0,00 0,2 0,2 0,36 0,8 0,60 0,72 0,8 0,96 Dstance Fg.. The functons () and (2) marked wth SM and SM2 are presented n Fg.. usng normalzed dstances. SM has a unform senstvty opposng to SM2 whch has a far hgher senstvty and capablty for dstncton n the frst quarter of the nterval. In case of a mult-dmensonal unverse of dscourse the approxmaton should be started wth a unversal dstance measure. Formng t needs the normalzaton of all lngustc varables for e.g. the nterval [0,]. It can be done by the help of pschtz functons [2]. The unversal dstance measure s determned as a weghted mean of the dstances measured along each dmenson (3). U n ( A,B) = w DM ( A, B) = DM (3) where n s the number of the nput lngustc varables, w s the weghtng for the th lngustc varable and DM s the dstance measured along the th dmenson. = 7 n = w () In a mult-dmensonal case the value of n (2) s determned by the formula () []. Instead of calculatng smlartes from dstances, by a small re-explanng of the meanngs of the fuzzy rules, we can use the dstances of fuzzy sets drectly for approxmate fuzzy reasonng.

Usng dstance based approxmate fuzzy reasonng has an mportant precondton. The dstance of fuzzy sets can be defned only on unverses where t s possble to defne full orderng and metrcs on every component of the unverse of dscourse of the fuzzy sets (any other case the noton of dstance s meanngless). A dstance functon DM: X x X can be consdered as metrcs, f the condtons specfed below are fulflled []: - DM(A,B) 0 A,B X - DM(A,B)=0 A=B A,B X - DM(A,B)=DM(B,A) A,B X - DM(A,B)+DM(B,C) DM(A,C) A,B,C X The Cty Block (5) and the Eucldean (6) are often used as metrcs for dstance measure n case of crsp values. n = DM = A B, (5) n = ( ) 2 DM = A B, (6) where n s the number of dmensons and s the seral number of the actual dmenson. 2 Non -cut based smlarty measures There are many useful dstance defntons of fuzzy sets n the lterature. The smplest one s the Dsconsstency Measure (S D ) of the fuzzy sets A and B (7) S D ( x) = supµ (7) x X A B A B s the mn t-norm, µ A B (x)=mn{µ A (x), µ B (x) } x X. It s where bascally the same measure as used n the mn-max composton. The dsconsstency measure s one crsp value n range of [0,]. In the followngs, some dstance measures, whch are used for expressng the smlarty of trapezodal shaped fuzzy sets (or fuzzy sets have membershp functons can be traced back to a trapezod form) wll be presented. In case of trapezodal shaped fuzzy sets, the fuzzy set can be charactersed by a vector of four values, by the upper and lower endponts of the core and support e.g. X=[x,x 2,x 3,x ].

Ths case the smlarty between sets A and B can be descrbed by the formula (8) proposed by Chen [5]. = a b SM(A,B) = (8) If the unverse of the fuzzy sets are normalzed, then SM(A,B) [0,]. The advantage of (8) s ts smplcty and low computatonal complexty. However, ts drawback s that t can easly lead to the same grade of smlarty n case of dfferent shapes, too. For nstance f the trapezod fuzzy set A=[0.2,0.,0.6,0.8] s compared to the trapezod term B=[0.,0.6,0.8,.0] and to the trangle shaped set C=[0.,0.7,0.7,.0] and to the D=[0.7,0.7,0.7,0.7] crsp value, the smlarty measure s.6 n each case. Chen and Chen proposed a method n [6], whch can be used n case of generalzed trapezod shaped fuzzy sets, too. Ths smlarty measure (9) s based on the calculaton of the Center Of Gravty. a b = = (9) * * max( ya, yb ) ( ) ( ) * * * * C A,B mn( ya, yb ) x x SM(A, B) A B where C(A,B) s defned as follows: x and * A C( A,B) = 0 a a a a + b + b b b > 0 = 0 y are the coordnates of the COG of the set A, respectve * A x and determne the COG of the set B. The dsadvantage of ths method s that t can not handle cases when the examned sets have the same COG, but ther shape s dfferent. The ncreased computatonal complexty can be consdered as an addtonal drawback. * B (0) * y B 3 -cut based smlarty measures 3. Smple dstance measures Most of the dstance defntons are based on the -cuts of the two fuzzy sets, for example:

Hausdorff Measure ( ): Hausdorff Measure (*): where HM ( A, B) sup HM( A, B ) 0 = () HM (, B) HM( A B ) *, A = (2) ( U, V ) = max sup nf d( u, v),supnf d( u, v) HM (3) v V and d ( u,v) s the Eucldean dstance. Kaufmann and Gupta Measure ( ): Kaufmann and Gupta Measure (*): where u U u U ( A, B) = sup ( A, B ) 0 v V () ( A, B) = ( A B ) (5) (, B ) *, ( a b + a b ) 2 2 A = (6) 2 ( β2 β) and [a, a 2 ], [b, b 2 ] are the supports of A, B, respectvely [β, β 2 ] s the support of both A and B, [0,]. Both the Hausdorff Measure and Kaufmann and Gupta Measure are a crsp value n range of [0, ]. 3.2 Kóczy s dstance measure The man problem of the dstance defntons presented above s, that the nformaton of the shape of the membershp functon of the fuzzy sets s mostly lost. It s mpossble to reconstruct from a gven fuzzy set A and from a gven Hausdorff or Kaufmann and Gupta dstance measure of two fuzzy sets A and B, the fuzzy set B. Ths type of reconstructon, at least n the one dmensonal case, has a great mportance n rule nterpolaton, because wthout t, from the dstances of the rule consequents and the fuzzy concluson we are lookng for, t s mpossble to reconstruct the shape of the fuzzy concluson.

Solvng these dffcultes a useful defnton s ntroduced by Kóczy [7]. Ths dstance s based on the -cuts of the two fuzzy sets too, but the dstance s not aggregated to one crsp value, so from ths knd of dstance and from one of the fuzzy sets the other set can be reconstructed. The dstance of two fuzzy sets s expressed by means of a fuzzy set whch s defned over the nterval [0,]. In the course of calculatons the Eucldean dstances between the end ponts of the -cuts are consdered. These are called lower ( d ) and upper ( (8) (Fg. 2.). d d d U U ) dstances and are calculated by formulas (7) and ( A, B) nf { B } nf { A } ( A, B) sup{ B } sup{ A } = (7) = (8) If the unverse of dscourse s mult-dmensonal, the dstances between nf{a }, nf{b } and sup{a }, sup{b } can be defned n the Mnkowsk sense: d d U w ( ) / w = k w ( d ) / w U A, B k ( A B) = d ( A, B ), ( A B) = ( ), = (9) (20) Fg. 2. Normalsed fuzzy dstance between the fuzzy sets A and B An mportant restrcton for the exstence of the Kóczy Dstance s that all the comparable fuzzy sets should be convex and normal, otherwse some -cuts are not connected or do not exsts at all, whch makes the dstance correspondng to these -cuts meanngless. The only dsadvantage of usng the Kóczy Dstance for nterpolatve fuzzy reasonng s that t s lttle bt dffcult to handle. Vague dstance of ponts n a vague envronment In the case of rule nterpolaton t would be useful such knd of dstance defnton, whch s easy to handle, for example the dstance of two fuzzy sets could be charactersed by one crsp number, and gve the chance of the reconstructon of

the membershp functon of a fuzzy set from another set and from ther dstance, at least n the one dmensonal case. These seem to be two contradctory condtons, but they can be satsfed, f we can fnd a way for handlng the dstance of the fuzzy sets and a knd of shape descrpton separately.. Connecton between smlarty of fuzzy sets and vague dstance of ponts n a vague envronment The concept of vague envronment s based on the smlarty or ndstngushablty of the elements. The x and x 2 values n the vague envronment are ε-dstngushable f ther dstance (δ(x,x 2 )) s greater than ε (2). The dstances n vague envronment are weghted dstances. The weghtng factor or functon s called scalng functon (s(x)). = x2 x ( x ) s( x) dx ε x (2) δ s, 2 > For fndng connectons between fuzzy sets and a vague envronment we can ntroduce the membershp functon µ A (x) as a level of smlarty of a to x. The - cuts of the fuzzy set descrbed by membershp functon µ A (x) (23) form the set whch contans the elements that are ( )-ndstngushable from a (Fg. 3.) (22): A δ (a, b) s (22) b ( x) = mn{ δ (a,b),} = mn ( ) s s x dx, a µ (23) Fg. 3. The vague dstance of ponts a and b (δ(a,b)) s bascally the Dsconsstency Measure (2) of the fuzzy sets A and B (where B s a sngleton): S ( x) δ (a, b) sup µ A B = s D = x X f δ(a,b) [0,] (2) Thus dsconsstency measures between member fuzzy sets of a fuzzy partton and a sngleton can be calculated, as vague dstances of ponts n the vague envronment of the fuzzy partton. The man dfference between the

dsconsstency measure and the vague dstance s, that the vague dstance s a crsp value n range of [0, ], whle the dsconsstency measure s lmted to [0,]. That s why t s useful n nterpolatve reasonng wth nsuffcent evdence. So f t s possble to descrbe all the fuzzy parttons of the antecedent and consequent unverses of the fuzzy rule-base, and the observaton s a sngleton, one can calculate the dsconsstency measures of the antecedent fuzzy sets of the rule-base and the observaton, and the dsconsstency measures of the consequent fuzzy sets and the consequence (we are lookng for) as vague dstances of ponts..2 Generatng vague envronments from fuzzy parttons The vague envronment s descrbed by ts scalng functon. For generatng a vague envronment we have to fnd an approprate scalng functon, whch descrbes the shapes of all the terms n the fuzzy partton [8]. The method proposed by Klawonn [9], for choosng the scalng functon s(x) (25), gves an exact descrpton of the fuzzy terms after ther reconstructon from the scalng functon. s x) = µ '( x) = dµ dx ( (25) µ Fg.. A fuzzy set and ts scalng functon Z PS PM P 0.5.5 2.5 X Fg. 5. Scalng functon descrbng all the fuzzy sets A scalng functon always can be found, f there s only one fuzzy set n the fuzzy partton (Fg..). Usually the fuzzy partton contans more than one fuzzy set, so ths method requres some restrctons (26) [9]. µ = f mn{µ (x),µ j }>0 ' ( x) ' ( x),j I (26) µ j

Generally the above condton s not fulflled, so the use of an approxmate scalng functon s proposed as a unversal functon descrbng all the fuzzy sets of a fuzzy partton..3 The approxmate scalng functon The approxmate scalng functon s an approxmaton of the orgnal scalng functons descrbng the fuzzy sets separately. The smplest way of generatng ths functon s the lnear nterpolaton. Supposng that the fuzzy sets are trangles, each of them can be charactersed by three values, two constant scalng functons, whch are the scalng factors of the left and the rght slope of the trangle and the value of the core pont (Fg. 6.). Fg. 6. Thus the approxmaton (s(x)) s a pecewse lnear functon (27), whch nterpolates the rght sde scalng factor of the left neghbourng term and the left sde scalng factor of the rght neghbourng term (Fg. 7.). where x s x s x s + + ( ) = + ( x x ) + s x [ x, x ), [, n ] x (27) s the core of the th term of the approxmated fuzzy partton s, s are the left and rght sde scalng factors of the th term n s the number of the terms n the approxmated fuzzy partton Fg. 7. The drawback of the approxmaton presented above s that t can not handle the bg dfferences between neghbourng scalng factors or crsp fuzzy sets correctly.

In case of bg dfferences, the bgger scalng factor domnates the smaller one (Fg. 8., 9.). If one of the neghbourng fuzzy set s crsp (ts scalng factor s nfnte), the slope of the lnearly nterpolated scalng functon s nfnte too, so both the fuzzy sets descrbed by ths scalng functon wll be crsp. Fg. 8. s A << s B Fg. 9. nearly nterpolated scalng functon of fuzzy sets shown n Fg. 8., and these sets as the approxmate scalng functon descrbes them (A,B ) As a soluton of ths problem the adopton of a non-lnear nterpolatve functon (28) s suggested [8]. k w ( x + - x +) k w ( x x +) k w ( x + - x +) k w ( x x +) w ( ) + s + s s k w + x + - x + s( x ) = (28) w + < ( ) w s s s k + x + - x + + w = s s (29) + where x [x,x + ), [,n-], s(x) s the approxmate scalng functon, x s the core of the th term of the approxmated fuzzy partton, s, s are the left and rght sde scalng factors of the th trangle shaped term of k n the approxmated fuzzy partton, constant factor of senstvty for neghbourng scalng factor dfferences, s the number of the terms n the approxmated fuzzy partton.

The above functon has same useful propertes. If the neghbourng scalng factors are equals, s(x) s lnear. If one of the neghbourng scalng factors (e.g. S ) s and the other one s fnte, n case of x [ x, ) s( ) x + 0 x x x x x and smlarly f s and + s s fnte and x [ x, ) s( x ) x + 0 x x x x + + Fg. 0. and. show some examples for the applcaton of the proposed nonlnear functon. Fg. 0. Approxmate scalng functon generated by the non-lnear functon wth k=, and the orgnal fuzzy partton (A,B) as ths scalng functon descrbes t (A,B ) Fg.. x =0, x 2 =, s, k= s2 5 Conclusons Dstance based smlarty measures of fuzzy sets have a hgh mportance n reasonng methods handlng sparse fuzzy rule bases. The rule antecedents of the

sparse fuzzy rule bases are not fully coverng the nput unverse. Therefore the appled smlarty measure has to be able to dstngush the smlarty of nonoverlappng fuzzy sets, too. The dstance based smlarty measures are such a measures. To gve an overvew of the dstance based smlarty measures of fuzzy sets, some of the man exstng concepts are brefly ntroduced n ths paper. 6 eferences [] J. Wllams, N. Steele: Dfference, dstance and smlarty as a bass for fuzzy decson support based on prototypcal decson classes, Fuzzy Sets and Systems 3 (2002) 35-6. [2] P. Damond, P. Kloeden: Metrc Spaces of Fuzzy Sets: Theory and Applcaton, World Scentfc, Sngapore, 99. [3]. T. Kóczy, K. Hrota: Orderng, dstance and closeness of fuzzy sets, Fuzzy Sets And Systems, 60:28-293, 993. [] W. A. Shuterland: Introducton to metrc and topologcal spaces, Oxford Unversty Press, Oxford, 977. [5] S. M. Chen: New methods for subjectve mental workload assesment and fuzzy rsk analyss, Cybernet, Systems: Internat, 996, J. 27, 9-72. [6] S. J. Chen, S. M. Chen: Fuzzy rsk analyss based on smlarty measures of generalzed fuzzy numbers, IEEE Trans. Fuzzy Systems,5-56, 2003. [7] Kóczy T. ászló, Tkk Domonkos: Fuzzy rendszerek, Typotex, 2000. [8] Sz. Kovács,. T. Kóczy.: The use of the concept of vague envronment n approxmate fuzzy reasonng, Tatra Mt. Math. Publ. 2, 77 82, 997. [9] F. Klawonn: Fuzzy sets and vague envronments, Fuzzy Sets and Systems 66:207-22, 99.