Case Mining from Large Databases

Size: px

Start display at page:

Download "Case Mining from Large Databases"

Sydney Warner
5 years ago
Views:

1 Case Mnng from Large Databases Qang Yang and Hong Cheng Department of Computer Scence, Hong Kong Unversty of Scence and Technology, Clearwater Bay, Kowloon Hong Kong {qyang, Abstract. Ths paper presents an approach of case mnng to automatcally dscover case bases from large datasets n order to mprove both the speed and the qualty of case based reasonng. Case mnng constructs a case base from a large raw dataset wth an objectve to mprove the case-base reasonng systems effcency and qualty. Our approach starts from a raw database of objects wth class attrbutes together wth a hstorcal database of past acton sequences on these objects. The object databases can be customer records and the hstorcal acton logs can be the techncal advses gven to the customers to solve ther problems. Our goal s to dscover effectve and hghly representatve problem descrptons assocated wth soluton plans that accomplsh ther tasks. To mantan effcency of computaton, data mnng methods are employed n the process of composng the case base. We motvate the applcaton of the case mnng model usng a fnancal applcaton example, and demonstrate the effectveness of the model usng both real and smulated datasets. 1 Introducton To motvate the case mnng problem, consder a fnancal applcaton example. Suppose that a certan ABC Bank s nterested n ntatng a marketng campagn to encourage ts customers to sgn up for a new loan program. Suppose that we are gven two datasets on (1) customer nformaton and (2) past marketng acton logs on customers n (1) and the fnal outcomes n loan-sgnup results. Tables 1 and 2 show an example of a customer database table together wth some examples of past marketng actons on customers. Suppose that we are nterested n buldng a campagn plan for a new customer Steve. Based on the past campagn actons for people lke Steve, there are many canddate actons that one can suggeston. For example, we can use two of the past cases for John and Mary. If we follow the plan for John, we can frst gve hm a personal phone call and then send hm a gft. Alternatvely, we can decrease the mortgage rate for Steve, followed by offerng a new Credt card. The latter plan follows that of Mary s success. In ether stuaton, the ABC bank mght have a relatvely hgh chance of convertng Steve from a reluctant customer to a wllng one. The above problem can be formulated as a case-based reasonng problem, where the key ssue s to fnd a good role model for customers such as Steve, and apply case 1

2 retreval and adaptaton technques [9,10] to fnd low-cost plans for these customers on a case-by-case bass. Our approach s to frst dentfy typcal problem descrptons as potental role models from the customer database (Table 1). Then for each typcal problem, a cost-effectve plan s found to be assocated wth each negatve-class role model (Table 2); they are stored n a case-base as a plan. Then, for each new problem, we fnd a smlar problem descrpton and adapt ts plan n the case base. Table 1. An example of customer database. The last attrbute s the class attrbute. Customer Salary Cars Mortgage Loan John 80K 3 None Y Mary 40K 1 300K N Steve 40K 1 None N Table 2. Acton-log database example contanng marketng actons and outcomes Advsee Acton1 State1 Acton2 State2 John Phone Featurevalues Send Feature- call for gft values for John John Mary Lower Featurevalues Credt Featurevalues mortgage for Mary card Mary for In ths paper, we focus on how to fnd the case base from a gven database effcently. A challengng ssue facng the constructon of the case base s the large sze of the database and large amount of data. To solve the scale-up problem, we explot data mnng technques, ncludng clusterng and assocaton rule mnng algorthms [2, 5, 8, 14]. Cluster center objects, also known as the medods, are obtaned from the customer database to produce the bass for cases from customers belongng to both the negatve class (problem descrptons for new problems) and the postve class (where Class=Yes n the fnal results table). Assocaton-rule mnng s appled to the actonlog database to produce typcal plans to be assocated wth the problem-objectve pars. The resultant case base s then used for problem solvng for new customers. Because we extract hgh-qualty cases from large databases usng data mnng technques, we call ths method case mnng. Emprcal results are obtaned to demonstrate that the case-mnng algorthms are scalable and acheve a balance between qualty and effcency. 2

3 2 Related Work Case mnng s closely related to case-base mantenance, whereby the case base structure, ndex and knowledge representaton are updated to mprove problemsolvng performance. The most recent specal ssue of Computatonal Intellgence Journal on case-based mantenance [6] hghlghted ths nterest and progress. Smyth and Keane [15], who defned case-base competences based on how cases n an exstng case base are related to each other, addressed the mantenance ssue by proposng a deleton-based algorthm. Ther mantenance polcy s amed at removng as much redundancy form a case base as possble by reducng the overlap between cases or groups of cases. Subsequently, Smyth and McKenna [22] and Yang and Zhu [16] explored how other polces can be appled to case-based mantenance. Leake, Kngly and Wlson [10, 11, 17] analyzed the case-base mantenance problems and cast them nto a general framework of revsng the content and structure of an exstng case base. Aamodt et. al [1] presented learnng algorthms for smlarty functons used n case retreval. In our formulaton, case mnng s targeted for large, un-ndexed raw databases rather than an exstng case base. The ndexng problems have been an ssue of ntensve study n the past, ncludng work n [4, 19, 20]. The ndexng lterature assumes that there s already a case base n exstence, whereas n ths paper we explore how to fnd the case base from hstorcal databases. Patterson et. Al [21] dscussed how to apply the K-means algorthm for case-base mantenance. The work s also related to case-based plannng [9, 18], whch derves a new plan based on plans stored n case bases. However, case-base plannng has manly focused on case adaptaton; nstead, we focus on how to explot the dstrbuton of massve data to obtan statstcally well-balanced case bases. 3 Mnng Problem Descrptons Our algorthm s dvded nto two phases. In phase one, we take the customer database and fnd a set of typcal cases by applyng clusterng algorthm to t. The result s a set of postve-class and negatve-class objects. In order to fnd the typcal cases, we apply two dfferent algorthms ncludng a clusterng algorthm and an SVM-based algorthm. In phase two, we fnd a plan for each negatve-class role model to form a potental case. Care s taken n the search process to fnd only hgh-utlty plans. We wsh to fnd K typcal cases to populate the case base, where the parameter K s user defned; later we dscuss another method (SVM) whch does not requre K to be specfed. These cases should be hghly representatve of the dstrbuton of the customer nformaton n the orgnal customer database. They wll serve as the ntal states for our planner to work on n the next phase. We consder all negatve nstances n the database as the tranng data for mnng the problem descrptons of the case base. Our frst method apples the clusterng algorthms to the negatve nstances, as descrbed n more detal n Table 3. The tranng database conssts of the negatve nstances of the orgnal database. 3

4 Table 3. Algorthm Centrods-CBMne (database DB, nt K) Steps Begn 1 casebase= emptyset; 2 DB = RemoveIrrelevantAttrbutes(DB); 3 Separate the DB nto DB+ and DB-; 4 Clusters+ = ApplyKMeans(DB-, K); 5 for each cluster n Clusters+, do 6 C = fndcentrod(cluster); 7 Insert(C, casebase); 8 end for; 9 Return casebase; End In the algorthm Centrods-CBMne n Table 3, the nput database DB s a raw database. There are two classes n ths database, where the negatve class corresponds to populaton of ntal states for a marketng plan. Step 2 of the algorthm performs feature extracton by applyng a feature flter to the database to remove all attrbutes that are consdered low n nformaton content. For example, f two attrbutes A1 and A2 n the database are hghly correlated, then one of them can be removed. Smlarly, f an attrbute A has very low classfcaton power for the data as can be computed usng nformaton theory, then t can be removed as well. In our mplementaton, we apply a C4.5 decson-tree learnng algorthm to the database DB. After a decson tree s constructed, the attrbutes that are not contaned n the tree are removed from the database; these are the rrelevant attrbutes. Step 3 of the algorthm separates the tranng database nto two parttons, a postve-class subset and a negatve-class subset. Step 4 of the algorthm performs the K- means clusterng on the negatve-class sub-database [5]. K-means fnds K locally optmal center objects by repeatedly applyng the EM algorthm on a set of data. Other good clusterng algorthms can also be used here n place of K-means. Step 6 of the algorthm fnds centrods of the K clusters found n the prevous step. These centrods are the bases of the case base constructed thus far, and are returned to the user. Fnally, Step 9 returns the case base as the output. The centrod-based case-mnng method extracts cases from the negatve-class cluster center objects and takes nto account only the negatve class dstrbuton. By consderng the dstrbuton of both the postve and negatve class clusters, we can sometmes do better. In general, t may be better to fnd the boundary between the two dstrbutons, and to select cases from the dense areas on that boundary. Ths s the ntuton behnd the SVM-based case mnng algorthm. The key ssue then s to dentfy the cases on the boundary between the postve and negatve cases, and select those cases as the fnal ones for the problem descrptons. Ths has two advantages. Frst, by usng the boundary objects as cases, from the past plans, we may determne that some objects are n fact not convertble to the postve class; these are the objects that are not smlar to any of the problem descrptons n the case base. The case-based reasonng algorthm cannot solve these cases. Second, 4

5 because the cases are the support vectors themselves, there s no need to specfy the nput parameter K as n the Centrods-CBMne algorthm; the parameter K s used to determne the number of clusters to be generated n K-means. Ths corresponds to parameter-less case mnng, whch s superor because n realty, the number of cases to be generated s typcally unknown ahead of tme. We observe that the cases along the boundary hyper-plane correspond to the support vectors found by an SVM classfer [7, 13]. These cases are the nstances that are closest to the maxmum margn hyper-plane n a hyperspace after an SVM system has dscovered the classfer [7, 13]. By explotng the above dea, we have a second problem-descrpton mnng algorthm called SVM-CBMne(). In the frst step, we perform SVM learnng on the database to locate the support vectors. Then we fnd the support vectors and nsert them nto the case base. Ths algorthm s llustrated n Table 4. Table 4. Algorthm SVM-CBMne (database DB, nt K) Steps Begn 1 casebase = empty set; 2 Vectors = SVM(DB); 3 for each negatve support vector C- n Vectors do 4 Insert(C-, casebase); 5 end for 6 Return casebase; End 4. Mnng the plans Havng obtaned the problem descrptons as seeds for the cases n the case base, we now consder the problem of fndng soluton plans to be assocated wth these descrptons wth an am to convert all negatve-class cases to postve. Our algorthm s a state-space search algorthm that search an AND-OR tree. AND-OR trees are used to deal wth the problem of uncertanty n plannng. The plans produced are probablstc n nature; they are amed at succeedng n convertng the customers to postve class wth low cost and hgh success probablty. Gven the customer table and the acton log database, our frst task s to formulate the problem as a plannng problem. In partcular, we wsh to fnd a method to map the customer records n the customer table nto states usng a statstcal classfer. Ths task n tself s not trval because t maps a large attrbute space nto a more concse space. When there are mssng values n the database, technques of data cleanng [14] can be appled to fll n these values. Next, the state-acton sequences n the acton log database wll be used for obtanng acton defntons n a state space, such that each acton s represented as a probablstc mappng from a state to a set of states. To make the representaton more realstc, we wll also consder the cost of executng each acton. 5

6 To summarze, from the two tables we can obtan the followng nformaton: f s ( ) = s j maps a customer record r to a state s j.ths functon s known as the customer-state mappng functon; p c ( s ) s a probablty functon that returns the probablty that state s s n a desrable class. We call ths classfer the state-classfcaton functon; p( sk s, a j ) returns the transton probablty that, after executng an acton a j n state s, one ends up n state s k. Our plannng algorthm dvdes the resultng plan n stages, where each stage conssts of one acton and a set of possble outcome states resultng from the acton. In each stage the states can be dfferent possble states as a result of the prevous acton, and the acton n the stage must be a same, sngle acton. In our formulaton, we defne a utlty functon of a plan to be u(s) for a state s, where the functon s defned as u( s) = max(( P( s' s, a)* u( s')) cost( a)) a s' S where, at a leaf node t, the functon u s defned as the utlty of the node whch s P(+ t), and cost(a) s the cost of the acton a. A major dffculty n solvng the plannng problem stems from the fact that there are potentally many states and many connectons between states. Ths potentally large space can be reduced sgnfcantly by observng that the states and ther connectons are not all equal; some states and acton sequences n ths state-space are more sgnfcant than others because they are more frequently traveled by traces n the actonlog table. Ths observaton allows us to use an approach n whch we explot plannng by abstracton. In partcular, sgnfcant state-acton sequences n the state space can be dscovered through a frequent strng-mnng algorthm [2]. We start by defnng a mnmumsupport threshold for fndng the frequent state-acton sequences. Support represents the number of occurrences of a state-acton sequence from the acton log database. More formally, let count(seq) be the number of tmes sequence seq appears n the database for all customers. Then the support for sequence seq s defned as the frequency of the sequence. Then, a strng-mnng algorthm based on movng wndows wll mne the acton log database to produce state-acton subsequences whose support s no less than a user-defned mnmum-support value. For connecton purpose, we only retan substrngs both begnnng and endng wth states, n the form of < s, a, s+ 1, a+ 1,..., sn >. Once the frequent sequences are found, we pece together the segments of paths correspondng to the sequences to buld an abstract AND-OR graph n whch we wll search for plans. If < s 0, a1, s2 > and < s 2, a3, s4 > are two segments found by the strng-mnng algorthm, then < s 0, a1, s2, a3, s4 > s a new path n the AND-OR graph. Snce each component of the AND-OR graph s guaranteed to be frequent, the AND-OR graph s a hghly concse and representatve state space. Based on the above heurstc estmaton methods, we can perform a best-frst search n the space of plans untl the termnaton condton s met. The termnaton condtons are determned by the probablty or the length constrants n the problem doman. For the ntal state s, f the expected value E(+ s) exceeds a predefned threshold Success_Threshold,.e. the probablty constrant, we consder the plan to be good enough and the search process termnates. Otherwse, one more acton s attached to ths plan and the new plans are nserted nto the prorty queue. E ( + s ) s the expected state-classfcaton probablty estmatng how effectve a plan s at transfer- 6

7 rng customers from state s. Its calculaton can be defned n the followng recursve way: E + s ) = p( s s, a )* E( + s ) ; f s s a non-termnal state; or ( k j k E + s ) = P( + s ) f s s a termnal state. ( We also defne a parameter Max_Step that defnes the maxmum length of a plan,.e. the length constrant. We wll dscard a canddate plan, whch s longer than the Max_Step but ts E + s ) value s less than the Success_Threshold. We now consder ( optmalty of the algorthm. 5 Effcency Experments We run tests on both artfcal data sets and realstc data sets to obtan the performance of the case mnng algorthm. In ths paper, we are prmarly nterested n the speed n whch to obtan the case bases, especally when the database reaches a large sze. Our other ongong work s amed at showng the qualty of the mned case bases. Thus, our nterest n these tests s how the CPU tme changes wth dfferent expermental parameters. In the sequel, we separately test the problem descrpton and the plannng algorthms. The experments are performed on an Intel PC wth one Ggahertz CPU. 5.1 Testng Problem Descrpton Mnng Our frst test s amed at establshng the effcency of the problem-descrpton mnng algorthm, whch determnes whether the case mnng framework can be scaled up or not. It uses an artfcal dataset generated on a two-dmensonal space (x, y), usng a Gaussan dstrbuton wth dfferent means and co-varance matrx for the postve (+) and the negatve ( ) classes (Fgure 1). Our purpose s to demonstrate the effect of data dstrbuton and model sze on the swtchng-plan qualty and effcency. When the means of the two dstrbutons are separated, we expect the class boundares are easy to dentfy by the SVM-based method. As can be seen, the tme t takes to buld and execute the model ncreases wth K, the number of cases n the case base (Fgure 2). As the two dstrbutons move close to each other such that there are no clear boundares, the SVM-based method selects nearly all the negatve examples as cases n the case base, resultng n a bloated case-base. Thus, ts tme expense s also very hgh (Fgure 2). In ths case, the CenrodCaseMne method s preferred. 7

8 Fgure 1. Dstrbuton of the class 1=postve (+) and class 2=negatve (*) data. CPU Seconds K=10 K=20 K=30 K=40 K=50 K=60 Sze of CaseBase K=70 K=80 K=90 K=100 SVM baselne Far Close Fgure 2. Comparson of CPU tme for dfferent dstrbutons between two dstrbutons that are ether Far apart or Closely mxed. For large-scale tests, we performed an experment usng the IBM QUEST synthetc data generator ( We generated the tranng dataset wth 9 attrbutes, 50% postve class and 50% negatve class. An excerpt of the database s shown n Table 5. In the dataset, the Salary attrbute s unformly dstrbuted from to , the commsson values are set so that f Salary >= 75000, Commsson = 0; otherwse t s unformly dstrbuted from to The Age attrbute s unformly dstrbuted from 20 to 80, the Educaton attrbute s unformly chosen from 0 to 4, and the Car attrbute ncludes the make of the car, and s unformly chosen from 1 to 20. 8

9 The ZpCode attrbute s unformly chosen from 9 avalable zp codes, and the House Value attrbute s unformly dstrbuted from 0.5*k* to 1.5*k*100000, where 0 <= k <= 9 and the House Value depends on the ZpCode. The YearsOwned attrbute s unformly dstrbuted from 1 to 30, and the Loan attrbute s unformly dstrbuted from 0 to Then we compared the runnng tme to tran a 50-case case base for the centrod-based method and CPU tme for the SVM based method. Our results are shown n Table 6. It s clear from the table that wth large data, the centrod-based method s able to scale up much better than the SVM based method. Table 5. An excerpt from the synthetc dataset. Salary Commsson Age Educaton Car Table 6. CPU-tme comparson of Centrod-based Method and SVM-based method. Log 10 (N), CPU Tme (Seconds) N=Database sze Centrod-based SVM , No result n 5 hrs 5 1,938.4 No result n 7 hrs 5.2 Testng Problem Descrpton Mnng In ths second test, we assume that we have obtaned a case base of typcal problem descrptons from the negatve examples n the tranng set. Our task now s to fnd a plan for each negatve descrpton. These plans wll serve as the cases that can be adapted for new problems n the future. In ths experment, we test the effcency of the algorthm for a sngle problem descrpton n a case. We agan used the IBM Synthetc Generator to generate a Customer dataset wth two classes and nne attrbutes. The postve class has 30,000 records representng successful customers and negatve has 70,000 representng unsuccessful ones. Those 70,000 negatve records are treated as startng ponts for Marketng-log data genera- 9

10 ton. We evaluated the qualty of the plans va smulaton. From ths nput, we set out to fnd a case base of marketng plans to convert customers to a successful class. Ths testng process corresponds to the testng phase for a traned model n machne learnng. In ths smulaton, f there s a plan sutable for convertng a customer record, a sequence of actons s carred out on that record. The plan wll then change the customer record probablstcally. At the end, the classfer s used to decde whether the changed record has turned nto a successful one. When Success_Threshold s low, many states are consdered postve, and thus plans can be easly and quckly found for most of the ntal states n the graph. We can observe ths from Fgure 3: plannng tme s low wth low Success_Threshold. As the Success_Threshold ncreases, so does the plannng tme. When Success_Threshold s too hgh, no plan can be found for some ntal states. The Tme s much hgher because the searchng process doesn t termnate untl all the plans expanded longer than Max_Step. The search effcency also depends on other parameters. MnSupport s another mportant factor. When MnSupport s low, more plans n the acton log qualfy to be n the abstract state space, and thus search takes a longer tme. We can also see from Fgure 4 that the CPU tme drop s greater when the MnSupport ncreases from one to 100. However, the plannng tme drop slows down between Mn- Support value of 100 and Ths suggest to us that we should use a support value of around 100, snce at ths value we can acheve the balance of a relatvely large number of plans to be ncluded n the search space and a low plannng tme. Tme(s) Fgure 3. CPU Tme vs. Success_Threshold. MnSupport = Tme(s) Fgure 4. CPU Tme vs. MnSupport. Success_Threshold=

11 6 Conclusons and Future Work In ths paper, we presented a case mnng algorthm for dscoverng a case base from a large database. The central ssue of the problem les n the dscovery of hgh-qualty case bases that balances the effcency for extractng the case base, the cost of the adaptaton plans and the probablty of success. Ths s what we call the case-mnng problem. For the data dstrbuton where the two classes are clearly separated, the SVM-CBMne algorthm, whch s an SVM-based method, should be used. When the data dstrbutons are not separated well by a boundary, the cluster-centrods based method s recommended. Furthermore, we desgned a plannng system for fndng probablstc plans where the probablty of success s taken nto account. The effcency of the case mnng algorthm s valdated aganst large databases. In the future, we wll contnue to explore qualty of case mnng results and the related problem of case adaptaton n ths framework. References [1] A. Aamodt, H. A. Sandtorv, O. M. Wnnem: Combnng Case Based Reasonng and Data Mnng - A way of revealng and reusng RAMS experence. In Lydersen, Hansen, Sandtorv (eds.), Safety and Relablty; Proceedngs of ESREL 98, Trondhem, June 16-19, Balkena, Rotterdam, [2] R. Agrawal and R. Srkant Fast algorthm for mnng assocaton rules. Proceedngs of the Twenteth Internatonal Conference on Very Large Databases. pp [3] C. L. Blake, C.J. Merz (1998). UCI Repostory of machne learnng databases Irvne, CA: Unversty of Calforna, Department of Informaton and Computer Scence. [4] A. Bonzano, P. Cunnngham and B. Smyth Usng Introspectve Learnng to Improve Retreval n CBR: A Case Study n Ar Traffc Contol, n Leake D., Plaza E. (eds.) `Case-Based Reasonng Research and Development, Second Internatonal Conference on Case-Based Reasonng ICCBR 1997, Sprnger Verlag, Berln [5] P. S. Bradley and U. M. Fayyad. Refnng ntal ponts for k-means clusterng. In Proceedngs of the Ffteenth Internatonal Conference on Machne Learnng (ICML '98), pages , San Francsco, CA, Morgan Kaufmann. [6] Computatonal Intellgence Journal, Specal Issue on Case-base Mantenance. Blackwell Publshers, Boston MA UK. Vol. 17, No. 2, May Edtors: D. Leake, B. Smyth, D. Wlson and Q. Yang. [7] G. C. Cowley. MATLAB Support Vector Machne Toolbox. v0.54b Unversty of East Angla, School of Informaton Systems, Norwch, Norfolk, U.K. NR4 7TJ,

12 [8] P. Domngos and M. Rchardson. Mnng the Network Value of Customers. Proceedngs of the Seventh ACM SIGKDD Internatonal Conference on Knowledge Dscovery and Data Mnng. August ACM. N.Y. N.Y. USA [9] K. Hammond. Explanng and Reparng Plans that Fal. Artfcal Intellgence, 45(1-2): , [10] D. Leake. Case-based Reasonng -- Experences, Lessons and Future Drectons. AAAI Press/ The MIT Press, [11] D. Leake, Knley, A., & Wlson, D. (1995). Learnng to mprove case adaptaton by ntrospectve reasonng and CBR. In Proceedngs of Frst Internatonal Conference on Case-Based Reasonng. Sesmbra, Portugal. [12] C. X. Lng and C. L. Data mnng for drect marketng: Problems and solutons. In Proceedngs 4th Internatonal Conference on Knowledge Dscovery n Databases (KDD-98), New York, [13] J.C. Platt, Fast tranng of support vector machnes usng sequental mnmal optmzaton, n Advances n Kernel Methods - Support Vector Learnng, (Eds) B. Scholkopf, C. Burges, and A. J. Smola, MIT Press, Cambrdge, Massachusetts, chapter 12, pp , [14] J. Qunlan C4.5: Programs for Machne Learnng. Morgan Kaufmann Publshers, Inc., San Mateo, CA. [15] B. Smyth and M. T. Keane. Rememberng to forget: A competence--preservng deleton polcy for case--based reasonng systems. In Proceedngs of the 14th Internatonal Jont Conference on Artfcal Intellgence, pp , 1995 [16] Q. Yang and J. Zhu. A Case-addton Polcy for Case-based Reasonng. Computatonal Intellgence Journal, Specal Issue on Case-based Mantenance. Blackwell Publshers, Boston MA UK. Vol. 17, No. 2, May Pages [17] D. C. Wlson and D. B. Leake Mantanng Case-based Reasoners: Dmensons and Drectons. Computatonal Intellgence Journal. Vol. 17, No. 2. May 2001 [18] M. Veloso. (1994). Plannng and learnng by analogcal reasonng. Number 886 n Lecture Notes n Artfcal Intellgence. Sprnger Verlag. [19] D. Wettschereck and D.V. Aha. Weghtng features. In Proceedngs of the Frst Internatonal Conference on Case-Based Reasonng, ICCBR-95, pages , Lsbon, Portugal, Sprnger-Verlag. [20] Zhong Zhang and Qang Yang. Feature Weght Mantenance n Case Bases Usng Introspectve Learnng. Journal of Intellgent Informaton Systems, Kluwer Academc Publshers, 16, Pages , The Netherlands. [21] D. Patterson, N. Rooney, M. Galushka, S. S. Anand: Towards Dynamc Mantenance of Retreval Knowledge n CBR. In Proceedngs of the Ffteenth Internatonal Florda Artfcal Intellgence Research Socety Conference, 2002, Florda, USA. AAAI Press 2002, pages

13 [22] B. Smyth, and E. McKenna, Buldng Compact Competent Case-Bases. In Proceedngs of the thrd Internatonal Conference on Case-based Reasonng, Sprnger-Verlag, Munch, Germany, 1999, pp

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.