Spam Filtering Based on Support Vector Machines with Taguchi Method for Parameter Selection


E-mail Spam Filtering Based on Support Vector Machines with Taguchi Method for Parameter Selection

Wei-Chih Hsu 1, Tsan-Ying Yu 2,3
1 Department of Computer and Communication, National Kaohsiung First University of Science and Technology, Kaohsiung, Taiwan
2 Institute of Engineering Science and Technology, National Kaohsiung First University of Science and Technology, Kaohsiung, Taiwan
3 Department of Electrical Engineering, Kao Yuan University, Lu Chu, Taiwan
wechh@ccms.nkfust.edu.tw, yotnyg@gmail.com
doi: /jcit.vol5.issue8.9
Journal of Convergence Information Technology, Volume 5, Number 8, October 2010

Abstract

The support vector machine (SVM) is a powerful classification technique in data mining and has been successfully applied to many real-world problems. The choice of SVM parameters strongly affects classification performance, yet the parameters are usually chosen by experience or by grid search (GS). In this study, we use the Taguchi method to find near-optimal parameters for an SVM-based e-mail spam filtering model. Six real-world mail data sets are used to demonstrate the effectiveness and feasibility of the method. The results show that the Taguchi method can find an effective model with high classification accuracy.

Keywords: Support Vector Machines (SVM), Taguchi method, Grid search (GS)

1. Introduction

Spamming is the abuse of electronic messaging systems to send unsolicited bulk e-mails, often to promote services or products, which are usually undesired. Spamming is economically viable because advertisers have no operating costs beyond the management of their mailing lists. The sender usually cannot be identified, because spammers use only temporary e-mail addresses and replies never reach the original sender. As a result, the volume of undesired e-mail grows every day, and it becomes hard to find the important messages. Early spam filters usually relied on black and white lists. Although this approach is fast and simple, its drawback is that users have to update the filtering rules and maintain the black list.

Spam filtering based on the textual content of e-mail messages can be regarded as a special case of text categorization, with the categories being spam and normal (non-spam). Content-based filters can be divided into rule-based methods and probabilistic methods. Rule-based methods such as Ripper [1-2], Boosting [3], Decision Tree [4], Rough Sets [5] and so on depend strongly on the existence of key terms; therefore, specific terms can cause the filtering to fail. Methods based on probability and statistics include K-Nearest Neighbor [5] and Support Vector Machine (SVM) [6]. In addition, the prevailing machine learning method for spam filtering is the Bayesian approach [7], which has been used with good results.

SVM, proposed by Vapnik [6] in 1995, has been widely applied in areas such as function approximation, modeling, forecasting and optimization control, and has yielded excellent performance. It is grounded in statistical learning theory, deals with two-class classification, and finds the best hyperplane to partition a sample space. Huang [8] demonstrates that the SVM-based model is very competitive with back-propagation neural networks (BPN), genetic programming (GP) and decision trees in terms of classification accuracy. The selection of the kernel function is a pivotal factor that decides the performance of SVM. The RBF kernel function is the most widely used in SVM and requires few control parameters. There are two parameters: the penalty parameter C and the kernel parameter γ. However, the classification performance of an SVM-based model is sensitive to these parameters, so parameter selection is very important. Optimizing the parameters (C, γ) gives the SVM its best performance.

In spam filtering, the Bayesian algorithm is very widely used in mail systems. If SVM is used with a linear kernel function or default parameters, the Bayesian algorithm can achieve better accuracy than SVM. In order to enhance the accuracy of SVM, it is necessary to develop a search mechanism to tune the hyperparameters. Most previous research chooses the parameters by grid search (GS), by pattern search based on principles from design of experiments (DOE) such as Staelin [9], or by genetic algorithms (GA) [8, 10]. GS is simple and easily implemented, but it is very time-consuming. DOE is like GS but reduces the density of the search grid and can greatly reduce the computational time. Although GA does not require setting an initial search range, it introduces new parameters to control the search process, such as the population size, the number of generations, and the mutation rate.

The Taguchi method [11], a robust design approach, uses many ideas from statistical experimental design for evaluating and implementing improvements in products, processes, and equipment. The fundamental principle is to improve the quality of a product by minimizing the effect of the causes of variation without eliminating the causes. One of the major tools used in the Taguchi method is the orthogonal array (OA), which reduces the number of experiments while still obtaining good experimental results. The parameters (C, γ) of SVM are regarded as control factors in the OA. After the SVM parameters are selected, the experiment is conducted through a multilevel-column OA. We verify the classification results and compare them with GS. As far as we know, this may be the first attempt to introduce the Taguchi method to optimize SVM for spam filtering models.

The remainder of this paper is organized as follows. In Section 2, the SVM and the Taguchi method are described briefly. Section 3 presents the implementation of our approach to classifying spam e-mails. Section 4 gives experimental results and discussion. Finally, the research results are summarized and future work is presented.

2. The introduction of SVM and Taguchi method

The proposed approach is based on SVM and Taguchi method.
In this section, SVM and Taguchi method are introduced briefly.

2.1. The brief description of SVM

The textual and non-textual features representing an e-mail, obtained through pre-processing (Section 3), are the input to the spam filtering algorithm. In our approach, the filtering algorithm is an SVM. SVM is a powerful supervised learning paradigm based on the structural risk minimization principle from statistical learning theory. It is currently among the best-performing classifiers and has a unique ability to handle extremely large feature spaces (such as text), precisely the area where most traditional techniques fail due to the curse of dimensionality. SVM has been reported to perform remarkably well on text categorization tasks. In our evaluation, we used the Library for SVM (LIBSVM) [12] to build SVM models. In the following, we give a brief introduction to the theory and implementation of the SVM classification algorithm.

Consider the problem of separating a set of training vectors belonging to two separate classes in some feature space. Given a set of training examples

$$(x_1, y_1), \ldots, (x_l, y_l), \quad x_i \in R^n, \; y_i \in \{-1, +1\} \tag{1}$$

we try to separate the vectors with a hyperplane

$$(w \cdot x) + b = 0 \tag{2}$$

so that

$$y_i[(w \cdot x_i) + b] \ge 1, \quad i = 1, 2, \ldots, l \tag{3}$$

The hyperplane with the largest margin is known as the optimal separating hyperplane. It separates all vectors without error and the distance between the closest vectors to the hyperplane is maximal. The distance is given by

$$d(w, b) = \frac{2}{\|w\|} \tag{4}$$

Hence the hyperplane that separates the data optimally is the one that solves

$$\text{Minimize} \quad \frac{1}{2}\|w\|^2 \tag{5}$$

subject to the constraints of (3). To solve this problem, Lagrange multipliers $\alpha_i$, $i = 1, 2, \ldots, l$, are introduced, and we define

$$w(\alpha) = \sum_{i=1}^{l} \alpha_i y_i x_i \tag{6}$$

With Wolfe theory the problem can be transformed to its dual problem:

$$\max_{\alpha} W(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2}\, w(\alpha) \cdot w(\alpha), \quad \text{s.t. } \alpha_i \ge 0 \tag{7}$$

$$\sum_{i=1}^{l} \alpha_i y_i = 0 \tag{8}$$

With the optimal separating hyperplane $(w_0, b_0)$ found, the decision function can be written as:

$$f(x) = (w_0 \cdot x) + b_0 \tag{9}$$

Then the test data can be labeled with

$$\text{label}(x) = \text{sgn}(f(x)) = \text{sgn}((w_0 \cdot x) + b_0) \tag{10}$$

Training vectors that satisfy $y_i[(w_0 \cdot x_i) + b_0] = 1$ are termed support vectors, which always correspond to nonzero $\alpha_i$. The region between the hyperplanes through the support vectors on each side is called the margin band.

In the case of linearly non-separable training data, by introducing slack variables $\xi_i$ the primal problem can be rewritten as:

$$\text{Min} \quad \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} \xi_i \tag{11}$$

subject to $y_i[(w \cdot x_i) + b] \ge 1 - \xi_i$, $\xi_i \ge 0$. Similarly, we can get the corresponding dual problem

$$\max_{\alpha} W(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2}\, w(\alpha) \cdot w(\alpha), \quad \text{s.t. } C \ge \alpha_i \ge 0, \; \sum_{i=1}^{l} \alpha_i y_i = 0 \tag{12}$$
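As a small illustration of Equations (4), (9), (10) and the slack variables in (11), the following sketch evaluates a linear decision function in plain Python. The weight vector, bias and sample points are made-up values chosen for illustration only; they are not taken from the paper, and the sketch does not solve the optimization problem itself (LIBSVM does that in the paper).

```python
import math

def decision_value(w, b, x):
    """f(x) = (w . x) + b, as in Equation (9)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def label(w, b, x):
    """sgn(f(x)) as in Equation (10); a value of exactly 0 is mapped to +1 here."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def margin_width(w):
    """d(w, b) = 2 / ||w||, as in Equation (4)."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def slack(w, b, x, y):
    """Smallest xi_i that satisfies y_i * f(x_i) >= 1 - xi_i with xi_i >= 0."""
    return max(0.0, 1.0 - y * decision_value(w, b, x))

# Hypothetical 2-D example: w, b and the labelled points are illustrative only.
w, b = [1.0, 1.0], -3.0
points = [([1.0, 1.0], -1), ([2.0, 2.0], 1), ([1.5, 1.2], 1)]

labels = [label(w, b, x) for x, _ in points]
slacks = [slack(w, b, x, y) for x, y in points]
print(labels, slacks, margin_width(w))
```

The first two points lie exactly on the margin hyperplanes, so their slack is zero; the third point is on the wrong side of the separating hyperplane and needs a positive slack, which is what the penalty term weighted by C in (11) charges for.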

Problems described in Equation (11) and Equation (12) are typical quadratic optimization problems, and have been approached using a variety of computational techniques. Recent advances in optimization methods have made support vector learning on large-scale training data possible. All the training vectors corresponding to nonzero $\alpha_i$ are called support vectors, which form the boundaries of the classes. The maximal margin classifier can be generalized to nonlinearly separable data by transforming input vectors into a higher dimensional feature space with a map function $\phi$, followed by a linear separation there. The expensive computation of inner products can be reduced significantly by using a suitable kernel function $K(x_i, x_j) = (\phi(x_i) \cdot \phi(x_j))$. We implemented the SVM classifier using the LIBSVM library [12] and adopted the radial basis function (RBF) kernel, defined as $K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$. In this study, the RBF is used as the basic kernel function of SVM. There are two parameters associated with the RBF kernel: C and γ. Vapnik found that the choice of kernel function has little effect on SVM performance, but the parameters of the kernel function are a key factor.

2.2. The brief description of Taguchi Method

In this section, we briefly introduce the basic concept and structure of the Taguchi method. The Taguchi method is quite common in the design of industrial experiments [13-14]. It requires a significantly smaller number of experiments than other statistical techniques [15]. Although some information is lost through this reduction, the approach is still worth choosing, given how time-consuming a full search is. The OA is a very important tool for the Taguchi method. Many designed experiments use matrices called OAs for determining which combinations of factor levels to use for each experimental run and for analyzing the data. An OA is a fractional factorial matrix, which assures a balanced comparison of levels of any factor or interaction of factors.
It is a matrix of numbers arranged in rows and columns where each row represents the levels of the factors in one run, and each column represents a specific factor that can be changed from run to run. The array is called orthogonal because all columns can be evaluated independently of one another. The general symbol for an m-level standard OA is

$$L_n(m^{n-1}) \tag{13}$$

where
n = m^k is the number of experimental runs;
k is a positive integer greater than 1;
m is the number of levels for each factor;
n-1 is the number of columns in the OA.

The letter L comes from Latin, the idea of using OA for experimental design having been associated with Latin square designs from the outset. The two-level standard OAs most often used in practice are $L_4(2^3)$, $L_8(2^7)$, $L_{16}(2^{15})$, and $L_{32}(2^{31})$. Table 1 shows an OA $L_8(2^7)$.
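To make the notion of a two-level OA concrete, the sketch below builds an array with 2^k runs and 2^k - 1 columns and checks that every pair of columns is balanced. The XOR-based construction and the column ordering it produces are assumptions of this sketch (one common way to generate such arrays), not something the paper specifies; the function names are hypothetical.

```python
from itertools import combinations, product

def two_level_oa(k):
    """Build a two-level array with 2**k runs and 2**k - 1 columns.

    Column j (1-based) is the XOR of the base bits selected by the binary
    digits of j, so columns 1, 2, 4, ... are independent base columns and
    every other column is an interaction of them. Levels are coded 1 and 2.
    """
    rows = []
    for bits in product((0, 1), repeat=k):   # bits[0] changes slowest
        row = []
        for j in range(1, 2 ** k):
            level = 0
            for pos in range(k):
                if (j >> pos) & 1:
                    level ^= bits[pos]
            row.append(level + 1)
        rows.append(row)
    return rows

def is_orthogonal(oa):
    """True if, for every pair of columns, all four level pairs occur equally often."""
    n_cols = len(oa[0])
    for c1, c2 in combinations(range(n_cols), 2):
        counts = {}
        for row in oa:
            key = (row[c1], row[c2])
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True

l8 = two_level_oa(3)
for run_no, row in enumerate(l8, start=1):
    print(run_no, row)
print("orthogonal:", is_orthogonal(l8))
```

With k = 3 this gives 8 runs and 7 columns, and with k = 4 it gives 16 runs and 15 columns, matching the sizes of $L_8(2^7)$ and $L_{16}(2^{15})$ named in the text.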

Table 1. $L_8(2^7)$ orthogonal array

Experiment No.   Column
                 1  2  3  4  5  6  7
1                1  1  1  1  1  1  1
2                1  1  1  2  2  2  2
3                1  2  2  1  1  2  2
4                1  2  2  2  2  1  1
5                2  1  2  1  2  1  2
6                2  1  2  2  1  2  1
7                2  2  1  1  2  2  1
8                2  2  1  2  1  1  2

The number to the left of each row is called the run number or experiment number and runs from 1 to 8.

3. Implementation

In this paper, the flow chart of e-mail spam filtering based on SVM with Taguchi method for parameter selection is shown in Figure 1. The first stage is data pre-processing, as depicted in Figure 2. The vector space model is a text representation approach which is widely used and performs well in text categorization. In its simplest form, spam filtering can be recast as a text categorization task where the classes to be predicted are spam and normal. Therefore, an e-mail can be regarded as a vector in a space spanned by a group of orthogonal key words. For each e-mail, its textual portion was represented by a concatenation of the subject line and the body of the message.

Figure 1. The flow chart of e-mail spam filtering based on SVM with Taguchi method for parameter selection. Steps: Input messages; Data pre-processing; Select parameters; Select OA; Calculating the experimental accuracy for each run; Calculating the effects of SVM parameters (C, γ); Optimal SVM parameters (C, γ) are obtained based on previous step; Spam or Normal.

Figure 2. Data Pre-processing

Due to the prevalence of HTML and binary attachments in modern e-mail, a degree of pre-processing is required on messages to allow effective feature extraction. Therefore, we adopt the following data pre-processing steps:
1) If there exist HTML tags, remove them. Tokenization then reduces a message to its colloquial components.
2) To avoid treating forms of the same word as different attributes, a lemmatizer was applied to the corpora to convert each word to its base form (e.g., "got" becomes "get").
3) The stopping process removes high-frequency words with low content discriminating power in an e-mail document, such as "to", "a", "and", "it", etc. Removing these words saves space for storing document contents and reduces the time taken during the subsequent processes.
We obtain word frequencies and convert them into vectors.

We introduce the Taguchi method into our approach. In content-based spam filtering performance analysis, a commonly used evaluation criterion measuring the effectiveness of the classification is accuracy (Acc). It is regarded as the response variable, defined as:

$$Acc = \frac{A + D}{N} \tag{14}$$

where N is the number of all messages; A is the number of spam messages that the system classifies as spam; and D is the number of normal messages that the system classifies as normal.

Table 2. Description of data sets
Columns: Data set; Original (Non-Spam, Spam); Our method (Non-Spam, Spam)
enronspam: 16,545 17,
lingspam: 2, ,
PU1:
PU2:
PU3: 2,313 1,
PUA:

Table 3. Experiment set-up and data for L16(8×8×2)
Columns: Exp. No.; original columns (1, 2, 4) and (13, 6, 1); modified columns 1 and 2, carrying the factors log2(C) and log2(γ); Acc for PU1, PU2, PU3, PUA, lingspam and enronspam.
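The pre-processing steps and the accuracy measure of Equation (14) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the stop-word list and the lemma lookup table are tiny hypothetical stand-ins for a real stop list and lemmatizer, and the regular expressions are simplifying assumptions.

```python
import re
from collections import Counter

# Hypothetical stop list and lemma table, for illustration only.
STOP_WORDS = {"to", "a", "and", "it", "the", "of"}
LEMMAS = {"got": "get", "messages": "message", "won": "win"}

def preprocess(subject, body):
    """Return word frequencies for one e-mail (subject line plus body)."""
    text = subject + " " + body
    text = re.sub(r"<[^>]+>", " ", text)                  # step 1: remove HTML tags
    tokens = re.findall(r"[a-z]+", text.lower())          # step 1: tokenize
    tokens = [LEMMAS.get(t, t) for t in tokens]           # step 2: lemmatize (lookup stand-in)
    tokens = [t for t in tokens if t not in STOP_WORDS]   # step 3: drop stop words
    return Counter(tokens)

def to_vector(freqs, vocabulary):
    """Convert word frequencies into a vector over a fixed vocabulary."""
    return [freqs.get(term, 0) for term in vocabulary]

def accuracy(actual, predicted):
    """Acc = (A + D) / N from Equation (14); labels are 'spam' or 'normal'."""
    a = sum(1 for t, p in zip(actual, predicted) if t == p == "spam")
    d = sum(1 for t, p in zip(actual, predicted) if t == p == "normal")
    return (a + d) / len(actual)

freqs = preprocess("You got a prize", "<p>Click <b>here</b> to get it</p>")
vocab = sorted(freqs)
vector = to_vector(freqs, vocab)
acc = accuracy(["spam", "spam", "normal", "normal"],
               ["spam", "normal", "normal", "normal"])
print(vocab, vector, acc)
```

In practice the vocabulary would be built from the whole training corpus rather than from a single message, so that every e-mail maps to a vector of the same length.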

Table 4. L16(8×8×2) OA and experiment data
Columns: for each data set (PU1, PU2, PU3, PUA, lingspam, enronspam), the factors log2(C) and log2(γ). Rows: the average accuracy at each level, followed by Max, Min and Effect.

In order to reduce the number of experiments and the cost of design, we have to choose an appropriate OA according to the numbers of control factors and levels. The search range follows the suggestion of Lin [12], which we expand to combinations of the parameters C and γ with 8 levels each: log2(C) = (-15, -11, -6, -2, 3, 7, 12, 16) and log2(γ) = (-15, -11, -6, -2, 3, 7, 12, 16), in order to find the best combination. In this work, both factors, log2(C) and log2(γ), are set at eight levels. Seven degrees of freedom (d.f.) are required for each factor. Since 14 d.f. are required in total, an OA of type $L_{16}(2^{15})$ with 16 trials and 15 d.f., as indicated on the left side of Table 3, is adopted. The two-level L16 array had to be converted to a multilevel array to accommodate the two 8-level factors log2(C) and log2(γ). This modification of the OA should be planned in a way that respects the d.f. of the L16. In general, three main concepts are used in orthogonal array theory [16]:
1. Balance: for each factor, the levels occur equally often.
2. Estimability: every parameter can be estimated.
3. Orthogonality: the effects of different factors can easily be extracted and separated from one another.
Multilevel factors can be created from appropriate multilevel columns in two-level arrays. This is generally achieved at the expense of 3 columns, which are replaced by a new column whose levels directly correspond to every level-combination of the original 3 columns.
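The following sketch illustrates two of the ideas above under stated assumptions: merging three two-level columns of a 16-run array into one 8-level column, and computing the per-level average accuracy with its Max, Min and Effect as reported in Table 4. Which original columns are merged, the function names, and the run accuracies are all hypothetical; the sketch covers a single 8-level factor only and does not reproduce the paper's full L16(8×8×2) layout.

```python
from itertools import product

# The eight log2(C) levels listed in the text.
LOG2_C_LEVELS = (-15, -11, -6, -2, 3, 7, 12, 16)

def base_columns(k):
    """Rows of the k independent two-level columns of a 2**k-run array (levels 1/2)."""
    return [tuple(bit + 1 for bit in bits) for bits in product((0, 1), repeat=k)]

def merge_to_eight_levels(rows, cols):
    """Replace three two-level columns by one column with levels 1..8.

    Each of the 2*2*2 level combinations maps to its own level.
    """
    a, b, c = cols
    return [4 * (r[a] - 1) + 2 * (r[b] - 1) + (r[c] - 1) + 1 for r in rows]

def level_means(levels, accuracies):
    """Average accuracy over the runs that used each level."""
    sums, counts = {}, {}
    for lvl, acc in zip(levels, accuracies):
        sums[lvl] = sums.get(lvl, 0.0) + acc
        counts[lvl] = counts.get(lvl, 0) + 1
    return {lvl: sums[lvl] / counts[lvl] for lvl in sums}

rows = base_columns(4)                                  # 16 runs, as in L16
c_levels = merge_to_eight_levels(rows, (0, 1, 2))       # assumed choice of columns
balance = {lvl: c_levels.count(lvl) for lvl in range(1, 9)}

# Made-up cross-validated accuracies for the 16 runs (not the paper's data).
run_accuracy = [0.60, 0.62, 0.70, 0.72, 0.80, 0.82, 0.90, 0.92,
                0.95, 0.93, 0.85, 0.83, 0.75, 0.73, 0.65, 0.63]

means = level_means(c_levels, run_accuracy)
best_level = max(means, key=means.get)                  # level with maximum mean accuracy
best_log2_c = LOG2_C_LEVELS[best_level - 1]
effect = max(means.values()) - min(means.values())      # Max minus Min

df_needed = 2 * (8 - 1)      # two factors with 8 levels each
df_available = 16 - 1        # a 16-run array
print(balance, best_level, best_log2_c, round(effect, 2), df_needed, df_available)
```

Each of the eight levels appears in exactly two of the 16 runs, which is the balance property listed above. A larger Effect for one factor than for the other indicates that the accuracy is more sensitive to that factor.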
The only requirement for the creation of multilevel columns in this way is that four interaction columns must exist for the 3 sacrificed columns; these are deleted. Consequently, only one two-level column is left after conversion and L16(8×8×2) is obtained.

In order to verify whether the method is valid, we employ 5-fold cross validation in our experiment. In 5-fold cross validation the e-mails are separated into 5 parts. We use 4 parts for training and the remaining part for testing. The procedure loops 5 times, so every part is tested once. Finally, the average of the test values is used as the result for evaluation. Each run of L16(8×8×2) undergoes 5-fold cross validation. The accuracy for each run and the average accuracy for each level of each factor are evaluated. We pick the level with maximum accuracy for each factor. In this way we obtain an approximate solution.

4. Experiment results and discussion

In our test, the program runs with the LIBSVM toolbox provided by Lin [12] on an IBM compatible PC with an AMD Athlon CPU running at 1.8 GHz with 1 GB RAM. Six public data sets have been used in this study. The experiments were conducted on the PU corpora, the lingspam corpus and the enronspam corpora. The four PU corpora, PU1, PU2, PU3 and PUA, have been made publicly available by Androutsopoulos et al. They are distributed as encrypted data sets in order to promote standard benchmarks. Lingspam is a mixture of spam messages and 2412 messages sent via the Linguist list, a moderated (hence, spam-free) list about the profession and science of linguistics. Attachments, HTML tags, and duplicate spam messages received on the same day are not included. The enronspam corpus (1), which consists of six non-encoded data sets, contains ham messages of particular users and fresh spam messages and includes spam messages from various sources. We mix these enronspam data sets and randomly take 500 normal messages and 500 spam messages. Table 2 shows the summary of the data sets. Because PU1, PU2 and PUA contain few messages, all of their spam and non-spam messages are used in our test.

(1) The Enron-Spam datasets are available from and in both raw and pre-processed form. Ling-Spam and the PU corpora are also available from the same addresses.

Table 5. Results for different OAs: (a) GS(8×8) versus L16(8×8×2); (b) GS(16×16) versus L32; (c) GS(32×32) versus L64
Columns for each method: Log(C), Log(γ), Acc(%); final column: Acc. Diff.(%). Rows: pu1, pu2, pu3, pua, lingspam, enronspam, Avg.

The experiment set-up and data for L16(8×8×2) are shown in Table 3. In this table, the conversion of L16(8×8×2) from $L_{16}(2^{15})$ still keeps orthogonality. The table indicates that the accuracy of SVM becomes worse without careful selection of the parameters C and γ. We list the accuracy averages of both parameters log2(C) and log2(γ) for every level in the different data sets and evaluate the effect of the control factors for all levels, as illustrated in Table 4. Here accuracy should be as large as possible. The maximum accuracy average of each parameter is marked for each data set. The difference between the maximum accuracy and the minimum accuracy of the main effect for parameters log2(C) and log2(γ) indicates the impact on accuracy. Observing the effect and variance of the control factors log2(γ) and log2(C) over all levels, the difference for parameter log2(γ) is larger than that for parameter log2(C). This means that parameter γ is more significant than parameter C for all data sets. The experiments of both methods used identical training and testing sets with 5-fold cross validation.

Figure 3. The contour plots of GS on $C = (2^{-15}, 2^{-14}, 2^{-13}, 2^{-12}, \ldots, 2^{15}, 2^{16})$ and $\gamma = (2^{-15}, 2^{-14}, 2^{-13}, 2^{-12}, \ldots, 2^{15}, 2^{16})$ for (a) PU1 (b) PU2 (c) PU3 (d) PUA (e) enronspam (f) lingspam

The average classification accuracy of 5-fold cross validation of both methods for [...] In other data sets, the average accuracy for the Taguchi method is close to the results for GS but not good enough. Furthermore, we apply the Taguchi method with OAs of more levels, as depicted in Table 5(b)(c). The improvement between Table 5(a) and (b) is significant. However, the improvement between Table 5(b) and (c) is small. A Taguchi approach with more levels has more effective detection points, so the accuracy gets higher. Meanwhile, the difference in accuracy between GS and our proposed method decreases. The comparison of both methods is based on the same levels in this experiment.

Figure 4. Accuracy for different methods

Method                      PU1     PU2     PU3     PUA     lingspam  enronspam
Naive Bayesian              84.99%  90.93%  89.90%  90.19%  84.57%    78.27%
SVM (Linear)                78.78%  81.39%  83.18%  83.96%  89.70%    86.17%
SVM (Taguchi Method L32)    92.99%  92.56%  93.90%  91.30%  98.70%    94.97%
SVM (GS 32×32)              94.27%  94.17%  95.60%  92.20%  98.89%    95.47%

Figure 3 is obtained by GS on $C = (2^{-15}, 2^{-14}, 2^{-13}, 2^{-12}, \ldots, 2^{15}, 2^{16})$ and $\gamma = (2^{-15}, 2^{-14}, 2^{-13}, 2^{-12}, \ldots, 2^{15}, 2^{16})$ for each data set. Higher accuracies are concentrated in the lower right corner of the contour graph. These distributions are similar for all data sets. The results of our confirmation test, comparing the Taguchi method, SVM with a linear kernel, GS and the Naive Bayes algorithm on the different data sets, are shown in Figure 4. The results indicate that SVM with a linear kernel is easy to set up, but its accuracy is lower than that of our proposed method on every data set and lower than that of the Naive Bayes algorithm on the four PU corpora. As for time complexity, GS required searching and computing 32 × 32 = 1024 times, but our proposed method needs only 64 times. Our approach is 15 times faster and the accuracy of our proposed method is very close to that of GS. The experimental results show that our proposed method can select good parameters for SVM with the RBF kernel and the accuracy is very close to that of GS.

5. Conclusions and future work

Our proposed approach based on the Taguchi method, unlike other approximation methods or heuristics, does not require exhaustive parameter searches. On the other hand, our proposed approach may sometimes obtain approximate results that are not optimal.
However, compared with the considerable computational time needed to find the optimal parameter values by grid search, it is worthwhile for our method to obtain approximate results at the expense of a little accuracy. From the above experiments, an appropriate OA can achieve high accuracy, but OAs with more levels bring little further improvement. In order to obtain an appropriate multilevel-column OA, we convert from a 2-level OA and keep the multilevel-column OA orthogonal. In our method, parameter selection by orthogonal table obtains high accuracy. If we would like to obtain higher accuracy, we could extend the OA L64 to an OA such as L128.

6. References

[1] W. W. Cohen, "Fast effective rule induction," in Proceedings of the Twelfth International Conference on Machine Learning, 1995, pp.
[2] W. C. William, "Learning rules that classify e-mail," in Proceedings of the 1996 AAAI Spring Symposium in Information Access, 1996, pp.
[3] I. Androutsopoulos, et al., Learning to filter unsolicited commercial e-mail: "DEMOKRITOS", National Center for Scientific Research.
[4] X. Carreras and L. Marquez, "Boosting trees for anti-spam email filtering," in Proceedings of RANLP-01, 4th International Conference on Recent Advances in Natural Language Processing, Tzigov Chark, BG.
[5] I. Androutsopoulos, et al., "Learning to filter spam e-mail: A comparison of a naive bayesian and a memory-based approach."
[6] V. N. Vapnik, The Nature of Statistical Learning Theory. New York: Springer-Verlag.
[7] J. Provost, "Naive-bayes vs. rule-learning in classification of email," The University of Texas at Austin, Artificial Intelligence Lab., Technical Report AI-TR.
[8] C. L. Huang and C. J. Wang, "A GA-based feature selection and parameters optimization for support vector machines," Expert Systems With Applications, vol. 31, pp.
[9] C. Staelin, "Parameter selection for support vector machines," Hewlett-Packard Company, Tech. Rep. HPL R1.
[10] T. Howley and M. G. Madden, "The genetic kernel support vector machine: Description and evaluation," Artificial Intelligence Review, vol. 24, pp.
[11] Taguchi and S. Chowdhury, Robust engineering. McGraw-Hill Professional.
[12] C. C. Chang and C. J. Lin, "LIBSVM -- A Library for Support Vector Machines."
[13] G. Taguchi, Introduction to quality engineering. Tokyo: Asian Productivity Organization.
[14] M. Phadke, Quality engineering using robust design. Upper Saddle River, NJ, USA: Prentice Hall PTR.
[15] D. C. Montgomery, Design and analysis of experiments. New York: Wiley.
[16] N. Logothetis and H. P. Wynn, Quality through design: experimental design, off-line quality control, and Taguchi's contribution. Oxford: Clarendon Press.
