TELKOMNIKA Indonesian Journal of Electrical Engineering
Vol.12, No.2, February 2014, pp. 1502 ~ 1508
DOI: http://dx.doi.org/10.11591/telkomnika.v12i2.386

Network Intrusion Detection Based on PSO-SVM

Changsheng Xiang*1, Yong Xiao2, Peixin Qu3, Xilong Qu1
1 Department of Computer and Communication, Hunan Institute of Engineering, Xiangtan 411104, China
2 Orient Science & Technology College of Hunan Agricultural University, Changsha, 418, China
3 School of Information and Engineering, Henan Institute of Science and Technology, Xinxiang, 453003, China
*Corresponding author, e-mail: cx543879@sohu.com*, Qupexn@163.com, quxlong@16.com

Abstract
To improve the precision of network intrusion detection, this paper proposes a network intrusion detection model in which the features and the parameters of a support vector machine (SVM) are selected simultaneously by the particle swarm optimization (PSO) algorithm. First, the features and the SVM parameters are encoded into a particle; then PSO searches for the optimal features and SVM parameters through collaboration among particles; finally, the performance of the model is tested on the KDD Cup 99 data. Compared with other network models, the proposed model reduces the number of input features for the SVM and significantly improves the detection precision of network intrusion.

Keywords: network intrusion detection, features selection, model parameters, PSO

Copyright 2014 Institute of Advanced Engineering and Science. All rights reserved.

1. Introduction
With the tremendous growth of network-based services and Internet users, it is important to keep data and transactions on the Internet secure. An intrusion detection system (IDS) automatically detects intrusions by anyone who is not authorized to use the computer system, so it has emerged as an essential component and an important technique for network security. The support vector machine (SVM) is a machine learning method that has shown growing popularity and has been successfully applied to network intrusion detection [1].
Feature selection identifies a strongly discriminative subset of network intrusion detection features and reduces the number of features presented to the mining process. By extracting as much information as possible from a given data set while using the smallest number of features, we can save significant computation time and establish a network intrusion detection model that generalizes better than one built on all of the features [2]. Network features are high-dimensional and contain redundant and useless features; only part of the features influence the intrusion results, so the redundant and useless features need to be deleted [3]. When SVM is used to establish a network intrusion detection model, the parameters to be selected include the penalty parameter C and the kernel function parameters [4]. In addition to feature selection, proper parameter settings can improve the detection precision. These two problems are crucial in network intrusion detection modeling, because the choice of feature subset influences the appropriate kernel parameters and vice versa [5]. Therefore, the optimal feature subset and the SVM parameters must be selected simultaneously [6].

In the literature, the grid algorithm is one way to select the best SVM parameters; however, this method is time consuming and does not perform well [7]. Moreover, the grid algorithm cannot perform feature selection. A few algorithms have been proposed for network feature selection, such as GA (genetic algorithm), PSO (particle swarm optimization) and others [8]. However, these algorithms focus on feature selection and do not deal with parameter selection for the SVM classifier.

To improve intrusion detection precision, this paper proposes a network intrusion detection model (PSO-SVM) based on the simultaneous selection of features and SVM parameters by the PSO algorithm, and the performance of the model is tested on the KDD Cup 99 data.

ISSN: 2302-4046. Received July 4, 2013; Revised August 9, 2013; Accepted September 30, 2013

2. Research Method
2.1. Principle of SVM
Let the given training data set be represented by $(x_i, y_i)$, $i = 1, 2, \ldots, n$, where $x_i \in R^d$ is an input vector, $y_i \in R$ is its corresponding desired output, and n is the number of training data. In SVM a linear function is constructed:

$$f(x) = \omega^{t} g(x) + b \quad (1)$$

where ω is a coefficient vector and b is a threshold. SVM learning is obtained by minimizing the empirical risk on the training data, and the ε-insensitive loss function is used for this minimization. The loss function is defined as:

$$L_\varepsilon(x, y, f) = |y - f(x)|_\varepsilon = \max(0, |y - f(x)| - \varepsilon) \quad (2)$$

where ε is a positive parameter. The empirical risk is:

$$R_{emp}(\omega) = \frac{1}{n} \sum_{i=1}^{n} L_\varepsilon(x_i, y_i, f) \quad (3)$$

Besides the ε-insensitive loss, SVM tries to reduce the model complexity by minimizing $\|\omega\|^2$. Deviations larger than ε are described by slack variables $\xi_i$ and $\xi_i^*$. The SVM approximation is then obtained as the following optimization problem [9]:

$$\min \; \frac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad (4)$$

where C is a positive constant to be regulated. By the Lagrange multiplier method, the minimization of formula (4) leads to the following dual maximization problem:

$$\max \; \sum_{i=1}^{n} y_i(\alpha_i - \alpha_i^*) - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*) - \frac{1}{2} \sum_{i,j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) \quad (5)$$

where $\alpha_i$, $\alpha_i^*$ are Lagrange multipliers and the kernel $K(x_i, x_j)$ is a symmetric function; here the Gaussian function is used as the kernel:

$$K(x, y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right) \quad (6)$$

The approximation function is then represented by the Lagrange multipliers, namely:

$$f(x) = \sum_{i=1}^{P} (\alpha_i - \alpha_i^*) K(x_i, x) + b \quad (7)$$

2.2. PSO Algorithm
Inspired by the social behavior of bird flocking, the PSO algorithm was developed by Kennedy and Eberhart [10]. Each particle is endowed with two attributes, velocity and position, and can be regarded as a potential solution in the D-dimensional problem space. In the PSO algorithm, they are updated by the following formulas:

$$v_{id}(t+1) = w\,v_{id}(t) + c_1 r_{1d}\,(p_{id}(t) - x_{id}(t)) + c_2 r_{2d}\,(p_{gd}(t) - x_{id}(t)) \quad (8)$$

$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1) \quad (9)$$

where w is the inertia weight factor; $r_{1d}$ and $r_{2d}$ are two random numbers; $v_{id}(t)$ and $x_{id}(t)$ are the velocity and position of the current particle; $p_i$ is called the "personal best", and its dth-dimensional part is $p_{id}$; the "global best" $p_g$ is the best position found by the whole swarm; and $c_1$, $c_2$ are the acceleration constants.

2.3. Particle Design
When the Gaussian function is selected as the kernel function, (C, σ) and the features are used as input attributes. Therefore, the particle comprises three parts: C, σ and the feature mask. In Figure 1, $C_1 \sim C_i$ represents the parameter C, $\sigma_1 \sim \sigma_j$ represents the parameter σ, and $f_1 \sim f_m$ represents the feature mask. In the feature mask, a bit with value 1 indicates that the feature is selected, and 0 indicates that it is not selected.

Figure 1. Particle Design

2.4. Fitness Function
Detection precision and the number of selected features are used to design the fitness function, so that a particle with high detection precision and a small number of features produces a high fitness value. We solve the multiple-criteria problem by creating a single-objective fitness function that combines the two goals into one. As defined by formula (10), the fitness has two predefined weights: (i) $w_a$ for the detection precision, (ii) $w_f$ for the summation of the selected features.

$$f = w_a \times Acc + w_f \times \left(\sum_{i=1}^{N_f} f_i\right)^{-1} \quad (10)$$

where Acc is the network intrusion detection precision, $N_f$ is the number of candidate features, and $f_i$ is defined as follows:

$$f_i = \begin{cases} 1, & \text{feature } i \text{ is selected} \\ 0, & \text{feature } i \text{ is not selected} \end{cases} \quad (11)$$

2.5. Design of the Multi-classifier for Network Intrusion Detection

Figure 2. Multi-classifier for Network Intrusion Detection
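The pairwise voting behind a one-against-one multi-classifier such as the one in Figure 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the one-dimensional class centers and the `make_pair_classifier` helper are hypothetical stand-ins for trained binary SVMs, and majority voting is assumed as the combination rule.

```python
from collections import Counter
from itertools import combinations

# The five traffic classes used in the experiments.
CLASSES = ["Normal", "Probe", "DoS", "U2R", "R2L"]

# Hypothetical 1-D class centers, used only to build toy binary classifiers.
CENTERS = {"Normal": 0.0, "Probe": 1.0, "DoS": 2.0, "U2R": 3.0, "R2L": 4.0}


def make_pair_classifier(a, b):
    """Toy stand-in for a trained binary SVM that separates class a from b."""
    def classify(x):
        # Pick whichever of the two classes has the nearer center.
        return a if abs(x - CENTERS[a]) <= abs(x - CENTERS[b]) else b
    return classify


# One binary classifier per unordered pair of classes: 5 * 4 / 2 = 10 in total.
PAIR_CLASSIFIERS = {
    pair: make_pair_classifier(*pair) for pair in combinations(CLASSES, 2)
}


def predict_one_vs_one(x):
    """Let every pairwise classifier vote and return the most voted class."""
    votes = Counter(clf(x) for clf in PAIR_CLASSIFIERS.values())
    return votes.most_common(1)[0][0]
```

With five classes the scheme needs ten binary classifiers, each trained only on the samples of its two classes.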
An SVM is a binary classifier, but network intrusions come in a variety of types, so network intrusion detection is a multi-class problem. In this paper the multi-classifier for network intrusion detection is therefore constructed in the one-against-one way, as shown in Figure 2.

2.6. The Steps of Network Intrusion Detection
Step 1: The network data are collected and the initial feature set is extracted.
Step 2: The initial particles, which represent the SVM parameters and a feature subset, are produced randomly.
Step 3: Each particle is decoded into SVM parameters and a feature subset; the network intrusion detection model is established with them, its detection precision is calculated, and the fitness value is obtained according to formula (10).
Step 4: Each particle's fitness value is compared with that of its personal best $P_i$; if it is better, the particle's position replaces $P_i$.
Step 5: Each particle's fitness value is compared with that of the global best $P_g$; if it is better, the particle's position replaces $P_g$.
Step 6: The velocities and positions of the particles are updated according to formulas (8) and (9).
Step 7: The iteration continues until the number of iterations reaches the maximum number of iterations ($N_{max}$); the optimal particle is then decoded into the optimal SVM parameters and feature subset.
Step 8: The training samples are reduced to the optimal feature subset and input into the SVM, which is trained with the optimal parameters to establish the optimal intrusion detection model.

The work flow chart of the network intrusion detection model is as follows:

Figure 3. The Flow Chart of Intrusion Detection Model

3. Results and Discussion
3.1. Experiment Data
The experiment data are from KDD Cup 99, which contains about 5,000,000 connection records. There are four categories of attacks: DoS, R2L, U2R and Probing. The parameters of the PSO algorithm are set as: number of particles k=20, w=1, $c_1 = c_2 = 2$, $N_{max} = 200$.
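The search loop of Steps 2 to 7, with the updates of formulas (8) and (9), can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the authors' Matlab code: the default arguments mirror the PSO settings of Section 3.1, while the velocity clamp `v_max`, the position bounds, the fixed random seed and the toy fitness function are additions made here so that the example is self-contained and stays numerically bounded.

```python
import random


def pso_maximize(fitness, dim, bounds, k=20, w=1.0, c1=2.0, c2=2.0,
                 n_max=200, v_max=0.5, seed=0):
    """Maximize `fitness` over a dim-dimensional box with a basic PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step 2: random initial positions, zero initial velocities.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(k)]
    v = [[0.0] * dim for _ in range(k)]
    p_best = [xi[:] for xi in x]
    p_fit = [fitness(xi) for xi in x]
    g = max(range(k), key=lambda i: p_fit[i])
    g_best, g_fit = p_best[g][:], p_fit[g]
    history = [g_fit]  # best fitness after each iteration

    for _ in range(n_max):  # Step 7: stop after n_max iterations
        for i in range(k):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Formula (8): inertia + personal-best pull + global-best pull.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (p_best[i][d] - x[i][d])
                           + c2 * r2 * (g_best[d] - x[i][d]))
                # Assumption (not in the paper): clamp the velocity.
                v[i][d] = max(-v_max, min(v_max, v[i][d]))
                # Formula (9), followed by clipping to the search box.
                x[i][d] = max(lo, min(hi, x[i][d] + v[i][d]))
            f = fitness(x[i])
            if f > p_fit[i]:       # Step 4: update the personal best
                p_best[i], p_fit[i] = x[i][:], f
                if f > g_fit:      # Step 5: update the global best
                    g_best, g_fit = x[i][:], f
        history.append(g_fit)
    return g_best, g_fit, history


def toy_fitness(pos):
    """Hypothetical objective with its maximum at (1, 1, ..., 1)."""
    return -sum((p - 1.0) ** 2 for p in pos)


best, best_fit, history = pso_maximize(toy_fitness, dim=3, bounds=(-5.0, 5.0))
```

In the PSO-SVM setting, `fitness` would decode the particle, train an SVM and return formula (10); the toy objective only demonstrates the mechanics of the update.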
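Step 3 decodes each particle and scores it with formula (10). The sketch below shows one way this could look. It is a simplification, not the encoding of Figure 1: the particle is assumed to be a list of real values in [0, 1] instead of bit strings, and the parameter ranges `c_range` and `sigma_range`, the weights `w_a = 0.8` and `w_f = 0.2`, and the zero score for an empty feature mask are all assumptions, since the paper does not state them.

```python
def decode_particle(particle, n_features,
                    c_range=(0.1, 1000.0), sigma_range=(0.01, 10.0)):
    """Split a particle into (C, sigma, feature mask).

    particle: list of floats in [0, 1] of length n_features + 2.
    The first two entries are scaled to the assumed C and sigma ranges;
    the remaining entries are thresholded at 0.5 to give the 0/1 mask
    of formula (11).
    """
    c = c_range[0] + particle[0] * (c_range[1] - c_range[0])
    sigma = sigma_range[0] + particle[1] * (sigma_range[1] - sigma_range[0])
    mask = [1 if p >= 0.5 else 0 for p in particle[2:2 + n_features]]
    return c, sigma, mask


def fitness(acc, mask, w_a=0.8, w_f=0.2):
    """Formula (10): w_a * Acc + w_f * (number of selected features)^-1."""
    n_selected = sum(mask)
    if n_selected == 0:
        # Assumption: a particle that selects no feature cannot be trained.
        return 0.0
    return w_a * acc + w_f / n_selected


c, sigma, mask = decode_particle([0.0, 1.0, 0.9, 0.1, 0.5], n_features=3)
score = fitness(0.9, mask)  # 0.9 is a made-up detection precision
```

Because the second term shrinks as more features are selected, two particles with equal precision are ranked by how few features they use.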
3.2. Comparison Models and Evaluation Criterion
To make the detection results of PSO-SVM comparable and persuasive, three comparison models are chosen:
SVM1: the features are selected by the PSO algorithm while the SVM parameters are selected randomly.
SVM2: all of the features are used while the SVM parameters are selected by the PSO algorithm.
SVM3: the features are first selected by the PSO algorithm, and then the SVM parameters are selected by the PSO algorithm.
The performance of the models is evaluated by precision, recall and training time.

3.3. Results and Analysis
Because PSO is a heuristic algorithm, the experiment results vary from run to run. The experiment was therefore repeated 5 times, and the number of times each feature was selected is shown in Table 1.

Table 1. The Appeared Times of Each Feature
Appeared times    Feature numbers
1                 11, 14, 17, 19, 38, 16, 39
2                 5, 8, 10, 13, 15, 18, 19, 22, 21, 27, 28, 37, 41
3                 2, 3, 7, 9, 12, 36, 23, 26, 32, 35
4                 4, 6, 20, 25, 29, 33, 34, 40
5                 1, 24, 31

Figure 4. The Detection Precision and Recall of PSO-SVM (precision (%) versus the number of the experiment)

Figure 5. The Detection Precision and Recall of SVM1 (precision (%) versus the number of the experiment)
Figure 6. The Detection Precision and Recall of SVM2 (precision (%) versus the number of the experiment)

Figure 7. The Detection Precision and Recall of SVM3 (precision (%) versus the number of the experiment)

To determine whether features that were selected a different number of times influence the classification result differently, we carried out testing experiments on three groups of features. The first group contains the features that appear five times or four times: 1, 4, 6, 20, 21, 25, 29, 31, 33, 34, 40. The second group contains the features that appear five, four or three times: 1, 2, 3, 4, 6, 7, 9, 12, 20, 23, 25, 26, 29, 31, 32, 33, 34, 35, 40. The third group contains all 41 features. The results show that, in terms of classification precision and training time, the optimal features are 2, 4, 9, 20, 21, 24, 29, 31, 33, 34, 40 and the corresponding optimal SVM parameters are C=107.1, σ=1.05.

The optimal feature subset and the optimal SVM parameters are used to establish the network intrusion detection model, and the detection results are shown in Figure 4. The detection results of the comparison models SVM1, SVM2 and SVM3 are shown in Figures 5~7. In Matlab 2012, tic and toc are used to measure the training times (s) of all models, and the results are shown in Table 2.

Table 2. The Training Times of Different Models (s)
Intrusion type    SVM1    SVM2    SVM3    PSO-SVM
Probe             0.9     1.14    1.1     0.73
DoS               1.07    1.56    0.9     0.83
U2R               1.14    1.1     1.56    0.58
R2L               1.11    1.57    1.66    0.9
Normal            1.18    1.14    1.5     1.01
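Precision and recall, the criteria plotted in Figures 4 to 7, are not defined in the text. The sketch below assumes the usual per-class definitions (true positives over predicted positives, and true positives over actual positives); the function name and the label lists are made up for illustration and are not experimental data.

```python
def precision_recall(y_true, y_pred, label):
    """Per-class precision and recall, assuming the standard definitions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    # Return 0.0 when a ratio is undefined (no predictions or no samples).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up labels for five connection records.
y_true = ["DoS", "DoS", "Probe", "Normal", "DoS"]
y_pred = ["DoS", "Probe", "Probe", "Normal", "DoS"]

dos_precision, dos_recall = precision_recall(y_true, y_pred, "DoS")
probe_precision, probe_recall = precision_recall(y_true, y_pred, "Probe")
```

In this toy case one DoS record is misclassified as Probe, which lowers the DoS recall and the Probe precision while leaving the DoS precision at 1.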
We can see from Figure 4 that PSO-SVM obtains high network intrusion detection precision and recall, and that it is an effective network intrusion detection model. From the experimental results in Table 2 and Figures 4~7, the following conclusions can be drawn:
(1) Compared with SVM1, SVM2 and SVM3, the training time of PSO-SVM is significantly shortened and the detection efficiency is improved. The experiment results show that PSO-SVM can find the optimal feature subset and SVM parameters, eliminates useless and redundant network intrusion detection features, and reduces the input vectors of the SVM, so the computational complexity is decreased and the real-time requirement of network intrusion detection is better met.
(2) Compared with SVM1, SVM2 and SVM3, PSO-SVM improves the detection precision and recall of network intrusion. The results show that the network features and the SVM parameters are related; selecting them simultaneously by PSO yields the optimal network features and SVM parameters together, so the performance of the network intrusion detection model is greatly improved, which helps ensure the safety of the network.

4. Conclusion
Good SVM parameters and a good feature set are crucial for an intrusion detection model, and this paper proposed a network intrusion detection model based on the simultaneous selection of features and SVM parameters by the PSO algorithm. The results show that PSO-SVM performs better than the models in which the parameters and the feature subset are selected separately: it eliminates useless and redundant features, reduces the input vectors of the SVM and improves the detection precision of network intrusion detection, so it has wide application prospects in the field of network security.

Acknowledgements
This research was supported by Hunan Provincial Natural Science Foundation (13JJ) and Hunan Science & Technology Foundation (013GK309).

References
[1] Denning D. An Intrusion Detection Model. IEEE Transaction on Software Engineering. 2010; 13(2): 222-232.
[2] Zhang XF, Zhao Y. Application of Support Vector Machine to Reliability Analysis of Engine Systems. Telkomnika. 2013; 11(7): 335-3560.
[3] Khan L, Awad M, Thuraisingham B. A new intrusion detection system using support vector machines and hierarchical clustering. The VLDB Journal. 2007; 16(): 507-521.
[4] Palomo EJ, Dominguez E, Luque RM, et al. A new GHSOM model applied to network security. Lecture Notes in Computer Science Springer. 2008; 51(18): 6-689.
[5] Durga Prasad, Nikhil Pal, Jyotirmoy Das. Genetic programming for simultaneous feature selection and classifier design. IEEE Transactions on Systems. 2009; 36(1): 106-117.
[6] Huang ChengLung, Wang ChiehJen. A GA-based feature selection and parameters optimization for support vector machines. Expert Systems with Applications. 2009; 31(2): 231-240.
[7] Natesan P, Balasubramanie P, Gowrison G. Improving attack detection rate in network intrusion detection using adaboost algorithm with multiple weak classifiers. Journal of Information and Computational Science. 2012; 8(8): 39-51.
[8] Saravanan C, Shivsankar M. An optimized feature selection for intrusion detection using layered conditional random fields with MAFS. International Journal of Mobile Network Communications & Telematics. 2011; 1(3): 79-.
[9] Han FQ, Li HM, et al. A new incremental support vector machine algorithm. TELKOMNIKA Indonesian Journal of Electrical Engineering. 2012; 10(6): 1171-1178.
[10] Jie He, Hui Guo. A Modified Particle Swarm Optimization Algorithm. TELKOMNIKA Indonesian Journal of Electrical Engineering. 2013; 11(11): 609-615.