Optimizing Naïve Bayes Algorithm for SMS Spam Filtering on Mobile Phone to Reduce the Consumption of Resources

Size: px
Start display at page:

Download "Optimizing Naïve Bayes Algorithm for SMS Spam Filtering on Mobile Phone to Reduce the Consumption of Resources"

Transcription

1 Journal of Computers Vol. 28, No. 3, 2017, pp do: / Optmzng Naïve Bayes Algorthm for SMS Spam Flterng on Moble Phone to Reduce the Consumpton of Resources L-qun Bao 1*, Ln-xa LV 2, and Jn-Long L 1 1 College of Electroncs and Informaton Engneerng, Lanzhou Insttute of Technology Lanzhou, Chna baolqun1983@163.com 2 College of Software Engneerng, Lanzhou nsttute of technology Lanzhou, Chna Receved 25 May 2016; Revsed 09 December 2016; Accepted 09 December 2016 Abstract. Moble phone s a embedded system wth lmted energy, memory and computng ablty. The tradtonal SMS spam flterng algorthms only focused on the accuracy rate of spam flters and dd not consder the consumpton of moble phone s hardware resources. Ths paper proposes an optmzed Naïve Bayes SMS Spam flterng algorthm, the man purpose s to reduce the consumpton of hardware resources and reflect user s ndvdual preference. A feature lbrary s desgned to save weght of features, condtonal probabltes and pror probabltes. The flterng algorthm s dvded nto a feature lbrary updatng algorthm and an onlne SMS flterng algorthm. The feature lbrary s updated regularly by the feature lbrary updatng algorthm. The feature lbrary updatng algorthm does not need to run mmedately, and can be performed n free tme of moble applcaton or coped to PC clent asynchronously. The onlne SMS flterng algorthm runnng on a moble phone performs only the mnmum amount of wor whch does not requre much resources. It can be qucly adapted to the changng of user s preference. Expermental results show that the algorthm can mprove the speed of classfcaton whle mantanng a hgh classfcaton accuracy. t taes up less storage space and can be used on common moble platforms. Keywords: feature lbrary, Naïve Bayes, onlne SMS spam flterng algorthm, short messages, updatng algorthm 1 Introducton Wth the development of wreless communcaton, the number of moble phone users n the world ncreased rapdly. Accordng to the Forrester s forecasts, moble phone users wll reach 48 bllon n the world by the end of 2016 [1]. Short Message Servce (SMS) s a popular means of moble communcaton. Due to huge success of Short Message Servce, SMS has now become the preferred [2] and cheapest medum for advertsers to reach masses especally n developng countres le Chna. SMS traffc has brought huge economc losses [3]. In August 2015, Badu moble guards released the frst half of 2015, Chna Moble Internet Securty Report [4]. Accordng to the report, n the frst half of 2015, advertsng s the largest nd of spam messages, accountng for 30% of the total, followed by telecommuncatons busness promoton, fnancal Servces, news broadcasts, etc. Spam can be defned as unsolcted emal for a recpent or any messages that the users do not wanted to have n ther nbox. The desgn of current moble phone nbox dd not provde flexblty to delete a spam SMS wthout loong at t. Due to unnecessary notfcatons and content, spam SMS has become a source of annoyance, a waste of tme and frequent dsturbance for the recever, and become a wdespread * Correspondng Author 174

2 Journal of Computers Vol. 28, No. 3, 2017 socal problem. Operators of short messages have no perfect spam message montorng system so far. They stll use spam flterng methods. SMS spam flters generally use four technques: Blac and whte lst flterng, eyword matchng flterng, flterng based on rules and content based SMS spam flterng. The frst three technques are smple, effcent and easy to mplement and requre lttle computng resources [5]. Blac and whte lst flterng, eyword matchng flterng all need manual settng of phone numbers or ey words, such as Free, Account, etc. Users preset telephone numbers of spammers and eywords frequently appear n spam messages. Thus, f a short message comes from those telephone numbers or contans those eywords, t wll be saved nto the spam folder and the user wll not be alerted by ths spam message. Furthermore, Manual updatng of blac lst s always later than the emergence of new SMS spam senders. The SMS spam senders wll feel the rule. They are very wea when t comes to ntentonal modfcaton of eywords or phone numbers. Rule-based flterng has the advantages of smple, fast, and needng no tranng phases, but fxed rules can t adapt to the new characterstcs of spam messages. If the user s nterest changes or the categores of spam vary, these rules need to be changed accordngly. So the accuracy rate of these flters are often poor. On the other hand, content based SMS spam flterng technques show better accuracy results compared to other methods. However, theses methods requre much more computaton resources. At present, studes on SMS spam flterng are manly focused on content based spam flterng algorthms. Another ssue needng to be concerned about s the dfference between short messages and e- mal. Short messages are short and lac of characterstc data whch are dfferent from emals. A emal conssts of certan structured nformaton such as subject, mal header, sender s address etc, but a short message does not have such structured nformaton and often conssts of few words, whch brng dffcultes to SMS spam flterng. The tradtonal content based spam flterng algorthms cannot be drectly appled to SMS spam flterng, feature selecton s partcularly mportant [6]. It s necessary to develop an effcent SMS Spam flterng method whch classfy short messages nto spam and non-spam SMS effcently. The scentfc contrbutons of ths research wor nclude: (1) Desgnnng SMS feature lbrary, the content-based SMS spam flterng algorthm s dvded nto feature lbrary updatng algorthm and real-tme SMS classfcaton algorthm. (2) The feature lbrary updatng algorthm adopts the method of asynchronous updatng, whch reduces the amount of calculaton of clent real-tme processng. (3) Improvng the Nave Bayesan classfcaton algorthm, and appled to real-tme SMS flterng, t can mmedately adapt to the changng content of short messages and flterng needs of the user. The rest of the paper s organzed as follows. Secton II present the related wor on the SMS Spam flterng. Secton III specfes the desgn and mplementaton of the system ncludng the system archtecture, feature lbrary, updatng algorthm and SMS classfcaton algorthm. The experments, testng and results are reported and dscussed n Secton IV. Fnally, the concluson s presented n Secton V. 2 Related Wor 2.1 Content Based SMS Spam Flterng Content based spam flterng algorthms nclude artfcal mmune [7], support vector machne, decson tree [8], Neural Networs and Naïve Bayes algorthm. The mmune system s a complex networ of organs and cells whch s responsble for the organsm s defense aganst alen partcles. The capacty of dstngushng between self and nonself genes s the man features of the mmune system. Support Vector Machnes s a lnear maxmal margn bnary classfer. It produces (lnear) vectors that try to maxmally separate the target classes (spam versus legtmate). It has shown excellent results n text classfcaton applcatons. Decson tree s a classfer n the form of a tree structure to show the reasonng process. Each node n decson tree structures s ether a leaf node or a decson node. Artfcal Neural Networs s also a wdely used machne learnng based flter whch are computer models that try to smulate the same tass that the bran carres out [9]. Naïve Bayes classfer s a smple probablstc classfer. The man advantage s that naïve Bayes classfers can be traned very effcently n a supervsed learnng. Al-Hasan and El-Alfy [10] developed a novel approach based on DCA that fuses the output from two 175

3 Optmzng Naïve Bayes Algorthm for SMS Spam Flterng on Moble Phone to Reduce the Consumpton of Resources classfers. The emprcal results showed sgnfcant mprovement can be acheved when applyng the proposed approach. Joe [11] proposed a SVM spam flter model whch selects meanngful features among thousands of orgnal ones by ch-square statstcs after preprocessng. Healy et al. [12] studed the problems of performng spam classfcaton on short messages by comparng the performance of the well-nown K-Nearest-Neghbor (KNN), Support Vector Machnes (SVM), and Nave Bayes classfers. They conclude that, for short messages, the SVM and Naïve Bayes classfers substantally outperform the KNN classfer. Hdalgo et al. [13] also accomplshed content flterng experments on Englsh and Spansh spam SMS corpora, whch proved that Bayesan flterng methods are stll effectve aganst spam SMS messages. Shah [14] also beleves that Naïve Bayes and Support Vector Machnes (SVM) based classfcatons are the two most successful technques. These methods all requre tranng samples, but because of prvacy ssues nvolved, lac of tranng sample lbrary s a major problem at present, whch has brought dffcultes to researches on SMS spam flterng [15]. In addton, for exstng two categores-based Bayesan spam flterng, f a user changes the category of some messages, the pror probablty of spam messages and normal SMS, the class condtonal probablty of features all need to be recalculated, whch wll ncrease the burden of clent processng. In ths paper, a feature lbrary s desgned and on whch an optmzed Naïve Bayes algorthm s proposed. 2.2 The Weanesses of Earler Study The flterng algorthm made no consderaton of moble phone s hardware resources. Moble phone s a embedded system wth lmted hardware resources. It brng dffcultes to run complex SMS spam flterng system. Most of prevous researches dd not consder the actual runnng envronment and the algorthm testng were conducted on personal computers. They only focused on the accuracy, precson and recall of spam flters from the vew pont of the algorthm used [10, 14], but a spam flter on a moble phone must consder the consumpton of resources, because energy, memory and computng ablty all are lmted. The flterng algorthm dd not tae nto account the user s ndvdualzed requrements. All users are gven the same classfcaton results [16]. Due to the specal nature of SMS, everyone has dfferent opnon. Frst of all, dfferent users gve dfferent classfcaton results to the same message, for example, a real estate sales based SMS s an mportant message to users nterested n buyng houses, but others may consder t to be a spam message. Secondly, user s preferences change over tme. At one stage, a message s consdered as a spam message. At another stage, t may be determned to be a normal SMS. 2.3 User s Expectatons from a SMS Spam Flterng Soluton From the vew of mplementaton, there are two nds of technologcal solutons used for SMS spam flterng, whch nclude Short Message Servce Center (SMSC) based flterng and flter by nstallng related software n the moble phone [17-18]. For SMSC based flterng, all the spam SMSes wll be fltered by moble operators. Once the classfcaton error occurs at the servce center, a normal message wll be lost and could not reach the moble phone user. It also nvolves the nterests of the operators, so ths nd of flter s not easy to mplement. In order to understand users requrements, behavor and perceptons related to spam SMS, Yadav et al. [19] conducted a survey from 458 partcpants of dfferent age groups and ctes n Inda, the result show that there s a need for personalzed SMS spam flterng whch taes nto account wth user preferences and nformaton requrements. Due to neffectveness of avalable solutons, moble users showed strong wllngness to use a technologcal soluton for SMS spam flterng. The purpose of ths paper s to propose an optmzed Naïve Bayes classfer for SMS Spam flterng on moble devces n order to reduce the consumpton of resources, that means the flter can run on common moble termnals wth lmted energy, memory and computng ablty. Even more, we expect our approach to bloc moble spam messages and have hgher accuracy and operatng effcency. The flter can also adapted to the preference of ndvduals. The ey research problem of ths paper ncludes: (1) How to desgn the feature lbrary. (2) How to update the feature lbrary wthout affectng the effcency of flterng system. (3) How to desgn the real-tme SMS spam flterng algorthm. 176

4 Journal of Computers Vol. 28, No. 3, Proposed System 3.1 System descrpton Ths paper proposed an optmzed Naïve Bayes classfer for SMS Spam Flterng on moble phones for the purpose of reducng the consumpton of resources. As shown n Fg. 1, the flterng system conssts of feature lbrary, feature lbrary updatng algorthm and onlne SMS spam flterng algorthm. The feature lbrary s generated on the PC sde by a large number of tranng samples after beng preprocessed and selected. The feature lbrary s updated regularly by the feature lbrary updatng algorthm, and the updatng cycle can be set by users. The feature lbrary updatng algorthm does not need to perform onlne, but performs n free tme of moble applcaton or be coped to PC sde asynchronously. Features of a new short message are sent to onlne SMS spam flterng algorthm. Meanwhle, the user can get real-tme classfcaton results. Fg. 1. System confguraton The updatng algorthm taes on the more complex and tme consumng jobs, such as computng and updatng of condtonal probabltes, whle the onlne SMS spam flterng algorthm performs only the mnmum amount of wor for flterng whch gets the data from the feature lbrary. The onlne SMS flterng algorthm requres mnmal resources whch can run on moble phones and can adapt tself qucly to the changng of the user s preference. 3.2 Features of Proposed System Tang up less moble storage capacty. After preprocessng and feature extracton, removng the SMS message s nterference and rrelevant nformaton, the feature lbrary only store the features and the number of feature occurrence n the sample database, not the actual content of short messages, whch can reduce the storage capacty. The feature lbrary can also be shared between flterng systems. Consumng less computng resources. Although the number of tranng samples ncreases, the amount of wor that a moble phone has to deal wth should almost be lttle. The flter consumes less tme and memory capacty, whch can be used on moble phones wth less computng and storage resources, that means the flter on the moble phone s free from burden of tranng process. Classfyng SMS messages accordng to user s preferences. Ths paper establshes feature lbrary on moble phone sde, the flterng system classfy short messages accordng to user s preferences to avod the lmtatons of tradtonal flterng system by whch all users are gven the same classfcaton results. 177

5 Optmzng Naïve Bayes Algorthm for SMS Spam Flterng on Moble Phone to Reduce the Consumpton of Resources 4 Implementaton 4.1 Feature Selecton Snce havng a good feature representaton s one of the most mportant parts for gettng a good classfer, we have to face the fact that SMS messages don t have the same structure and characterstcs than emal messages, and that could be a problem because fewer words extracted from SMS means less nformaton to wor wth. Many feature selecton technques are used n the area of text classfcaton such as mutual nformaton, CHI statstcs, Informaton Gan, Term strength, document frequency, etc. The purpose of feature selecton n ths paper s to select a subset of features from all the features n SMS samples whch should contan as much classfcaton nformaton as possble. Some of the features appear rarely n short messages, but they are very mportant for classfcaton, such as gun. Some of the features appear frequently n all short messages, such as good, but t does lttle to classfcaton. Mutual nformaton functon descrbes the relevance of a feature to a category. In SMS classfcaton, f the value of mutual nformaton functon of feature t to categores C s hgh, that s not to say the value of mutual nformaton functon of feature t to categores C j s low. So f t frequently appears n multple classes, ts effect to the classfcaton result s lttle. It even produces the nterference n the process of probablty calculaton, whch wll reduce the flter s performance. In ths paper, the mutual nformaton functon s mproved, If the feature t s frequently appear n only one category, t has a hgher weght value. wt ( ) s the value of feat_val. It s calculated accordng to the formula (1), (2), and (3). n 2 = = = 1 avg (1) w( t ) ( MI( t, C ) MI ( t )) Pt ( C) MI( t, C ) = log2 Pt ( ) PC ( ) (2) n MI ( t ) = P( C ) MI ( t, C ) (3) avg = 1 Pt ( ) represents the quotent whch the occurence of t belongs to the occurence of all features. PC ( ) represents the quotent whch the number of messages belong to type C dvde the the number of messages of the tranng set. It can be calculated accordng to the formula (4). The onlne SMS spam flterng algorthm reads ts value from table SMS_category. Pt ( C ) represents the quotent whch the number of occurrences of feature t n class C dvde the number of occurrences of all features n class C ( t s the probablty of term t gven presence of class C ), the value of t can be read form table Feature_cat. wt ( ) s the weght of feature t, t descrbes the ablty of feature t to dstngush C from other categores. Some of the features s selected and nserted nto the feature lbrary accordng to the value of wt ( ). The larger value of wt ( ), the hgher prorty for beng selected. 4.2 Desgn of the Feature Lbrary In order to reduce the CPU consumpton on the executon of SMS spam flterng algorthm and reduce memory capacty occuped by characterstc data of SMS samples, ths paper desgns the feature lbrary to save the characterstc data of the SMS samples, whch nclude SMS categores table SMS_category, feature nformaton table Feature_nfo and feature categores table Feature_cat. SMS_category (cat_d, category, samp_nu, flag, PC). Cat_d s an dentfer of a SMS category, and s the prmary ey of table SMS_category. Category s the name of a SMS category. Short messages can be set by users nto dfferent categores. For example, messages are dvded nto categores of merchandsng, pornographc and llegal nformaton, daly greetngs, busness communcaton, real 178

6 Journal of Computers Vol. 28, No. 3, 2017 estate sales, nsurance. Wheren merchandsng, pornographc and llegal nformaton categores are of spam messages. Daly greetngs and busness communcaton categores are of normal messages. Real estate sales and nsurance categores can be set as normal messages or spam messages by users accordng to ther preferences. Samp_nu s the number of the samples belong to the SMS category dentfed by the cat_d. Flag represents whether ths category s spam messages, where 1 represents the spam category and 0 represents the normal category. PC s the pror probablty of type C. PC ( ) s the value of PC whch can be calculated accordng to the formula (4). If a user changes the flag attrbute of a SMS category, the probablty of a message belongng to that category does not need to change accordngly, so does other relevant data n the feature lbrary, they can be automatcally updated. SMS spam flterng system can also adapt to the new category settngs on real-tme. SC PC ( ) = (4) S Where SC K s the number of messages of class C. S s the number of all the messages. Feature_nfo(feat_d, feat, feat_val, rec_tme). Feat_d s the dentfer of a feature, and s the prmary ey of table Feature_nfo. Feat s the name of the feature dentfed by the feat_d and feat_val s ts weght value of classfcaton. Rec_tme records the last tme of the feature beng accessed by the on lne SMS spam flterng system. In order to control growth rate of the storage space occuped by the feature lbrary and reduce the storage burden on the clent, ths artcle remove features from the feature lbrary perodcally accordng to the value of rec_tme. Feature_cat(feat_d, C, feat_nu, feat_c). Feat_d s the dentfer of a feature. C s the dentfer of a SMS category. Feat_nu s the number of tmes a feature characterzed by feat_d appears n short messages of class CK. Feat_C s the probablty the feature appears n short messages of CK class. The value of feat_c s Pt ( C) whch s calculated accordng to the formula (5). Feat_d reference feat_d property n table Feature_nfo. C reference cat_d property n table SMS_category. Word_d plus cat_d consttute the prmary ey of ths table. The value of feat_c s saved n the feature lbrary n order to avod the repeated calculaton durng the executon of the SMS spam flterng algorthm. It can be automatcally updated by the feature lbrary updatng algorthm descrbed n secton 4.2. N K Pt ( C) = N + m t _ c + 1 Nc K s the total number of occurrence of all features belong to C class. Nt _ c K s the number of occurrence of t belongs to C class, m s the total number of features n the feature lbrary and repeated features are calculated only once. 4.3 Feature Lbrary Updatng Algorthm The feature lbrary updatng algorthm s descrbed as follows: Algorthm FLUP(X) Input: New SMS features table X, feature lbrary L Output: Updated feature lbrary L Read category nformaton from X, update samp_nu, flag, PC( PC ( ) f class( csms ) = f class( csms ) C C ck ) n SMS_category table. Nc + 1 N 1 all PC ( ) = = PC ( ) + N + 1 N + 1 N + 1 ( C = C ) (6) all all all (5) Nc N all PC ( ) = PC ( ) N + 1 = N + 1 all all ( C C ) (7) 179

7 Optmzng Naïve Bayes Algorthm for SMS Spam Flterng on Moble Phone to Reduce the Consumpton of Resources For each feature n X: f t exsts n the Feature_nfo table then update feat_val n Feature_nfo table; update feat_nu, feat_c( Pt ( C ) ) n Feature_cat table; f class( csms ) = C and t csms Nt _ c + N K t_ csms Pt ( C) = ( C = C 並且 t csms ) Nc + N K all_ csms Nc N t _ csms = Pt ( C) + N + N N + N c all _ csms c all _ csms (8) 180 f class( csms ) = C and t csms N Pt ( C) Pt ( C) c = Nc + N all _ csms ( C C = 並且 t csms ) (9) else nset t nto Feature_nfo table and Feature_cat table; If the number of features Nall s greater than N max Remove N features of lowest weght value from Feature_nfo table n L; Cascadng delete the data from Feature_cat table. Csms s a newly classfed short message watng to be nserted nto the feature lbrary. s the number of the features of class C. Nall s the total number of the features n the feature lbrary. Nt _ csms s the number of occurrences of t n csms. 4.4 Classfcaton Algorthm N all _ csms Nc s the number of occurrences of all features n csms. The steps of optmzed Naïve Bayes onlne SMS spam flterng algorthm s as follows. Algorthm IMNB(nsms) Input: a new short message nsms, feature lbrary L Output:the category of nsms(spam message or normal message) Extract the feature t 1 t 2... t n from nsms For each C = cat_d n SMS_category table: Read PC ( ), Pt ( C) from L, compute PC ( nsms) = n PC ( ) Pt ( C) Fnd C, C where j = 1 n PC ( ) Pt ( C) = 1 P( C nsms) = arg max P( C nsms) and cat_ d&& cat_ d PC ( nsms) = arg max PcC nsms) j If ( C nospam and Cj nospam) The class of nsms s nospam; If ( C spam and Cj spam) The class of nsms s spam; If (( C spam and Cj nospam)&&( PC ( nsms )/ PC ( j nsms ))>R) The class of nsms s spam; else the class of nsms s nospam; If ( C nospam and Cj spam) The class of nsms s nospam.

8 Journal of Computers Vol. 28, No. 3, 2017 Nsms s a new short message watng to be classfed. n s the number of features extracted from the nsms. 5 Experments and Analyss 5.1 Expermental Envronment The experment use Tny210 development board as the hardware platform whch s based on ARM CortexTM-A8. Samsung S5PV210 s the man processor. It s runnng speed can reach 1 GHZ, ts memory s 512 MB. The SMS spam flterng algorthm s runnng on the embedded Lnux operatng system, SQLte s used as the embedded database to save the feature lbrary. The host machne nstall Lnux operatng system and arm-lnux-gcc cross complaton envronment. 5.2 Testng of the Feature Lbrary sze Because of lmted storage capacty and processng power, t s mportant to nown the storage space occuped by the feature lbrary. Ths experment test the sze of feature lbrary fle durng the ncreasement of features. The results are shown n Fg. 2. When the number of features s 25, the sze of feature lbrary fle s 5120 bytes. When the number of features s 54, the sze of feature lbrary fle s 6003 bytes. When the number of features s 82, the sze of feature lbrary fle s 7168 bytes. When the number of features reaches 170, the sze of feature lbrary fle s up to bytes. If 10 features are extracted from each of short messages, the growth of the feature lbrary fle s about bytes at every ncrease of one SMS sample. When 2000 short messages are nserted nto the feature lbrary, the fle sze of t s about K. It s completely feasble to run on the general moble platforms. Ths s acceptable because t can completely run on a moble platform even the cheapest moble phones. 5.3 Testng of the Accuracy Rate of the Algorthm Fg. 2. The feature lbrary sze versus feature number There s no publc corpus for Chnese SMS spam flterng. The experment uses the SMS samples collected by ourselves, ncludng 706 spam messages and 813 normal messages. The SMS samples s dvded nto tranng set and testng set. The tranng set ncludes 400 spam messages and 430 normal messages. The testng set conssts of 306 spam messages and 383 normal messages. Ths artcle uses an evaluaton ndex wdely used n spam flterng. The flterng accuracy s calculated by the formula (10). OA = ( TP + TN) ( TP + FP + FN + TN ) TP s the number of spam messages correctly classfed. TN s the number of normal messages correctly classfed. FP s the number of normal messages msclassfed as spam messages. (10) 181

9 Optmzng Naïve Bayes Algorthm for SMS Spam Flterng on Moble Phone to Reduce the Consumpton of Resources FN s the number of spam messages msclassfed as normal SMS. The accuracy rate of the optmzed Naïve Bayes SMS spam flterng algorthm proposed n ths paper s tested. The result s shown n Fg. 3. When the number of the features extracted from each of messages s between 9 to 13, the average accuracy rates of the algorthm are all over 90%, the accuracy rate reaches a maxmum when the number of features s 11. When the number of features s more than 13, the accuracy rate shows a downward trend, whch means that features wth low weght value reduce the performance of the flterng algorthm. Fg. 3. Accuracy rate vs. number of features 5.4 Testng of Tme Performance of the Flterng System The 689 short messages n the testng set are classfed usng the algorthm proposed n ths paper. The classfcaton tme of each message s recorded. Its value range from 0.05 seconds to 0.18 seconds, wth an average tme of 0.1 seconds. For comparson, the tradtonal Bayesan algorthm s also tested on tny210 platform, the average classfcaton tme of a messages s 1.1 seconds. Because the mproved algorthm omts a lot of calculaton, the processng speed s greatly mproved. We gnore the runnng tme of the feature lbrary updatng algorthm because t s assumed that the updatng could happen offlne and hence the tme spent by the updatng algorthm s not relevant. 6 Concluson and Future Wor Ths paper proposed a SMS spam flterng algorthm that requres mnmal resources and reflects user s ndvdual preference on an ndependent moble phone whch does not depend on another server or moble operators. It obtaned the reasonable accuracy, low storage consumpton and quc flterng speed. In addton to these, our approach ensures securty and prvacy because the flterng algorthm runs on a moble phone and each user has ts own feature lbrary, spammers have no way to access to the feature lbrary. The lmtatons of the algorthm proposed n ths paper s that t apples only to the text nformaton flterng. Feature selecton plays a very mportant role n short message classfcaton, the more effectve feature extracton and feature selecton approaches reman as nterestng future wors. Acnowledgement The research s supported by hgher school scentfc research project of Gansu Provnce (Grant No.2013A-127). The authors would le to than the Assocate Edtor and the revewers for ther valuable comments. References [1] Yesy YORK News, The global moble phone users wll amount to 4.8 bllon by the end of 2016, < 182

10 Journal of Computers Vol. 28, No. 3, / shtml>, [2] M.A. Balubad, U. Manzoor, B. Zafar, A. Quresh, N. Ghan, Ontology based SMS controller for smart phones, Internatonal Journal of Advanced Computer Scence and Applcatons 6(1)(2015) [3] T.A. Almeda, J.M.G. Hdalgo, A. Yamaam, Contrbutons to the study of SMS spam flterng: new collecton and results, n: Proc. the 11th ACM Symposum on Document Engneerng n DocEng 11, [4] Xnhua Net, Chna Moble Internet securty report n the frst half of Internet securty center, < xnhuanet.com/tech/ /29/c_ htm>, [5] H. Zhang, W. Wang, Applcaton of Bayesan method to spam SMS flterng, n: Proc. Internatonal Conference on Informaton Engneerng and Computer Scence, [6] I. Santos, C. Laorden, B. Sanz, P. G. Brngas, Reversng the effects of toensaton attacs aganst content-based spam flters, Internatonal Journal of Securty and Networs 8(2)(2013) [7] T.M. Mahmoud, A.M. Mahfouz, SMS spam flterng technque based on artfcal mmune system, Internatonal Journal of Computer Scence Issues 9(2)(2012) [8] L. Sh, Q. Wand, X. Ma, M. Weng, H. Qao, Spam emal classcaton usng decson tree ensemble, Journal of Computatonal Informaton Systems 8(3)(2012) [9] A. Rodan, H. Fars, J. Alqatawna, Optmzng feedforward neural networs usng bogeography based optmzaton for e- mal spam dentfcaton, Internatonal Journal Communcatons, Networ and System Scences 9(1)(2016) [10] A.A., Al-Hasan, E.-S.M. El-Alfy, Dendrtc cell algorthm for moble phone spam flterng, Proceda Computer Scence 52(2015) [11] I. Joe, H. Shm, An SMS spam flterng system usng support vector machne, n: Proc. Future Generaton Informaton Technology, [12] M. Healy, S. Delany, A. Zamolotsh, An assessment of case-based reasonng for short text message classfcaton, n: Proc. AICS 04, [13] J.M.G. Hdalgo, G.C. Brngas, E.P. Sanz, F.C. Garc, Content based SMS spam flterng, n: Proc. ACM Symposum on Document Engneerng, [14] T.B Shah, A. Yadav, Moble SMS spam flterng for Nepal text usng naïve Bayesan and support vector machne, Internatonal Journal of Intellgence Scence 4(1)(2014) [15] T.A. Almeda, J.M.G. Hdalgo, A. Yamaam, Contrbutons to the study of SMS spam flterng: new collecton and results, n: Proc. the 2011 ACM Symposum on Document Engneerng (ACM DOCENG 11), [16] H. Xa, P. Luo, M. Gao, Flterng algorthm for SMS spam based on desgn scence, Computer Engneerng and Applcatons 52(15)(2016) [17] Y.-H. Xu, M.-Y. Lu, Content-based jun short message flterng for moble phone, Journal of Bejng Informaton Scence and Technology Unversty 28(1)(2013). [18] H. Song, H. Km, K. Han, J. Cho, et al. A Sub-2 db NF dual-band CMOS LNA for CDMA/WCDMA applcatons, IEEE Mcrowave and Wreless Components Letter 18(3)(2008) [19] K. Yadav, A. Malhotra, P. Kumaraguru, R. Khurana, D.K. Sngh, Tae control over your SMSes: a real-world evaluaton of a moble-based spam SMS flterng system, < (accessed ). 183

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Security Enhanced Dynamic ID based Remote User Authentication Scheme for Multi-Server Environments

Security Enhanced Dynamic ID based Remote User Authentication Scheme for Multi-Server Environments Internatonal Journal of u- and e- ervce, cence and Technology Vol8, o 7 0), pp7-6 http://dxdoorg/07/unesst087 ecurty Enhanced Dynamc ID based Remote ser Authentcaton cheme for ult-erver Envronments Jun-ub

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Collaboratively Regularized Nearest Points for Set Based Recognition

Collaboratively Regularized Nearest Points for Set Based Recognition Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

CSCI 5417 Information Retrieval Systems Jim Martin!

CSCI 5417 Information Retrieval Systems Jim Martin! CSCI 5417 Informaton Retreval Systems Jm Martn! Lecture 11 9/29/2011 Today 9/29 Classfcaton Naïve Bayes classfcaton Ungram LM 1 Where we are... Bascs of ad hoc retreval Indexng Term weghtng/scorng Cosne

More information

Available online at Available online at Advanced in Control Engineering and Information Science

Available online at   Available online at   Advanced in Control Engineering and Information Science Avalable onlne at wwwscencedrectcom Avalable onlne at wwwscencedrectcom Proceda Proceda Engneerng Engneerng 00 (2011) 15000 000 (2011) 1642 1646 Proceda Engneerng wwwelsevercom/locate/proceda Advanced

More information

Spam Filtering Based on Support Vector Machines with Taguchi Method for Parameter Selection

Spam Filtering Based on Support Vector Machines with Taguchi Method for Parameter Selection E-mal Spam Flterng Based on Support Vector Machnes wth Taguch Method for Parameter Selecton We-Chh Hsu, Tsan-Yng Yu E-mal Spam Flterng Based on Support Vector Machnes wth Taguch Method for Parameter Selecton

More information

A User Selection Method in Advertising System

A User Selection Method in Advertising System Int. J. Communcatons, etwork and System Scences, 2010, 3, 54-58 do:10.4236/jcns.2010.31007 Publshed Onlne January 2010 (http://www.scrp.org/journal/jcns/). A User Selecton Method n Advertsng System Shy

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

Impact of a New Attribute Extraction Algorithm on Web Page Classification

Impact of a New Attribute Extraction Algorithm on Web Page Classification Impact of a New Attrbute Extracton Algorthm on Web Page Classfcaton Gösel Brc, Banu Dr, Yldz Techncal Unversty, Computer Engneerng Department Abstract Ths paper ntroduces a new algorthm for dmensonalty

More information

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques

More information

Implementation Naïve Bayes Algorithm for Student Classification Based on Graduation Status

Implementation Naïve Bayes Algorithm for Student Classification Based on Graduation Status Internatonal Journal of Appled Busness and Informaton Systems ISSN: 2597-8993 Vol 1, No 2, September 2017, pp. 6-12 6 Implementaton Naïve Bayes Algorthm for Student Classfcaton Based on Graduaton Status

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

Evaluation of an Enhanced Scheme for High-level Nested Network Mobility

Evaluation of an Enhanced Scheme for High-level Nested Network Mobility IJCSNS Internatonal Journal of Computer Scence and Network Securty, VOL.15 No.10, October 2015 1 Evaluaton of an Enhanced Scheme for Hgh-level Nested Network Moblty Mohammed Babker Al Mohammed, Asha Hassan.

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Audio Content Classification Method Research Based on Two-step Strategy

Audio Content Classification Method Research Based on Two-step Strategy (IJACSA) Internatonal Journal of Advanced Computer Scence and Applcatons, Audo Content Classfcaton Method Research Based on Two-step Strategy Sume Lang Department of Computer Scence and Technology Chongqng

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

Parallel Implementation of Classification Algorithms Based on Cloud Computing Environment

Parallel Implementation of Classification Algorithms Based on Cloud Computing Environment TELKOMNIKA, Vol.10, No.5, September 2012, pp. 1087~1092 e-issn: 2087-278X accredted by DGHE (DIKTI), Decree No: 51/Dkt/Kep/2010 1087 Parallel Implementaton of Classfcaton Algorthms Based on Cloud Computng

More information

Virtual Machine Migration based on Trust Measurement of Computer Node

Virtual Machine Migration based on Trust Measurement of Computer Node Appled Mechancs and Materals Onlne: 2014-04-04 ISSN: 1662-7482, Vols. 536-537, pp 678-682 do:10.4028/www.scentfc.net/amm.536-537.678 2014 Trans Tech Publcatons, Swtzerland Vrtual Machne Mgraton based on

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Experiments in Text Categorization Using Term Selection by Distance to Transition Point

Experiments in Text Categorization Using Term Selection by Distance to Transition Point Experments n Text Categorzaton Usng Term Selecton by Dstance to Transton Pont Edgar Moyotl-Hernández, Héctor Jménez-Salazar Facultad de Cencas de la Computacón, B. Unversdad Autónoma de Puebla, 14 Sur

More information

Arabic Text Classification Using N-Gram Frequency Statistics A Comparative Study

Arabic Text Classification Using N-Gram Frequency Statistics A Comparative Study Arabc Text Classfcaton Usng N-Gram Frequency Statstcs A Comparatve Study Lala Khresat Dept. of Computer Scence, Math and Physcs Farlegh Dcknson Unversty 285 Madson Ave, Madson NJ 07940 Khresat@fdu.edu

More information

Solving two-person zero-sum game by Matlab

Solving two-person zero-sum game by Matlab Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by

More information

Improving anti-spam filtering, based on Naive Bayesian and neural networks in multi-agent filters

Improving anti-spam filtering, based on Naive Bayesian and neural networks in multi-agent filters J. Appl. Envron. Bol. Sc., 5(7S)381-386, 2015 2015, TextRoad Publcaton ISSN: 2090-4274 Journal of Appled Envronmental and Bologcal Scences www.textroad.com Improvng ant-spam flterng, based on Nave Bayesan

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(6):2860-2866 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A selectve ensemble classfcaton method on mcroarray

More information

Announcements. Supervised Learning

Announcements. Supervised Learning Announcements See Chapter 5 of Duda, Hart, and Stork. Tutoral by Burge lnked to on web page. Supervsed Learnng Classfcaton wth labeled eamples. Images vectors n hgh-d space. Supervsed Learnng Labeled eamples

More information

Research of Dynamic Access to Cloud Database Based on Improved Pheromone Algorithm

Research of Dynamic Access to Cloud Database Based on Improved Pheromone Algorithm , pp.197-202 http://dx.do.org/10.14257/dta.2016.9.5.20 Research of Dynamc Access to Cloud Database Based on Improved Pheromone Algorthm Yongqang L 1 and Jn Pan 2 1 (Software Technology Vocatonal College,

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

Application of Improved Fish Swarm Algorithm in Cloud Computing Resource Scheduling

Application of Improved Fish Swarm Algorithm in Cloud Computing Resource Scheduling , pp.40-45 http://dx.do.org/10.14257/astl.2017.143.08 Applcaton of Improved Fsh Swarm Algorthm n Cloud Computng Resource Schedulng Yu Lu, Fangtao Lu School of Informaton Engneerng, Chongqng Vocatonal Insttute

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET 1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School

More information

An Improved Neural Network Algorithm for Classifying the Transmission Line Faults

An Improved Neural Network Algorithm for Classifying the Transmission Line Faults 1 An Improved Neural Network Algorthm for Classfyng the Transmsson Lne Faults S. Vaslc, Student Member, IEEE, M. Kezunovc, Fellow, IEEE Abstract--Ths study ntroduces a new concept of artfcal ntellgence

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

BIN XIA et al: AN IMPROVED K-MEANS ALGORITHM BASED ON CLOUD PLATFORM FOR DATA MINING

BIN XIA et al: AN IMPROVED K-MEANS ALGORITHM BASED ON CLOUD PLATFORM FOR DATA MINING An Improved K-means Algorthm based on Cloud Platform for Data Mnng Bn Xa *, Yan Lu 2. School of nformaton and management scence, Henan Agrcultural Unversty, Zhengzhou, Henan 450002, P.R. Chna 2. College

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

Improved Resource Allocation Algorithms for Practical Image Encoding in a Ubiquitous Computing Environment

Improved Resource Allocation Algorithms for Practical Image Encoding in a Ubiquitous Computing Environment JOURNAL OF COMPUTERS, VOL. 4, NO. 9, SEPTEMBER 2009 873 Improved Resource Allocaton Algorthms for Practcal Image Encodng n a Ubqutous Computng Envronment Manxong Dong, Long Zheng, Kaoru Ota, Song Guo School

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

CAN COMPUTERS LEARN FASTER? Seyda Ertekin Computer Science & Engineering The Pennsylvania State University

CAN COMPUTERS LEARN FASTER? Seyda Ertekin Computer Science & Engineering The Pennsylvania State University CAN COMPUTERS LEARN FASTER? Seyda Ertekn Computer Scence & Engneerng The Pennsylvana State Unversty sertekn@cse.psu.edu ABSTRACT Ever snce computers were nvented, manknd wondered whether they mght be made

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines A Modfed Medan Flter for the Removal of Impulse Nose Based on the Support Vector Machnes H. GOMEZ-MORENO, S. MALDONADO-BASCON, F. LOPEZ-FERRERAS, M. UTRILLA- MANSO AND P. GIL-JIMENEZ Departamento de Teoría

More information

Remote Sensing Image Retrieval Algorithm based on MapReduce and Characteristic Information

Remote Sensing Image Retrieval Algorithm based on MapReduce and Characteristic Information Remote Sensng Image Retreval Algorthm based on MapReduce and Characterstc Informaton Zhang Meng 1, 1 Computer School, Wuhan Unversty Hube, Wuhan430097 Informaton Center, Wuhan Unversty Hube, Wuhan430097

More information

Research and Application of Fingerprint Recognition Based on MATLAB

Research and Application of Fingerprint Recognition Based on MATLAB Send Orders for Reprnts to reprnts@benthamscence.ae The Open Automaton and Control Systems Journal, 205, 7, 07-07 Open Access Research and Applcaton of Fngerprnt Recognton Based on MATLAB Nng Lu* Department

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Association Rule Mining with Parallel Frequent Pattern Growth Algorithm on Hadoop

Association Rule Mining with Parallel Frequent Pattern Growth Algorithm on Hadoop Assocaton Rule Mnng wth Parallel Frequent Pattern Growth Algorthm on Hadoop Zhgang Wang 1,2, Guqong Luo 3,*,Yong Hu 1,2, ZhenZhen Wang 1 1 School of Software Engneerng Jnlng Insttute of Technology Nanng,

More information

An IPv6-Oriented IDS Framework and Solutions of Two Problems

An IPv6-Oriented IDS Framework and Solutions of Two Problems An IPv6-Orented IDS Framework and Solutons of Two Problems We LI, Zhy FANG, Peng XU and ayang SI,2 School of Computer Scence and Technology, Jln Unversty Changchun, 3002, P.R.Chna 2 Graduate Unversty of

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc. [Type text] [Type text] [Type text] ISSN : 0974-74 Volume 0 Issue BoTechnology 04 An Indan Journal FULL PAPER BTAIJ 0() 04 [684-689] Revew on Chna s sports ndustry fnancng market based on market -orented

More information

Classification / Regression Support Vector Machines

Classification / Regression Support Vector Machines Classfcaton / Regresson Support Vector Machnes Jeff Howbert Introducton to Machne Learnng Wnter 04 Topcs SVM classfers for lnearly separable classes SVM classfers for non-lnearly separable classes SVM

More information

Deep Classification in Large-scale Text Hierarchies

Deep Classification in Large-scale Text Hierarchies Deep Classfcaton n Large-scale Text Herarches Gu-Rong Xue Dkan Xng Qang Yang 2 Yong Yu Dept. of Computer Scence and Engneerng Shangha Jao-Tong Unversty {grxue, dkxng, yyu}@apex.sjtu.edu.cn 2 Hong Kong

More information

Fuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System

Fuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System Fuzzy Modelng of the Complexty vs. Accuracy Trade-off n a Sequental Two-Stage Mult-Classfer System MARK LAST 1 Department of Informaton Systems Engneerng Ben-Guron Unversty of the Negev Beer-Sheva 84105

More information

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan

More information

A Robust Webpage Information Hiding Method Based on the Slash of Tag

A Robust Webpage Information Hiding Method Based on the Slash of Tag Advanced Engneerng Forum Onlne: 2012-09-26 ISSN: 2234-991X, Vols. 6-7, pp 361-366 do:10.4028/www.scentfc.net/aef.6-7.361 2012 Trans Tech Publcatons, Swtzerland A Robust Webpage Informaton Hdng Method Based

More information

Face Recognition Based on SVM and 2DPCA

Face Recognition Based on SVM and 2DPCA Vol. 4, o. 3, September, 2011 Face Recognton Based on SVM and 2DPCA Tha Hoang Le, Len Bu Faculty of Informaton Technology, HCMC Unversty of Scence Faculty of Informaton Scences and Engneerng, Unversty

More information

An Improved Image Segmentation Algorithm Based on the Otsu Method

An Improved Image Segmentation Algorithm Based on the Otsu Method 3th ACIS Internatonal Conference on Software Engneerng, Artfcal Intellgence, Networkng arallel/dstrbuted Computng An Improved Image Segmentaton Algorthm Based on the Otsu Method Mengxng Huang, enjao Yu,

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

Backpropagation: In Search of Performance Parameters

Backpropagation: In Search of Performance Parameters Bacpropagaton: In Search of Performance Parameters ANIL KUMAR ENUMULAPALLY, LINGGUO BU, and KHOSROW KAIKHAH, Ph.D. Computer Scence Department Texas State Unversty-San Marcos San Marcos, TX-78666 USA ae049@txstate.edu,

More information

An Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems

An Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems S. J and D. Shn: An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems 2355 An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems Seunggu J and Dongkun Shn, Member,

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Video Proxy System for a Large-scale VOD System (DINA)

Video Proxy System for a Large-scale VOD System (DINA) Vdeo Proxy System for a Large-scale VOD System (DINA) KWUN-CHUNG CHAN #, KWOK-WAI CHEUNG *# #Department of Informaton Engneerng *Centre of Innovaton and Technology The Chnese Unversty of Hong Kong SHATIN,

More information

A New Feature of Uniformity of Image Texture Directions Coinciding with the Human Eyes Perception 1

A New Feature of Uniformity of Image Texture Directions Coinciding with the Human Eyes Perception 1 A New Feature of Unformty of Image Texture Drectons Concdng wth the Human Eyes Percepton Xng-Jan He, De-Shuang Huang, Yue Zhang, Tat-Mng Lo 2, and Mchael R. Lyu 3 Intellgent Computng Lab, Insttute of Intellgent

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

An Anti-Noise Text Categorization Method based on Support Vector Machines *

An Anti-Noise Text Categorization Method based on Support Vector Machines * An Ant-Nose Text ategorzaton Method based on Support Vector Machnes * hen Ln, Huang Je and Gong Zheng-Hu School of omputer Scence, Natonal Unversty of Defense Technology, hangsha, 410073, hna chenln@nudt.edu.cn,

More information

Application of Clustering Algorithm in Big Data Sample Set Optimization

Application of Clustering Algorithm in Big Data Sample Set Optimization Applcaton of Clusterng Algorthm n Bg Data Sample Set Optmzaton Yutang Lu 1, Qn Zhang 2 1 Department of Basc Subjects, Henan Insttute of Technology, Xnxang 453002, Chna 2 School of Mathematcs and Informaton

More information

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures A Novel Adaptve Descrptor Algorthm for Ternary Pattern Textures Fahuan Hu 1,2, Guopng Lu 1 *, Zengwen Dong 1 1.School of Mechancal & Electrcal Engneerng, Nanchang Unversty, Nanchang, 330031, Chna; 2. School

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

THE CONDENSED FUZZY K-NEAREST NEIGHBOR RULE BASED ON SAMPLE FUZZY ENTROPY

THE CONDENSED FUZZY K-NEAREST NEIGHBOR RULE BASED ON SAMPLE FUZZY ENTROPY Proceedngs of the 20 Internatonal Conference on Machne Learnng and Cybernetcs, Guln, 0-3 July, 20 THE CONDENSED FUZZY K-NEAREST NEIGHBOR RULE BASED ON SAMPLE FUZZY ENTROPY JUN-HAI ZHAI, NA LI, MENG-YAO

More information

A fast algorithm for color image segmentation

A fast algorithm for color image segmentation Unersty of Wollongong Research Onlne Faculty of Informatcs - Papers (Arche) Faculty of Engneerng and Informaton Scences 006 A fast algorthm for color mage segmentaton L. Dong Unersty of Wollongong, lju@uow.edu.au

More information

GA-Based Learning Algorithms to Identify Fuzzy Rules for Fuzzy Neural Networks

GA-Based Learning Algorithms to Identify Fuzzy Rules for Fuzzy Neural Networks Seventh Internatonal Conference on Intellgent Systems Desgn and Applcatons GA-Based Learnng Algorthms to Identfy Fuzzy Rules for Fuzzy Neural Networks K Almejall, K Dahal, Member IEEE, and A Hossan, Member

More information

Professional competences training path for an e-commerce major, based on the ISM method

Professional competences training path for an e-commerce major, based on the ISM method World Transactons on Engneerng and Technology Educaton Vol.14, No.4, 2016 2016 WIETE Professonal competences tranng path for an e-commerce maor, based on the ISM method Ru Wang, Pn Peng, L-gang Lu & Lng

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Load-Balanced Anycast Routing

Load-Balanced Anycast Routing Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

The Study of Remote Sensing Image Classification Based on Support Vector Machine

The Study of Remote Sensing Image Classification Based on Support Vector Machine Sensors & Transducers 03 by IFSA http://www.sensorsportal.com The Study of Remote Sensng Image Classfcaton Based on Support Vector Machne, ZHANG Jan-Hua Key Research Insttute of Yellow Rver Cvlzaton and

More information

Querying by sketch geographical databases. Yu Han 1, a *

Querying by sketch geographical databases. Yu Han 1, a * 4th Internatonal Conference on Sensors, Measurement and Intellgent Materals (ICSMIM 2015) Queryng by sketch geographcal databases Yu Han 1, a * 1 Department of Basc Courses, Shenyang Insttute of Artllery,

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

Analysis on the Workspace of Six-degrees-of-freedom Industrial Robot Based on AutoCAD

Analysis on the Workspace of Six-degrees-of-freedom Industrial Robot Based on AutoCAD Analyss on the Workspace of Sx-degrees-of-freedom Industral Robot Based on AutoCAD Jn-quan L 1, Ru Zhang 1,a, Fang Cu 1, Q Guan 1 and Yang Zhang 1 1 School of Automaton, Bejng Unversty of Posts and Telecommuncatons,

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some materal adapted from Mohamed Youns, UMBC CMSC 611 Spr 2003 course sldes Some materal adapted from Hennessy & Patterson / 2003 Elsever Scence Performance = 1 Executon tme Speedup = Performance (B)

More information

VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) , Fax: (370-5) ,

VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) , Fax: (370-5) , VRT012 User s gude V0.1 Thank you for purchasng our product. We hope ths user-frendly devce wll be helpful n realsng your deas and brngng comfort to your lfe. Please take few mnutes to read ths manual

More information

Pruning Training Corpus to Speedup Text Classification 1

Pruning Training Corpus to Speedup Text Classification 1 Prunng Tranng Corpus to Speedup Text Classfcaton Jhong Guan and Shugeng Zhou School of Computer Scence, Wuhan Unversty, Wuhan, 430079, Chna hguan@wtusm.edu.cn State Key Lab of Software Engneerng, Wuhan

More information