CAN COMPUTERS LEARN FASTER?
Seyda Ertekin
Computer Science & Engineering, The Pennsylvania State University


ABSTRACT

Ever since computers were invented, mankind has wondered whether they might be made to learn. In the field known as data mining, text categorization has been extensively studied by the machine learning community, as it is a classic example of supervised learning by machines. In supervised learning, computers learn a categorization function which is calculated using the information from the labeled training data provided by the supervisor. They can then classify any unlabeled document into predefined categories. It is known that the learner's approximation of the function improves with the amount of training data supplied to it. However, supplying training data to the learning machine is expensive in terms of time and money, since it is generally done by humans. In this paper, our motivation is to decrease the number of training examples needed by the computers in order to reduce human intervention as much as possible. This paper describes a very efficient method for selecting the training data from the unlabeled data pool to train the classifier machines. We present experimental results showing that with carefully selected training data, the need for labeled data can be significantly reduced while the learner's classification performance is preserved, and even increased in some cases.

1. INTRODUCTION

Machine learning is the study of computer algorithms that automatically improve a machine's ability to act through experience. Applications of machine learning range from design principles for a new generation of robots to intelligent search engines, which have become indispensable tools in almost everybody's daily and professional lives. Engineers who work on designing intelligent machines have turned to machine learning methods because they are more effective and more practical than having to write computer code for every scenario a machine might encounter. In recent years, there has been an explosion in computation and information technologies.
Moreover, the amount of textual information available in electronic format has increased significantly due to the wide use of computers and improved storage facilities. For example, e-businesses are challenged with understanding and finding useful patterns in the terabytes of customer information that they collect. In addition to stored data, the World Wide Web has become an indispensable source of information in people's lives. The vast amount of news data accumulating from the daily online news portals has already attracted the attention of data miners. The need to access the right and relevant information in the shortest time, and the desire to reach organized and categorized text, have increased tremendously. As a consequence, there is increasing interest in machine learning methods that perform various natural language processing tasks in order to make efficient use of these large amounts of information. One way for computers to learn to categorize documents automatically is supervised learning. In supervised learning, it is generally accepted that the almost certain way of improving categorization performance is to increase the size of the training dataset. The reason, in the text domain, is that the sparsity and variety of language make it impossible to construct a training dataset that covers all possible cases.

The primary motivation for active learning comes from the difficulty and cost of obtaining labeled training examples, since those samples have to be labeled by human experts. In some domains, such as industrial process modeling, a single training example may require several days and cost thousands of dollars to generate. In other domains, like text categorization or email filtering, obtaining examples is not expensive, but it may require the user to spend hours on the tedious work of labeling them. In order to avoid the excessive time and money spent on creating labeled training data, active learning is used to find the most informative samples among the available unlabeled data.
In active learning, instead of randomly picking documents and manually labeling them for our training set, we have the option of more carefully choosing (or querying) documents from the unlabeled pool. We can choose our next sample from the unlabeled data pool based upon the answers to our previous queries [9], [7].

Figure 1. Passive Learner

Experimental results show that, by carefully selecting the training data, computers need less training data to learn how to classify documents into predefined categories. Less training data means less time spent by human supervisors on data labeling, and the learning function is calculated faster, since training time is directly proportional to the number of training examples. In the literature, there has been a great deal of research on active learning. For example, Cohn et al. [2] minimize the variance component of the estimated generalization error. Freund et al. [4] employ a committee of classifiers and query a point whenever the committee members disagree. We use a well-known machine learning algorithm, called support vector machines (SVMs), to select the most informative documents for active learning from the unlabeled pool of documents, without knowing their labels. Similar work by Tong et al. [9] also uses SVMs to select queries that minimize the version space size, as we do.

The remainder of the paper is organized as follows. Sections 2 and 3 describe supervised learning and active learning, respectively. Section 4 briefly introduces the SVM algorithm. Section 5 provides the basics of using SVMs for active learning. Experiments on real text data are provided in Section 6. Section 7 discusses the experimental results and future work. We conclude by summarizing the contribution of the paper to the literature, society and industry, with possible applications of the method to emerging applications such as the hierarchically structured sets of categories used by Yahoo and Google.

2. SUPERVISED LEARNING

The automated categorization (or classification) of texts into topical categories has a long history, dating back at least to the early 1960s. In the beginning, the most effective approach to the problem seemed to be manually building automatic classifiers by means of knowledge engineering techniques.
Those techniques consist of manually defining a set of rules encoding expert knowledge on how to classify documents under a given set of categories. However, manually defined rules are generally domain dependent and perform poorly on newly encountered patterns in unseen data. In the 1990s, with the booming production and availability of online documents, automated text categorization witnessed smarter methods such as supervised learning. In supervised learning, a general inductive process, called the learner, automatically builds a classifier by learning the characteristics of the categories from a set of previously classified documents. For this purpose, first, a training dataset is formed by gathering a significant quantity of data that is randomly sampled from the underlying population distribution. In traditional supervised learning, these large numbers of training examples have to be prepared in advance, and the documents' categories are generally labeled manually. With the labeled training data, we use a learner to generate a mapping from documents to topics. We call this mapping a classifier. We can then use the classifier to label new, unseen documents. This methodology is called passive supervised learning [9]. A passive learner (Fig. 1) receives a data set from the world and then outputs a classifier (or model). Often, the most time-consuming and costly task in these applications is the gathering of data. In many cases, we have limited resources for collecting such data. On the other hand, in some cases, such as web page categorization, we can collect the necessary data easily, but it can take a tremendous amount of time to label it. Hence, it is particularly valuable to determine ways in which we can make the most of these resources.

Figure 2. Active Learner

3. ACTIVE LEARNING

Active learning (Fig. 2) differs from passive learning (Fig. 1) in that the learning algorithm itself attempts to select the most informative examples for training.
Since supervised labeling of data is expensive, active learning attempts to reduce the human effort needed to learn an accurate result by selecting only the most informative examples for labeling. In [1], active learning is defined as any form of learning in which the learning algorithm has some control over which part of the input space it receives information from. An active learning strategy therefore allows the learner* to dynamically select training examples, during training, from a candidate set received from the supervisor**. The learner capitalizes on currently attained knowledge to select those examples from the candidate training set that are most likely to solve the problem, or that will lead to a maximum decrease in error. Rather than passively accepting training patterns from the supervisor, the system is allowed to have some deterministic control over which examples to accept, and to guide the search for the most informative patterns. In pool-based active learning for classification [6], the learner has access to a pool of unlabeled data and can request the true class label for a certain number of instances in the pool. In many domains this is a reasonable approach, since a large quantity of unlabeled data is readily available. The main issue in active learning is finding a way to choose good requests, or queries, from the pool.

* The learner can also be thought of as a computer or a machine.
** The supervisor can be thought of as a human who provides labeled training data to the computer.

Figure 3. Possible hyperplanes in the binary classification setting

4. SUPPORT VECTOR MACHINES

We employ Support Vector Machines (SVMs) as our base learning algorithm because of their effectiveness in many learning tasks, particularly those involving text classification [3], [5]. We consider SVMs in the binary classification setting. As can be seen from Figure 3, when two differently labeled data sets are linearly separable, one can find an infinite number of hyperplanes that separate them. SVMs find the hyperplane that separates the training data by a maximal margin (Fig. 4). All vectors lying on one side of the hyperplane are labeled −1 (negative), and all vectors lying on the other side are labeled +1 (positive). The training instances that lie closest to the hyperplane are called support vectors. A formal definition of SVMs is as follows. We are given training data {x_1, ..., x_n} that are vectors in some space X ⊆ R^d, together with their labels {y_1, ..., y_n}, where y_i ∈ {−1, 1}. SVMs allow us to project the original training data in space X to a higher dimensional feature space F via a Mercer kernel operator K. Generally, this transformation is done when the data is linearly inseparable in the original lower dimensional space.

Figure 4. Support vector machine hyperplanes separate the training data by a maximal margin
In other words, we consider the set of classifiers of the form:

    f(x) = Σ_{i=1..n} α_i K(x_i, x)

If f(x) ≥ 0 we classify x as +1; otherwise we classify x as −1. When K satisfies Mercer's condition we can write K(u, v) = Φ(u) · Φ(v), where Φ : X → F and "·" denotes an inner product. We can then rewrite f as:

    f(x) = w · Φ(x),  where  w = Σ_{i=1..n} α_i Φ(x_i)        (1)

In equation (1), w is the normal vector of the hyperplane that separates the data with the maximal margin. The SVM computes the α_i that correspond to the maximal margin hyperplane in F. By choosing different kernel functions we can implicitly project the training data from X into different feature spaces in which we can perform linear classification. (A hyperplane in F maps to a more complex decision boundary in the original space X.) Support vector machine classifiers have met with significant success in numerous real-world classification tasks. We present an algorithm for performing active learning with SVMs, apply it to the text categorization domain, and show that it can significantly reduce the need for training data.

5. ACTIVE LEARNING WITH SVMS

Support vector machines are generally applied using a randomly selected training set classified in advance. The theoretical advantages and empirical success of SVMs make them an attractive choice as a learning method to use with active learning.
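As a concrete illustration of the classifier form above, the following sketch evaluates f(x) = Σ_i α_i K(x_i, x) and takes its sign. It is a minimal sketch, not the paper's implementation; the RBF kernel choice and the toy coefficients are assumptions for illustration only.

```python
import math

def rbf_kernel(u, v, gamma=1.0):
    """Gaussian (RBF) Mercer kernel: K(u, v) = exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def decision(x, support_vectors, alphas, kernel=rbf_kernel):
    """Evaluate f(x) = sum_i alpha_i * K(x_i, x) over the support vectors."""
    return sum(a * kernel(sv, x) for a, sv in zip(alphas, support_vectors))

def classify(x, support_vectors, alphas, kernel=rbf_kernel):
    """Classify x as +1 when f(x) >= 0, and as -1 otherwise."""
    return 1 if decision(x, support_vectors, alphas, kernel) >= 0 else -1
```

For example, with two toy support vectors at 0 and 2 on the real line and coefficients +1 and −1, points near 0 fall on the positive side and points near 2 on the negative side.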

Given an unlabeled pool U, an active learner has three components: (f, q, X). The first component is a classifier, f : X → {−1, +1}, trained on the current set of labeled data X. The second component, q(X), is the querying function that, given a current labeled set X, decides which instance in U to query next. The active learner can return a classifier f after each query or after some fixed number of queries. This brings us to the issue of how to choose the next unlabeled instance to query. We perform our experiments in three settings:

Random pick: This method simply chooses the next query point at random from the unlabeled pool. It reflects what happens in the regular passive learning setting.

Simple margin active learning: This method chooses the document closest to the current hyperplane and asks for its label. Clearly, selecting unlabeled instances far away from the hyperplane is not helpful, since their class memberships are certain. The most informative instances for refining the hyperplane are the unlabeled instances near the hyperplane, within the margin. The documents that the simple margin algorithm focuses on lie in the yellow region (band) in Figure 4.

Simple random active learning: This final method selects the next query in the same way as the simple margin method. However, instead of examining the whole unlabeled pool to see which instance is closest to the hyperplane, it examines only a small constant number of randomly chosen examples. The randomized search first samples M training examples and selects the best one among them. We choose M = 50 in our experiments because it can be proved mathematically that the best among 50 randomly drawn training examples has a 95% chance of belonging to the best 5% of examples in the whole training set. The proof is beyond the scope of this paper; it can be found in [8].

6. EXPERIMENTS

The empirical evaluation is done on a collection of news stories, the Reuters corpus, which is the standard real-world benchmark for natural language processing, information retrieval and machine learning systems.
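The three query settings described in Section 5 can be sketched as follows. This is a minimal Python sketch, not the paper's implementation; the `distance` argument is assumed to be supplied by the learner and stands for the signed distance of a document's vector to the current SVM hyperplane.

```python
import random

def random_pick(pool, rng=random):
    """Random pick: choose the next query uniformly at random (passive baseline)."""
    return rng.choice(pool)

def simple_margin(pool, distance):
    """Simple margin: query the unlabeled instance closest to the hyperplane."""
    return min(pool, key=lambda x: abs(distance(x)))

def simple_random(pool, distance, m=50, rng=random):
    """Simple random: examine only M randomly sampled candidates and
    pick the one closest to the hyperplane among them."""
    candidates = rng.sample(pool, min(m, len(pool)))
    return min(candidates, key=lambda x: abs(distance(x)))
```

In use, the query returned by one of these functions would be labeled by the supervisor, added to the training set, and the SVM retrained before the next query is issued.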
The first step in text categorization is to transform documents from strings of characters into a representation suitable for the learning algorithm and the classification task. Information retrieval research suggests that word stems work well as representation units. After preprocessing, the training corpus contains around 9000 distinct terms. This number corresponds to the dimension of the original training data space X mentioned in Section 4. Each document is represented as a vector in this space, with each distinct word it contains as a vector component. Active learning is used in the context of text classification throughout our experiments. The Reuters dataset consists of text documents which are already categorized into predefined categories. We act as if we do not know their labels, perform automatic text classification with SVMs using active learning, and compare our results with the actual labels of the documents. In each experiment, we consider the documents of one category as the positive class and all remaining documents as the negative class, which gives a binary classification setting for the SVMs. The dataset contains 9603 training and 3299 test documents. We start with ten labeled documents, 5 positive and 5 negative. This implies that our unlabeled data pool initially consists of 9593 documents. This number goes down gradually as documents from the pool are labeled and added to the training data. At each step, a query is selected from the unlabeled data pool according to the methods described in Section 5. This query is included in the training set with its label. Then the model is built with the SVM using the current training set, followed by the prediction step, in which the labels of the 3299 test documents are predicted by the classifier. By comparing the predicted labels with the actual labels we obtain classification accuracy values. An accuracy measure called the Precision-Recall Breakeven Point (PRBEP) is used throughout the experiments.
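The document representation described above can be sketched as a simple bag-of-words vectorizer. This sketch makes simplifying assumptions (whitespace tokenization and no stemming or stop-word removal, unlike the stem-based preprocessing used in the experiments):

```python
def build_vocabulary(docs):
    """Assign each distinct term one dimension of the document space X."""
    vocab = {}
    for doc in docs:
        for term in doc.lower().split():
            vocab.setdefault(term, len(vocab))
    return vocab

def to_vector(doc, vocab):
    """Represent a document as a vector of term counts in the vocabulary space."""
    vec = [0] * len(vocab)
    for term in doc.lower().split():
        if term in vocab:
            vec[vocab[term]] += 1
    return vec
```

Two toy documents with five distinct terms between them give a five-dimensional space; in the experiments above, the corresponding dimension is around 9000.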
This performance measure is chosen because it is well suited to cases where the positive and negative sets are unbalanced in size. The accuracy results for four of the ten most populated categories of the Reuters dataset are presented in Figure 5. The random pick and simple random curves are averages over ten runs. Our main interest is in the curves belonging to the active learning methods; the random pick method just reflects what happens in the regular passive learning setting. The curves reached their peak values with the simple margin active learning method after labeling only 250 documents on average. With the simple random active learning method, the peak values are reached with 350 labeled documents on average. In practice this means that users only have to label 250 documents with the simple margin method, or 350 documents with the simple random method, instead of 9603. The simple random method is evidently faster than the simple margin method: its curve reaches its peak value 6 times faster than the simple margin curve. Since the simple random method does not have to examine every document in the unlabeled pool but deals with only 50 documents, its computational complexity is much lower.
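The breakeven point used above can be computed by ranking the test documents by classifier score and taking precision at the cutoff equal to the number of true positives, where precision and recall coincide by construction. This is a standard formulation; the exact procedure is not spelled out in the text, so the sketch below is an assumption.

```python
def prbep(scores, labels):
    """Precision-Recall Breakeven Point.

    Rank documents by decreasing classifier score and take precision at
    the cutoff k = number of positives; there precision (hits / k) and
    recall (hits / n_pos) are equal, since k == n_pos.
    """
    n_pos = sum(1 for y in labels if y == 1)
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = sum(1 for _, y in ranked[:n_pos] if y == 1)
    return hits / n_pos
```

A perfect ranking (all positives scored above all negatives) gives a breakeven point of 1.0; a ranking that places only half of the positives in the top positions gives 0.5.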

7. DISCUSSIONS

Figure 5. Classification accuracy values in percentages for categories a) Grain, b) Money-fx, c) Trade, d) Crude

As we can see in Figure 5, in the active learning setups, after a certain number of labeled training examples have been added to the training set, the accuracy curves saturate at some value. In other words, adding more training data does not increase the accuracy of the classifier after some point. The reason is that, as the training set grows, the learner's knowledge about large regions of the input space becomes more and more confident, so additional samples from these regions are basically redundant; hence they do not contribute considerably to an improvement in its generalization ability. With the active learning methods, the most informative documents are already added to the training set in the earlier steps; the remaining documents in the pool are not useful, since they do not give extra information to the system. In other words, their addition to the training set does not change the already produced hyperplane, which is a well-built separator between the two classes. With the random pick method, since candidate documents for labeling are not selected wisely, it is not guaranteed that the informative instances will be added first. Therefore, in this case, the generalization ability of the classifier tends to increase more slowly as new data arrives. In the previous section, we also observed an unusual phenomenon in some of the learning curves. When training examples were added at random, generalization increased monotonically until all available examples were added. When training examples were added by the active learning method, generalization peaked at a level above the point achieved by the learner when all data had finally been added. We showed that, for some categories (Fig. 5a and 5b), better performance can be achieved from a small subset of the data than can be achieved using all available data.
Based on our observations, this situation occurs if either or both of the positive and negative sets in the classification are noisy. Noise, in classification systems, corresponds to falsely labeled documents. The peak condition in the curves will be investigated in detail in our future work. Future work can also involve determining, at run time, the point at which this peak performance is achieved, so that the system automatically stops adding examples. In this way, we can get higher classification accuracy by labeling and using only the most informative documents. We believe that the informative documents lie between the support vectors of each data set, which is represented as the yellow area in Figure 4. Thus, if we can detect the moment when the documents between the current support vectors are exhausted, we can make the system stop searching for new queries. Since documents are represented as vectors in the text domain, we can compare a newly selected document's coordinates and its distance to the hyperplane with those of the existing support vectors.
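The distance comparison just described suggests a simple stopping test on distances to the hyperplane. The sketch below is a hypothetical heuristic illustrating the future-work idea, not something implemented in the paper:

```python
def should_stop(pool_distances, sv_distances):
    """Heuristic stopping test: stop querying when no unlabeled document
    lies closer to the hyperplane than the farthest current support
    vector, i.e. the band between the support vectors is exhausted."""
    band = max(abs(d) for d in sv_distances)
    return all(abs(d) > band for d in pool_distances)
```

For instance, if the current support vectors lie at distance 1 from the hyperplane, the test fires only once every remaining pool document is farther than that.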

Considering the running times of the two active learning methods, the simple random method has an evident superiority over the simple margin method. Since it does not have to examine the whole unlabeled data pool to select a query, it seems to be a very promising active learning method for very large databases.

8. CONCLUSIONS

Collecting training data in machine learning can be difficult, expensive and time consuming. This paper advocates an approach which facilitates collecting training data. With the help of active learning, classifiers can work with less training data without any loss of accuracy. Furthermore, computers learn faster with less, but highly informative, training data. It is also worthwhile to highlight several contributions of the active learning method described in this paper. The simple margin and simple random results imply that active learning decreases the number of training documents needed by the learner machine. Considering time issues, the advantage of using less training data is twofold: first, the need for manual labeling of documents decreases, so the effort and money used for this purpose can be channeled in other directions; second, the training times of the classifier decrease, since the training time depends heavily on the number of training instances. Faster machines are always desired in every part of life. Besides, this speed increase makes it possible to deal with very large databases, which would otherwise be impossible due to the computation times. Classical active learning methods consider all of the unlabeled documents in the pool; there is a prevalent belief that all the unlabeled documents must be evaluated by the active learning process in order to select the best document to ask the user to label. The simple random method shows the contrary: regardless of the size of the whole unlabeled training data pool, at each iteration of the active learning step we can randomly select a small group of instances and find the best document to ask about within this small group. This also makes active learning applicable to very large databases.
Most importantly, it can be used in emerging applications such as the hierarchically structured sets of categories provided by companies like Yahoo and Google. Those search engine companies also work on producing directory structures for the whole World Wide Web. Their crawlers download millions of web pages to their servers every day. With the method presented in this paper, they can create the training set for their classifiers by labeling only a small number of web pages. Once they create a reliable classifier, the rest of the Web can be categorized into the predefined categories. In this paper, active learning algorithms are discussed in the context of text categorization. Active learning is not restricted to the text domain; it can be applied to other domains as well. Image retrieval (or categorization), handwritten digit recognition, protein classification and recommendation systems are just a few examples of areas where active learning can be used. Experiments on text categorization indicate that the active learning method can be highly effective in making computers learn faster.

9. ACKNOWLEDGEMENTS

Special thanks go to my advisor, Prof. Lee Giles, for his advice and helpful comments throughout my studies.

REFERENCES

[1] Cohn, D. A., Atlas, L., Ladner, R., Improving generalization with active learning, Machine Learning, 15, 1994.
[2] Cohn, D. A., Ghahramani, Z., Jordan, M. I., Active learning with statistical models, Journal of Artificial Intelligence Research, 4, 1996.
[3] Dumais, S. T., Platt, J., Heckerman, D., Sahami, M., Inductive learning algorithms and representations for text categorization, In Proceedings of the Seventh International Conference on Information and Knowledge Management, ACM Press, 1998.
[4] Freund, Y., Seung, H. S., Shamir, E., Tishby, N., Selective sampling using the query by committee algorithm, Machine Learning, 28, 1997.
[5] Joachims, T., Text categorization with support vector machines, In Proceedings of the European Conference on Machine Learning, Springer-Verlag, 1998.
[6] Lewis, D., Gale, W., A sequential algorithm for training text classifiers, In Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, Springer-Verlag, 1994.
[7] Schohn, G., Cohn, D., Less is more: Active learning with support vector machines, In Proceedings of the International Conference on Machine Learning.
[8] Scholkopf, B., Smola, A. J., Learning with Kernels, MIT Press, Cambridge, MA.
[9] Tong, S., Koller, D., Support vector machine active learning with applications to text classification, Journal of Machine Learning Research, 2001.

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Incremental Learning with Support Vector Machines and Fuzzy Set Theory

Incremental Learning with Support Vector Machines and Fuzzy Set Theory The 25th Workshop on Combnatoral Mathematcs and Computaton Theory Incremental Learnng wth Support Vector Machnes and Fuzzy Set Theory Yu-Mng Chuang 1 and Cha-Hwa Ln 2* 1 Department of Computer Scence and

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET 1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law)

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law) Machne Learnng Support Vector Machnes (contans materal adapted from talks by Constantn F. Alfers & Ioanns Tsamardnos, and Martn Law) Bryan Pardo, Machne Learnng: EECS 349 Fall 2014 Support Vector Machnes

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines A Modfed Medan Flter for the Removal of Impulse Nose Based on the Support Vector Machnes H. GOMEZ-MORENO, S. MALDONADO-BASCON, F. LOPEZ-FERRERAS, M. UTRILLA- MANSO AND P. GIL-JIMENEZ Departamento de Teoría

More information

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification Introducton to Artfcal Intellgence V22.0472-001 Fall 2009 Lecture 24: Nearest-Neghbors & Support Vector Machnes Rob Fergus Dept of Computer Scence, Courant Insttute, NYU Sldes from Danel Yeung, John DeNero

More information

Deep Classification in Large-scale Text Hierarchies

Deep Classification in Large-scale Text Hierarchies Deep Classfcaton n Large-scale Text Herarches Gu-Rong Xue Dkan Xng Qang Yang 2 Yong Yu Dept. of Computer Scence and Engneerng Shangha Jao-Tong Unversty {grxue, dkxng, yyu}@apex.sjtu.edu.cn 2 Hong Kong

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

Classification / Regression Support Vector Machines
Jeff Howbert, Introduction to Machine Learning, Winter 04. Topics: SVM classifiers for linearly separable classes; SVM classifiers for non-linearly separable classes; SVM

CHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION
3.1 INTRODUCTION. The raw microarray data is basically an image with different colors indicating hybridization (Xue

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE), e-ISSN: 2278-2834, p-ISSN: 2278-8735, Volume 9, Issue, Ver. IV (Mar.-Apr. 2014)

Machine Learning 9. week
Mapping Concept, Radial Basis Functions (RBF), RBF Networks. Mapping: It is probably the best scenario for the classification of two datasets to separate them linearly. As you see in the below

Meta-heuristics for Multidimensional Knapsack Problems
2012 4th International Conference on Computer Research and Development, IPCSIT vol. 39 (2012), IACSIT Press, Singapore. Zhibao Man, Computer Science Department,

Using Neural Networks and Support Vector Machines in Data Mining
RICHARD A. WASNIOWSKI, Computer Science Department, California State University Dominguez Hills, Carson, CA 90747, USA. Abstract: Multivariate data analysis

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example. 4/14/2011. CS 376 Lecture 22
Discriminative classifiers for image recognition. Wednesday, April 13. Kristen Grauman, UT-Austin. Last time: window-based generic object detection, basic pipeline, face detection with boosting as case study. Today:

User Authentication Based On Behavioral Mouse Dynamics Biometrics
Chee-Hyung Yoon, Daniel Donghyun Kim. Department of Computer Science, Stanford University, Stanford, CA

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision
Javier Civera, University of Zaragoza; Andrew J. Davison, Imperial College London; J.M.M. Montiel, University of Zaragoza. josemari@unizar.es, jcivera@unizar.es,

GSLM Operations Research II Fall 13/14
GSLM 58 Operations Research II, Fall 13/14. 6. Separable Programming. Consider a general NLP: min f(x) s.t. g_j(x) <= b_j, j = 1, ..., m. Definition 6.1. The NLP is a separable program if its objective function and all constraints are

Keywords - Web page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines
(IJCSIS) International Journal of Computer Science and Information Security. Hierarchical Web Page Classification Based on a Topic Model and Neighboring Pages Integration. Wongkot Sriurai, Phayung Meesad, Choochart Haruechaiyasak

Mathematics 256 a course in differential equations for engineering students
Chapter 5. More efficient methods of numerical solution. Euler's method is quite inefficient. Because the error is essentially proportional to the

For instance, ...; the five basic number-sets are increasingly more inclusive: A ⊆ B & B ⊆ A ⟺ A = B (1)
Section 1.2 Subsets and the Boolean operations on sets. If every element of the set A is an element of the set B, we say that A is a subset of B, or that A is contained in B, or that B contains A, and we write A
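The subset law excerpted above, that mutual inclusion (A ⊆ B and B ⊆ A) forces A = B, can be illustrated with a short, self-contained Python sketch; the sets below are made-up examples, not taken from the excerpted course notes.

```python
# Mutual inclusion implies equality: A ⊆ B and B ⊆ A  ⟺  A == B.
A = {0, 1, 2}
B = {n for n in range(3)}      # the same three elements, built differently

assert A <= B and B <= A       # A ⊆ B and B ⊆ A ...
assert A == B                  # ... hence A = B

# Python's set operators mirror the Boolean operations on sets:
union = A | {3}                # A ∪ {3}  → {0, 1, 2, 3}
intersection = A & {1, 2, 3}   # A ∩ {1, 2, 3}  → {1, 2}
difference = A - {0}           # A \ {0}  → {1, 2}
print(union, intersection, difference)
```

Here `<=` is Python's subset test (`issubset`), so the equality check in the assertions is exactly the two-sided inclusion in equation (1).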

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters
Hakan S. KUTOGLU, Turkey. Key words: coordinate systems; transformation; estimation; reliability. SUMMARY: Advances in technologies and

Three supervised learning methods on pen digits character recognition dataset
Chris Fleizach, Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA 92093. cfleizac@cs.ucsd.edu. Satoru

Classifier Selection Based on Data Complexity Measures *
Edith Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trinidad. National Institute for Astrophysics, Optics and Electronics, Luis Enrique Erro No. 1, Sta.

Announcements. Supervised Learning
See Chapter 5 of Duda, Hart, and Stork. Tutorial by Burges linked to on web page. Supervised Learning: classification with labeled examples. Images: vectors in high-d space. Labeled examples

Fast Feature Value Searching for Face Detection
Vol., No. 2, Computer and Information Science. Yunyang Yan, Department of Computer Engineering, Huaiyin Institute of Technology, Huai'an 22300, China. E-mail: areyyyke@163.com

CLASSIFICATION OF ULTRASONIC SIGNALS
The 8th International Conference of the Slovenian Society for Non-Destructive Testing, »Application of Contemporary Non-Destructive Testing in Engineering«, September 1-3, 2005, Portorož, Slovenia, pp. 7-33.

Performance Evaluation of Information Retrieval Systems
Why System Evaluation? Many slides in this section are adapted from Prof. Joydeep Ghosh (UT ECE), who in turn adapted them from Prof. Dik Lee (Univ. of Science

Compiler Design. Spring 2014. Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz
Compiler Design, Spring 2014: Register Allocation, Sample Exercises and Solutions. Prof. Pedro C. Diniz, USC / Information Sciences Institute, 4676 Admiralty Way, Suite 1001, Marina del Rey, California 90292. pedro@isi.edu. Register

Improvement of Spatial Resolution Using Block-Matching Based Motion Estimation and Frame Integration
Daniya Suga and Takayuki Hamamoto, Graduate School of Engineering, Tokyo University of Science, 6-3-1, Niijuku, Katsushika-ku,

FEATURE EXTRACTION. Dr. K. Vijayarekha. Associate Dean, School of Electrical and Electronics Engineering, SASTRA University, Thanjavur
Dr. K. Vijayarekha, Associate Dean, School of Electrical and Electronics Engineering, SASTRA University, Thanjavur 613 41. Joint Initiative of IITs and IISc, Funded by MHRD. Table of Contents

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour
Problem 1. We reduce vertex cover to MAX-SAT with weights, such that the

A Statistical Model Selection Strategy Applied to Neural Networks
Joaquín Pizarro, Elsa Guerrero, Pedro L. Galindo. joaquin.pizarro@uca.es, elsa.guerrero@uca.es, pedro.galindo@uca.es. Dpto. Lenguajes y Sistemas Informáticos

Collaboratively Regularized Nearest Points for Set Based Recognition
Academic Center for Computing and Media Studies, Kyoto University. Yang Wu, Michihiko Minoh, Masayuki Mukunoki, Kyoto University. 9/1/2013, BMVC 2013 @ Bristol,

Intelligent Information Acquisition for Improved Clustering
Duy Vu, University of Texas at Austin, duyvu@cs.utexas.edu. Mikhail Bilenko, Microsoft Research, mbilenko@microsoft.com. Prem Melville, IBM T.J. Watson Research Center

Multi-Criteria-based Active Learning for Named Entity Recognition
Dan Shen, Jie Zhang, Jian Su, Guodong Zhou, Chew-Lim Tan. Institute for Infocomm Technology, 21 Heng Mui Keng Terrace, Singapore 119613. Department of Computer

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique
Outline and Reading: The Greedy Method Technique, Fractional Knapsack Problem, Task Scheduling, Minimum Spanning Trees. Change Money Problem. Greedy

Efficient Text Classification by Weighted Proximal SVM *
Dong Zhuang, Benyu Zhang, Qiang Yang, Jun Yan, Zheng Chen, Ying Chen. Computer Science and Engineering, Beijing Institute of Technology, Beijing 100081, China

CS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS246: Mining Massive Datasets, Jure Leskovec, Stanford University. http://cs246.stanford.edu. 2/19/2013. Perceptron: y = sgn(w · x). How to find
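The perceptron rule quoted in the entry above, y = sgn(w · x), can be sketched in a few lines of Python with the classic mistake-driven update; the weights, learning rate, and toy data here are illustrative placeholders, not material from the CS246 slides.

```python
# Minimal perceptron: predict with y = sgn(w · x), update only on mistakes.
def sgn(v):
    return 1 if v >= 0 else -1

def predict(w, x):
    return sgn(sum(wi * xi for wi, xi in zip(w, x)))

def train(data, dim, epochs=10, eta=1.0):
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in data:
            if predict(w, x) != y:                      # mistake-driven update
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

# Toy linearly separable data; each x ends with a constant 1 for the bias term.
data = [([1, 2, 1], 1), ([2, 3, 1], 1), ([-1, -2, 1], -1), ([-2, -1, 1], -1)]
w = train(data, dim=3)
print([predict(w, x) for x, _ in data])                 # → [1, 1, -1, -1]
```

Because the data are linearly separable, the updates stop once every training point sits on the correct side of the hyperplane w · x = 0.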

Adaptive Transfer Learning
Bin Cao, Sinno Jialin Pan, Yu Zhang, Dit-Yan Yeung, Qiang Yang. Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong. {caobin, sinnopan, zhangyu, dyyeung, qyang}@cse.ust.hk

Related-Mode Attacks on CTR Encryption Mode
International Journal of Network Security, Vol. 4, No. 3, pp. 282-287, May 2007. Dayin Wang, Dongdai Lin, and Wenling Wu (Corresponding author: Dayin Wang). Key Laboratory

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION
THE PUBLISHING HOUSE OF THE ROMANIAN ACADEMY. Proceedings of the Romanian Academy, Series A, Volume 4, Number 2/2003, pp. 000-000. Tudor BARBU, Institute

Parallel matrix-vector multiplication
Appendix A. The reduced transition matrix of the three-dimensional cage model for gel electrophoresis, described in Section 3.2, becomes excessively large for polymer lengths more

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.
Muradaliyev A.Z., Azerbaijan Scientific-Research and Design-Prospecting Institute of Energetic, AZ1012, Ave. H.Zardabi-94. E-mail: aydin_murad@yahoo.com. Importance of

Advanced in Control Engineering and Information Science
Available online at www.sciencedirect.com. Procedia Engineering 15 (2011) 1642-1646. www.elsevier.com/locate/procedia

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields
17th European Symposium on Computer Aided Process Engineering, ESCAPE17. V. Plesu and P.S. Agachi (Editors). 2007 Elsevier B.V. All rights reserved.

Private Information Retrieval (PIR)
Levente Buttyán. Problem formulation: Alice wants to obtain information from a database, but she does not want the database to learn which information she wanted; e.g., Alice is an investor querying a stock-market

CS434a/541a: Pattern Recognition. Prof. Olga Veksler. Lecture 15
Today, new topic: Unsupervised Learning. Supervised vs. unsupervised learning; unsupervised learning. Next time: parametric unsupervised learning. Today: nonparametric

S1 Note. Basis functions.
Contents: types of basis functions; the Fourier basis; B-spline basis; power and type I error rates with different numbers of basis functions. Table S1. Simulation results of type

Hermite Splines in Lie Groups as Products of Geodesics
Ethan Eade. Updated May 28, 2017. 1 Introduction. 1.1 Goal. This document defines a curve in the Lie group G parametrized by time and by structural parameters in the

Learning to Classify Documents with Only a Small Positive Training Set
Xiao-Li Li, Bing Liu, and See-Kiong Ng. Institute for Infocomm Research, Heng Mui Keng Terrace, 119613, Singapore; Department of Computer

Support Vector Machines. CS534 - Machine Learning
Perceptron Revisited: Linear Separators. Binary classification can be viewed as the task of separating classes in feature space: f(x) = sgn(w · x + b), with w · x + b > 0 on one side of the separator w · x + b = 0 and w · x + b < 0 on the other. Linear Separators

Problem Set 3 Solutions
Introduction to Algorithms, October 4, 2002. Massachusetts Institute of Technology, 6.046J/18.410J. Professors Erik Demaine and Shafi Goldwasser. Handout 14. Problem Set 3 Solutions (Exercises were not to be turned in,

Deep Classifier: Automatically Categorizing Search Results into Large-Scale Hierarchies
Dikan Xing, Gui-Rong Xue, Qiang Yang, Yong Yu. Shanghai Jiao Tong University, Shanghai, China. {xiaobao, grxue, yyu}@apex.sjtu.edu.cn

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION
SHI-LIANG SUN, HONG-LEI SHI. Department of Computer Science and Technology, East China Normal University, 500 Dongchuan Road, Shanghai 200241, P. R. China. E-mail: slsun@cs.ecnu.edu.cn,

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data
Malaysian Journal of Mathematical Sciences 11(S) April: 35-46 (2017). Special Issue: The 2nd International Conference and Workshop on Mathematical Analysis (ICWOMA 2016)

An Anti-Noise Text Categorization Method based on Support Vector Machines *
Chen Lin, Huang Jie and Gong Zheng-Hu. School of Computer Science, National University of Defense Technology, Changsha, 410073, China. chenlin@nudt.edu.cn,

A Taxonomy Fuzzy Filtering Approach
JOURNAL OF AUTOMATIC CONTROL, UNIVERSITY OF BELGRADE, VOL. 13(1):25-29, 2003. S. Vrettos and A. Stafylopatis. Abstract: Our work proposes the use of topic taxonomies as part

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance
Chong Long, Minlie Huang, Xiaoyan Zhu. State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for

Wishing you all a Total Quality New Year!
Total Quality Management and Six Sigma, Post Graduate Program 2014-15, Session 4. Vinay Kumar Kalakbandi, Assistant Professor, Operations & Systems Area. Wishing you all a Total Quality New Year! Hope you achieve Six Sigma

CSCI 5417 Information Retrieval Systems Jim Martin!
Lecture 11, 9/29/2011. Today: classification, Naïve Bayes classification, unigram LM. Where we are: basics of ad hoc retrieval, indexing, term weighting/scoring, cosine

An Entropy-Based Approach to Integrated Information Needs Assessment
Distribution Statement A: Approved for public release; distribution is unlimited. June 8, 2004. William J. Farrell, Lockheed Martin Advanced Technology

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach
D.R. Ramesh Babu, Piyush M. Kumat, Mahesh D. Dhannawat. PES Institute of Technology Research

A Novel Term_Class Relevance Measure for Text Categorization
D.S. Guru, Mahamad Suhil. Department of Studies in Computer Science, University of Mysore, Mysore, India. Abstract: In this paper, we introduce a new measure

A Background Subtraction for a Vision-based User Interface *
Dongpyo Hong and Woontack Woo. KJIST U-VR Lab. {dhon, wwoo}@kjist.ac.kr. Abstract: In this paper, we propose a robust and efficient background subtraction

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization
B. Liu, Q. Chen and Q. Zhang, J.J. Liang, P.N. Suganthan, B.Y. Qu. Department of Computing, Glyndwr University, UK. Faculty

Learning-Based Top-N Selection Query Evaluation over Relational Databases
Liang Zhu, Weiyi Meng. School of Mathematics and Computer Science, Hebei University, Baoding, Hebei 071002, China, zhu@mail.hbu.edu.cn

Random Kernel Perceptron on ATTiny2313 Microcontroller
Nemanja Djuric, Department of Computer and Information Sciences, Temple University, Philadelphia, PA 19122, USA. nemanja.djuric@temple.edu. Slobodan Vucetic, Department

General Vector Machine. Hong Zhao Department of Physics, Xiamen University
Hong Zhao (zhaoh@xmu.edu.cn), Department of Physics, Xiamen University. The support vector machine (SVM) is an important class of learning machines for function approach, pattern recognition, and

UB at GeoCLEF 2006. Department of Geography. Abstract
Miguel E. Ruiz (1), Stuart Shapiro (2), June Abbas (1), Silvia B. Southwick (1) and David Mark (3). State University of New York at Buffalo: (1) Department of Library and Information Studies, (2) Department

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data
Borko Furht and Pornvit Saksobhavivat. NSF Multimedia Laboratory, Florida Atlantic University, Boca Raton, Florida 3343. ABSTRACT: In this paper,

Feature Selection as an Improving Step for Decision Tree Construction
2009 International Conference on Machine Learning and Computing, IPCSIT vol. 3 (2011), IACSIT Press, Singapore. Mahdi Esmaeili, Fazekas Gabor

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK
Li-qing Qiu, Yong-quan Liang, Jing Chen. College of Information Science and Technology, Shandong University of Science and Technology,

Detection of hand grasping an object from complex background based on machine learning co-occurrence of local image feature
Shinya Morioka, Yasuhiro Hiramoto, Nobutaka Shimada, Tadashi Matsuo, Yoshiaki Shirai. Ritsumeikan

CSE 326: Data Structures Quicksort Comparison Sorting Bound
Steve Seitz, Winter 2009. Quicksort uses a divide and conquer strategy, but does not require the O(N) extra space that MergeSort does. Here is the
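The claim in the entry above, that quicksort divides and conquers without MergeSort's O(N) auxiliary array, can be sketched with an in-place partition; this is a generic illustration (Hoare-style partition, middle-element pivot), not the CSE 326 course code.

```python
# In-place quicksort: divide and conquer with only O(log N) expected extra
# space (the recursion stack), unlike MergeSort's O(N) auxiliary array.
def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    pivot = a[(lo + hi) // 2]
    i, j = lo, hi
    while i <= j:                        # Hoare-style partition, in place
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]      # swap elements on the wrong side
            i += 1
            j -= 1
    quicksort(a, lo, j)                  # conquer the left part ...
    quicksort(a, i, hi)                  # ... and the right part
    return a

print(quicksort([5, 3, 8, 1, 9, 2]))     # → [1, 2, 3, 5, 8, 9]
```

The list is rearranged in place around the pivot, so the only memory beyond the input is the recursion stack, which stays logarithmic on average.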

3D vector computer graphics
Paolo Varagnolo: freelance engineer. Padova, April 2016. Private practice. 1. Introduction. Vector 3D model representation in computer graphics requires

Face Detection with Deep Learning
Yu Shen (yus122@ucsd.edu, A13227146), Kuan-Wei Chen (kuc010@ucsd.edu, A99045121), Yizhou Hao (y3hao@ucsd.edu, A98017773), Min Hsuan Wu (mhwu@ucsd.edu, A92424998). Abstract: The project here

Complex Numbers. Now we also saw that if a and b were both positive then √(ab) = √a·√b. For a second let's forget that restriction and do the following.
The last topic in this section is not really related to most of what we've done in this chapter, although it is somewhat related to the radicals section as we will see. We also won't need the material

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION
Paulo Quintiliano & Antonio Santa-Rosa. Federal Police Department, Brasília, Brazil. E-mails: quintiliano.pqs@dpf.gov.br and

Reliable Negative Extracting Based on kNN for Learning from Positive and Unlabeled Examples
JOURNAL OF COMPUTERS, VOL. 4, NO. 1, JANUARY 2009, p. 94. Bangzuo Zhang, College of Computer Science and Technology, Jilin University,

Smoothing Spline ANOVA for variable screening
A useful tool for metamodels training and multi-objective optimization. L. Ricco, E. Rigoni, A. Turco. Outline: RSM introduction, possible coupling, test case, MOO, MOO with Game Theory

Wavelets and Support Vector Machines for Texture Classification
Kashif Mahmood Rajpoot, Faculty of Computer Science & Engineering, Ghulam Ishaq Khan Institute, Topi, PAKISTAN, kmr@giki.edu.pk. Nasir Mahmood Rajpoot, Department

On Some Entertaining Applications of the Concept of Set in Computer Science Course
Krasimir Yordzhev, Hristina Kostadinova. Associate Professor Krasimir Yordzhev, Ph.D., Faculty of Mathematics and Natural Sciences,

Array transposition in CUDA shared memory
Mike Giles, February 19, 2014. Abstract: This short note is inspired by some code written by Jeremy Appleyard for the transposition of data through shared memory. I had some

Cordial and 3-Equitable Labeling for Some Star Related Graphs
International Mathematical Forum, 4, 2009, no. 31, 1543-1553. S.K. Vaidya, Department of Mathematics, Saurashtra University, Rajkot - 360005, Gujarat,

Web Spam Detection Using Multiple Kernels in Twin Support Vector Machine
Seyed Hamid Reza Mohammadi, Mohammad Ali Zare Chahooki. Yazd University, Yazd, Iran. mohammad_6468@stu.yazd.ac.ir, chahooki@yazd.ac.ir. ABSTRACT: Search

Optimizing Document Scoring for Query Retrieval
Brent Ellwein, baellwe@cs.stanford.edu. Abstract: The goal of this project was to automate the process of tuning a document query engine. Specifically, I used machine learning

A Lazy Ensemble Learning Method to Classification
IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 5, September 2010. ISSN (Online): 1694-0814, p. 344. Haleh Homayouni, Sattar Hashemi and Ali