A classification scheme for applications with ambiguous data
Thomas P. Trappenberg, Centre for Cognitive Neuroscience, Department of Psychology, University of Oxford, Oxford OX1 3UD, England

Andrew D. Back, Katestone Scientific, 64 MacGregor Tce, Bardon, QLD 4065, Australia

Abstract

We propose a scheme for pattern classification in applications which include ambiguous data, that is, where patterns occupy overlapping areas in the feature space. Such situations frequently occur with noisy data and/or where some features are unknown. We demonstrate that it is advantageous to first detect those ambiguous areas with the help of training data and then to re-classify the data in these areas as ambiguous before making class predictions on test sets. This scheme is demonstrated with a simple example and benchmarked on two real-world applications.

Keywords: data classification, ambiguous data, probabilistic ANN, k-NN algorithm.

1. Introduction

Adaptive data classification is a core issue in data mining, pattern recognition, and forecasting. Many algorithms have been developed, including classical methods such as linear discriminant analysis and Bayesian classifiers, more recent statistical techniques such as k-NN (k-nearest neighbors) and MARS (multivariate adaptive regression splines), machine learning approaches for decision trees (including C4.5, CART, C5, and Bayes trees), and neural network approaches such as multilayer perceptrons and neural trees [1-8]. Most of these classification schemes work well enough when the classes are separable. However, in many real-world problems the data may not be separable. That is, there may exist regions in the feature space that are occupied by more than one class. In many problems, this ambiguity in the data is unavoidable. A similar problem occurs when the data are very closely spaced and a highly nonlinear decision boundary is required to separate them. Accordingly, the aim of much recent work on classification has been to seek better nonlinear classifiers. Particularly notable in this area is the field of support vector machines [9, 10].
SVMs have captured much interest, as they are able to find nonlinear classification boundaries while minimizing the empirical risk of false classification. However, it is important to consider what the data really mean and what the practical, real-world goals are. In many cases, it is desired to find a simple classifier which gives the user a rough but understandable guide to what the data mean. On the other hand, the data themselves may be contaminated by noise, some input variables may be completely missing (i.e., the data then lie in a feature space which is implicitly too low-dimensional), or the data may be flawed in other ways. This issue is commonly discussed in the context of robust statistics and outlier detection. In this paper we propose a method for preprocessing ambiguous data. The basic idea is quite straightforward: rather than seek a complex classifier, we aim to first examine the data with the aim of removing any ambiguities. Once the ambiguous data are removed, we then apply whatever classifier is required, hopefully one which may lead to a much
simpler solution than would otherwise be obtained. In doing this we acknowledge that data in some regions of the state space either cannot be classified at all or our confidence in doing so is low. Hence those data points are labeled in a different manner to facilitate better classification. Our proposed scheme is to identify ambiguous data and to re-classify those data with an additional class. We will call this additional class IDK (I don't know) to indicate that predicting the class of these data should not be attempted due to their ambiguity. By doing so one loses some predictions. However, we will show that we gain in turn a drastic increase in the confidence of the classification of the remaining data. Our proposed scheme is outlined in the next section, where we also introduce the particular implementation that will be used in the examples discussed in this paper. The synthetic example in section 3 is aimed at illustrating the underlying problem and the proposed solution in more detail. In section 4 we report on the performance of our algorithms on two real-world (benchmark) data sets: the classical Iris benchmark [11] and a medical data set [12].

2. Detection of Ambiguous Data

As stressed in the introduction, our scheme is to detect ambiguous state space areas and to re-classify data in these areas. Hence, our proposed scheme has the following steps:

1. Test all data points against a criterion of ambiguity.
2. Re-classify training data which are ambiguous.
3. Classify test data with an algorithm trained on the re-classified data.

Note that the scheme is independent of any particular classification algorithm. In practice it might be critical to choose appropriate algorithms for each of these steps. As a means of illustrating the proposed method, we use the particular algorithms described below. For the first step we employ a k-NN algorithm [3]. This algorithm takes the k closest data points to a data point in question into account to decide on the new class of that data point.
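As a concrete illustration, steps 1 and 2 can be sketched in Python with a brute-force k-NN pass over the training set. This is a sketch, not the authors' code: the defaults (k = 10 neighbors including the point itself, an 80% majority threshold) are the settings later used in the example of section 3, and the function name and the numeric IDK label are our own choices.

```python
import numpy as np

IDK = -1  # numeric label for the ambiguous "I don't know" class (our choice)

def reclassify_ambiguous(X, y, k=10, majority=0.8):
    """Steps 1-2 of the scheme: flag ambiguous points with k-NN and
    relabel them as IDK. The point itself counts among its k nearest
    neighbors, as in the example of section 3."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    y_new = y.copy()
    for i in range(len(X)):
        # distances from point i to all points (including itself)
        d = np.linalg.norm(X - X[i], axis=1)
        neighbors = y[np.argsort(d)[:k]]
        classes, counts = np.unique(neighbors, return_counts=True)
        if counts.max() / k >= majority:
            # overwhelming majority reached: keep (or adopt) that class
            y_new[i] = classes[counts.argmax()]
        else:
            # no overwhelming majority: declare the point ambiguous
            y_new[i] = IDK
    return y_new
```

The relabeled array can then be handed to any classifier for step 3, with IDK treated as one additional class.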
If an overwhelming majority of the neighboring data points is of one particular class, then this class is taken to be the class of the data point in question. If no overwhelming majority can be reached, then the data point is declared ambiguous and classified as a member of the class IDK. The next step requires a classification method for the predictive classification of further data (test data). While any type of adaptive classifier could be chosen, in the following tests we use a feedforward neural network with one hidden layer,

Hidden layer: $a_j^{(1)} = \sum_k w_{jk}^{(1)} x_k + \theta_j^{(1)}$, $\quad h_j = f(a_j^{(1)})$

Output layer: $a_i^{(2)} = \sum_j w_{ij}^{(2)} h_j + \theta_i^{(2)}$, $\quad y_i = f(a_i^{(2)})$

with a softmax output layer defined by the normalized transfer function

$y_i = \exp(a_i^{(2)}) \,/\, \sum_k \exp(a_k^{(2)})$,

so that the outputs can be treated as probabilities, or confidence values, that the input data belong to the class indicated by the particular output node. This network is trained on the negative cross entropy

$E = -\sum_\mu \sum_i t_i^\mu \log y_i(x^\mu; w)$,
which is appropriate to give the network outputs the probabilistic interpretation [13]. $t^\mu$ is thereby the target vector of training example $\mu$, with component $t_i = 1$ if the example belongs to class $i$. In the examples below we use the Levenberg-Marquardt (LM) algorithm [14] to train the network on the training data set.

Figure 1: Example with overlapping data. The left column shows examples within a standard classification procedure, whereas the right column shows examples with the proposed re-classification scheme. (a) The raw input data (training set) with data from class a (circles) and class b (squares). (b) Re-classified training set including ambiguous data in class IDK (triangles). (c) Classification of the original training data after training with a probabilistic MLP. False classifications are marked with solid symbols. (d) Classification of the re-classified training set. (e,f) Performance on a test set. (g,h) Probability surface of class a generated by the two networks.

3. Partially Overlapping Classes: An Example

Here we illustrate the proposed scheme using two overlapping classes in a two-dimensional state space. In the following we define two classes a and b. The features $x_1$ and $x_2$ of the data from each class are drawn from a uniform distribution, with overlapping areas:

class a: $x_1 \in [0, 1]$, $\quad x_2 \in [0, 1]$
class b: $x_1 \in [0.6, 1.6]$, $\quad x_2 \in [0.6, 1.6]$
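The two uniform class distributions just defined can be sampled as follows; the function name, the seed, and the 0/1 class coding are our own choices for this sketch:

```python
import numpy as np

def sample_classes(n_per_class=50, seed=0):
    """Draw the two overlapping classes of section 3:
    class a uniform on [0,1]^2, class b uniform on [0.6,1.6]^2."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.0, 1.0, size=(n_per_class, 2))
    b = rng.uniform(0.6, 1.6, size=(n_per_class, 2))
    X = np.vstack([a, b])
    y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = class a, 1 = class b
    return X, y
```

With 50 points per class this reproduces the 100-point training sets used below; only the strip $0.6 < x_1, x_2 < 1$ is occupied by both classes.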
An example of 100 data points from these two classes is shown in Figure 1(a), with data from class a as circles and data from class b as squares. For comparison, we trained a classification network directly on these training data. The networks always had 10 hidden nodes and were trained with 100 LM training steps. The classifications of the training data with this network after training are shown in Figure 1(c). Only 4 data points have not been classified correctly; the network even learned to classify most of the training data in the ambiguous area. The re-classification of these training data with the k-NN algorithm described above is shown in Figure 1(b). Data with components 0.6 < x_1 < 1 are ambiguous. K = 10 nearest neighbors (including the data point itself) were taken into account when choosing a new class structure for this data set. The class of a data point was set to the majority class if 80% or more of the neighboring data (including the data point itself) were of this majority. If such a majority could not be reached, the data were classified as class IDK and symbolized with open triangles in the figures. These re-classified data were used as training data for a second classification network similar to the previous one; we only added one output node to account for the additional class IDK. The classification of the re-classified training data with this network after training is shown in Figure 1(d). Only one data point was not correctly classified. More important than the performance of the classification network on the training data is the performance on test data. An example is shown in Figures 1(e) and 1(f) for the two classification networks, respectively. The network that was trained with ambiguous data (Figure 1(e)) falsely classified 1/3 of the test data, corresponding to a standard performance value of P' := n_c/n = 0.67, where n_c is the number of correct classifications and n is the number of data points. As might be expected, there are of course numerous false classifications in the area with ambiguous data.
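The standard performance value P' := n_c/n used above is straightforward to compute; the option to leave IDK predictions out of both counts is our own addition, sketched here for measuring performance only on the data the classifier is willing to predict:

```python
import numpy as np

def performance(y_true, y_pred, idk=-1, ignore_idk=False):
    """Standard performance P' = n_c / n. With ignore_idk=True, examples
    the classifier assigned to IDK are excluded from both n_c and n, so
    P' then measures accuracy on the predictions actually made."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    if ignore_idk:
        keep = y_pred != idk
        y_true, y_pred = y_true[keep], y_pred[keep]
    if len(y_true) == 0:
        return float("nan")  # classifier declined every prediction
    return float(np.mean(y_pred == y_true))
```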
However, what is even more disturbing is that there are many false classifications of data far away from this area. This can also clearly be seen from the prediction surface of this network, which is illustrated in Figure 1(g) with gray values. White thereby corresponds to a confidence (value of the first output node) of 1 for predicting class a with this network, whereas black corresponds to a confidence of 0 for predicting class a (which corresponds to a confidence of 1 for predicting class b in this example). It can clearly be seen that the attempt of the network to find a classification scheme in the area with ambiguous data led to the proposal of structure that does not correspond to the underlying problem. This structure is extrapolated to areas without ambiguous data, leading to the poor performance on data in these areas of the input space. The situation is much better with the re-classified data. The results of the classification of the same test data with the network trained on the re-classified data are shown in Figure 1(f). Only five patterns have been falsely classified, all of which are close to the boundaries of the area with ambiguous data. This corresponds to a standard performance of P' = 0.95 (compared to 0.67) when taking all classes into account. The improvement comes largely from the fact that the underlying problem no longer contains ambiguous data, so that perfect classification can be expected in the limit of infinite data. This is contrary to the original problem, which includes ambiguous data, so that a perfect classification cannot be achieved in the limit of infinite training data. Indeed, in the example shown in Figure 1(f), no data have been classified wrongly when taking only predictions of classes a and b into account. This will not always be the case but will be true in the infinite data limit. Moreover, the areas far away from the ambiguous area can be predicted with high confidence, and the false classifications will have a much lower confidence value.
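The confidence surfaces above come from the softmax outputs of the probabilistic network of section 2, whose forward pass and error function can be sketched as follows. This is a minimal sketch: the choice of f = tanh for the hidden layer is an assumption, since the paper does not name the hidden nonlinearity, and the function names are ours.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """a^(1) = W1 x + theta^(1), h = f(a^(1));
    a^(2) = W2 h + theta^(2); softmax output y_i = exp(a_i)/sum_k exp(a_k).
    f = tanh is an assumption, not stated in the paper."""
    h = np.tanh(W1 @ x + b1)
    a2 = W2 @ h + b2
    e = np.exp(a2 - a2.max())  # shift by the max for numerical stability
    return e / e.sum()         # outputs sum to 1: class probabilities

def cross_entropy(y, t):
    """Per-example error E = -sum_i t_i log y_i (one-hot target t);
    summing over all training examples mu gives the E of section 2."""
    return -np.sum(t * np.log(y))
```

Because the outputs sum to one, the value of the output node for class a can be read directly as the confidence plotted in Figures 1(g,h).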
4. Real World Data: Some Benchmark Examples

The previous example was intended to describe our scheme and to illustrate why it should be useful. However, only real-world data can tell whether the scheme is useful in practice. Hence, in the following we report on an initial study of the application of this scheme to some real-world data. The examples are all taken from the UCI repository of machine learning databases [15], which can be accessed via the Internet.

Iris Dataset

We tested our scheme first on the classical Iris benchmark. The dataset contains 150 examples with 4 physical properties of 3 members of the family of iris flowers. The dataset was first used by Fisher in 1936 [11] to illustrate multivariate discriminant analysis techniques in taxonomic problems. We divided the dataset evenly into a training set and a test set by taking 25 examples from each class into each subset. The training dataset was re-classified with the same procedure as in the example of section 3 without
adjusting any parameters. This re-classification run classified 8 examples of the training data set as class IDK. The class labels of all members of the first class were preserved, which showed that the training set of this class did not include ambiguous data and was easily separable from the other classes. This finding is in accordance with similar findings of Duch et al. [16]. We used a similar network as in the previous example for the classification task itself; only the number of output nodes was adjusted to represent the required number of classes. After 100 training steps the network was able to represent all data in the training sets, of the original data as well as of the re-classified data. The network trained on the original data made 4% false classifications (3 examples) on the test set. However, no false classifications were made with the network trained on the re-classified data when taking only classifications of iris types into account. The price to pay was that 11 examples of the test set were labeled as IDK.

Wisconsin Breast Cancer Data

The second test was made on medical data compiled by Dr. William H. Wolberg at the University of Wisconsin Hospitals, Madison. A smaller database of these data was initially studied in [12]. The version of the dataset we used contained data from 699 patients with 9 predictive attributes used in breast cancer diagnosis. The data were classified into two classes: benign (458 instances, 65.5%) and malignant (241 instances, 34.5%). Data from 16 patients were incomplete; we ignored these records in the following test. However, it should be stressed that incomplete information should lead to more ambiguous data and should favor our approach. The effect of incomplete data will be discussed in more detail elsewhere. We again used the same k-NN re-classification algorithm as in the previous example without adjusting any parameters. The data were randomly divided into 340 training data and 343 test data. 14 data points were classified as IDK with the k-NN re-classification algorithm.
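The even per-class division used for the Iris data above (25 examples from each class into each subset) can be sketched as follows; the function name and the seed are ours:

```python
import numpy as np

def even_split(X, y, per_class=25, seed=0):
    """Divide a dataset evenly into training and test sets by taking
    `per_class` shuffled examples of each class into each subset."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        rng.shuffle(idx)
        train_idx.extend(idx[:per_class])
        test_idx.extend(idx[per_class:2 * per_class])
    return (X[train_idx], y[train_idx]), (X[test_idx], y[test_idx])
```

For the Wisconsin data a plain random division into 340 training and 343 test records, without the per-class balancing, plays the same role.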
All data points of the training sets, the original one and the re-classified one, were classified correctly after training the networks. However, the network trained on the original data classified 23 instances (6.7%) of the test data incorrectly, whereas the second network made only 6 mistakes (1.75%). 22 instances of the test data were classified as IDK.

5. Conclusion and Outlook

In this paper we have proposed a scheme to solve classification problems with ambiguous data. We showed that classification problems with ambiguous data can lead to poor classification results, not only for data in the ambiguous areas but also for data in areas which should have a much better predictability. We showed that the identification of ambiguous input areas and the use of re-classified data for the training of the classification algorithms can lead to a drastic reduction of false predictive classifications. Hence one should consider avoiding predictions for data in areas which are highly ambiguous. We think that this approach is very suitable when predictions have to be made with particular caution. There are many issues within this scheme that have to be discussed in the future. In particular, we used only a simple k-NN re-classification scheme to identify ambiguous areas in the input space. We neither studied systematically the dependence on the parameters of this algorithm, nor did we explore which algorithms might be best suited for a particular problem. There is also a variety of classification algorithms available, each of which might have advantages for particular applications. Our network can be improved with a Bayesian regularization scheme, which can also provide additional information on the complexity of the underlying problem. Some work in this direction is in progress. Other advanced classification methods such as SVMs can also be used.

Acknowledgment: We would like to thank Wlodek Duch for the discussions of his results on rule extraction during his visit in Japan.
References

[1] Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J., Classification and Regression Trees, Wadsworth, Belmont, CA, 1984.
[2] Buntine, W.L., Learning classification trees, Statistics and Computing 2, 63-73, 1992.
[3] Cover, T.M., Hart, P.E., Nearest neighbor pattern classification, IEEE Transactions on Information Theory 13, 21-27, 1967.
[4] Duda, R.O., Hart, P.E., Pattern Classification and Scene Analysis, Wiley, New York, 1973.
[5] Hanson, R., Stutz, J., Cheeseman, P., Bayesian classification with correlation and inheritance, Proceedings of the 12th International Joint Conference on Artificial Intelligence 2, Sydney, Australia, Morgan Kaufmann, 1991.
[6] Michie, D., Spiegelhalter, D.J., Taylor, C.C. (editors), Machine Learning, Neural and Statistical Classification, Ellis Horwood, 1994.
[7] Richard, M.D., Lippmann, R.P., Neural network classifiers estimate Bayesian a-posteriori probabilities, Neural Computation 3, 1991.
[8] Tsoi, A.C., Pearson, R.A., Comparison of three classification techniques, CART, C4.5, and multilayer perceptrons, in Advances in Neural Information Processing Systems 3, 1991.
[9] Vapnik, V., The Nature of Statistical Learning Theory, Springer Verlag, New York, 1995.
[10] Vapnik, V., Golowich, S., Smola, A., Support vector method for function approximation, regression estimation, and signal processing, in Advances in Neural Information Processing Systems 9, 1997.
[11] Fisher, R., The use of multiple measurements in taxonomic problems, Annals of Eugenics 7, 1936.
[12] Mangasarian, O.L., Wolberg, W.H., Cancer diagnosis via linear programming, SIAM News, Volume 23, Number 5, September 1990.
[13] Amari, S., Backpropagation and stochastic gradient descent methods, Neurocomputing 5, 1993.
[14] Hagan, M.T., Menhaj, M., Training feedforward networks with the Marquardt algorithm, IEEE Transactions on Neural Networks, vol. 5, no. 6, 1994.
[15] Merz, C.J., Murphy, P.M., UCI repository of machine learning databases.
[16] Duch, W., Adamczak, R., Grabczewski, K., Zal, G., Hybrid neural-global minimization method of logical rule extraction, International Journal of Advanced Computational Intelligence, 1999 (in print).
More informationSkew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach
Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research
More informationOutline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like:
Self-Organzng Maps (SOM) Turgay İBRİKÇİ, PhD. Outlne Introducton Structures of SOM SOM Archtecture Neghborhoods SOM Algorthm Examples Summary 1 2 Unsupervsed Hebban Learnng US Hebban Learnng, Cntd 3 A
More informationSLAM Summer School 2006 Practical 2: SLAM using Monocular Vision
SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,
More informationWishing you all a Total Quality New Year!
Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma
More informationDetection of an Object by using Principal Component Analysis
Detecton of an Object by usng Prncpal Component Analyss 1. G. Nagaven, 2. Dr. T. Sreenvasulu Reddy 1. M.Tech, Department of EEE, SVUCE, Trupath, Inda. 2. Assoc. Professor, Department of ECE, SVUCE, Trupath,
More informationLecture 4: Principal components
/3/6 Lecture 4: Prncpal components 3..6 Multvarate lnear regresson MLR s optmal for the estmaton data...but poor for handlng collnear data Covarance matrx s not nvertble (large condton number) Robustness
More informationType-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data
Malaysan Journal of Mathematcal Scences 11(S) Aprl : 35 46 (2017) Specal Issue: The 2nd Internatonal Conference and Workshop on Mathematcal Analyss (ICWOMA 2016) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES
More informationResearch of Neural Network Classifier Based on FCM and PSO for Breast Cancer Classification
Research of Neural Network Classfer Based on FCM and PSO for Breast Cancer Classfcaton Le Zhang 1, Ln Wang 1, Xujewen Wang 2, Keke Lu 2, and Ajth Abraham 3 1 Shandong Provncal Key Laboratory of Network
More informationImplementation Naïve Bayes Algorithm for Student Classification Based on Graduation Status
Internatonal Journal of Appled Busness and Informaton Systems ISSN: 2597-8993 Vol 1, No 2, September 2017, pp. 6-12 6 Implementaton Naïve Bayes Algorthm for Student Classfcaton Based on Graduaton Status
More informationA Multivariate Analysis of Static Code Attributes for Defect Prediction
Research Paper) A Multvarate Analyss of Statc Code Attrbutes for Defect Predcton Burak Turhan, Ayşe Bener Department of Computer Engneerng, Bogazc Unversty 3434, Bebek, Istanbul, Turkey {turhanb, bener}@boun.edu.tr
More informationUser Authentication Based On Behavioral Mouse Dynamics Biometrics
User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA
More informationAn Optimal Algorithm for Prufer Codes *
J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,
More informationCLASSIFICATION OF ULTRASONIC SIGNALS
The 8 th Internatonal Conference of the Slovenan Socety for Non-Destructve Testng»Applcaton of Contemporary Non-Destructve Testng n Engneerng«September -3, 5, Portorož, Slovena, pp. 7-33 CLASSIFICATION
More informationMathematics 256 a course in differential equations for engineering students
Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the
More informationNAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics
Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson
More informationHigh-Boost Mesh Filtering for 3-D Shape Enhancement
Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,
More informationTN348: Openlab Module - Colocalization
TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages
More informationHermite Splines in Lie Groups as Products of Geodesics
Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the
More informationCollaboratively Regularized Nearest Points for Set Based Recognition
Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,
More informationFace Detection with Deep Learning
Face Detecton wth Deep Learnng Yu Shen Yus122@ucsd.edu A13227146 Kuan-We Chen kuc010@ucsd.edu A99045121 Yzhou Hao y3hao@ucsd.edu A98017773 Mn Hsuan Wu mhwu@ucsd.edu A92424998 Abstract The project here
More informationImprovement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration
Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,
More informationFuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System
Fuzzy Modelng of the Complexty vs. Accuracy Trade-off n a Sequental Two-Stage Mult-Classfer System MARK LAST 1 Department of Informaton Systems Engneerng Ben-Guron Unversty of the Negev Beer-Sheva 84105
More informationCHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION
48 CHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION 3.1 INTRODUCTION The raw mcroarray data s bascally an mage wth dfferent colors ndcatng hybrdzaton (Xue
More informationUnder-Sampling Approaches for Improving Prediction of the Minority Class in an Imbalanced Dataset
Under-Samplng Approaches for Improvng Predcton of the Mnorty Class n an Imbalanced Dataset Show-Jane Yen and Yue-Sh Lee Department of Computer Scence and Informaton Engneerng, Mng Chuan Unversty 5 The-Mng
More informationSupport Vector Machines. CS534 - Machine Learning
Support Vector Machnes CS534 - Machne Learnng Perceptron Revsted: Lnear Separators Bnar classfcaton can be veed as the task of separatng classes n feature space: b > 0 b 0 b < 0 f() sgn( b) Lnear Separators
More informationAdaptive Transfer Learning
Adaptve Transfer Learnng Bn Cao, Snno Jaln Pan, Yu Zhang, Dt-Yan Yeung, Qang Yang Hong Kong Unversty of Scence and Technology Clear Water Bay, Kowloon, Hong Kong {caobn,snnopan,zhangyu,dyyeung,qyang}@cse.ust.hk
More informationData Mining For Multi-Criteria Energy Predictions
Data Mnng For Mult-Crtera Energy Predctons Kashf Gll and Denns Moon Abstract We present a data mnng technque for mult-crtera predctons of wnd energy. A mult-crtera (MC) evolutonary computng method has
More informationVol. 5, No. 3 March 2014 ISSN Journal of Emerging Trends in Computing and Information Sciences CIS Journal. All rights reserved.
Journal of Emergng Trends n Computng and Informaton Scences 009-03 CIS Journal. All rghts reserved. http://www.csjournal.org Unhealthy Detecton n Lvestock Texture Images usng Subsampled Contourlet Transform
More informationA Comparative Study of Fuzzy Classification Methods on Breast Cancer Data *
Comparatve Study of Fuzzy Classfcaton Methods on Breast Cancer Data * Rav. Jan, th. braham School of Computer & Informaton Scence, Unversty of South ustrala, Mawson Lakes Boulevard, Mawson Lakes, S 5095
More informationComplex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.
Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal
More informationGeneral Vector Machine. Hong Zhao Department of Physics, Xiamen University
General Vector Machne Hong Zhao (zhaoh@xmu.edu.cn) Department of Physcs, Xamen Unversty The support vector machne (SVM) s an mportant class of learnng machnes for functon approach, pattern recognton, and
More informationGA-Based Learning Algorithms to Identify Fuzzy Rules for Fuzzy Neural Networks
Seventh Internatonal Conference on Intellgent Systems Desgn and Applcatons GA-Based Learnng Algorthms to Identfy Fuzzy Rules for Fuzzy Neural Networks K Almejall, K Dahal, Member IEEE, and A Hossan, Member
More informationAudio Content Classification Method Research Based on Two-step Strategy
(IJACSA) Internatonal Journal of Advanced Computer Scence and Applcatons, Audo Content Classfcaton Method Research Based on Two-step Strategy Sume Lang Department of Computer Scence and Technology Chongqng
More informationEmpirical Distributions of Parameter Estimates. in Binary Logistic Regression Using Bootstrap
Int. Journal of Math. Analyss, Vol. 8, 4, no. 5, 7-7 HIKARI Ltd, www.m-hkar.com http://dx.do.org/.988/jma.4.494 Emprcal Dstrbutons of Parameter Estmates n Bnary Logstc Regresson Usng Bootstrap Anwar Ftranto*
More informationTighter Perceptron with Improved Dual Use of Cached Data for Model Representation and Validation
Proceedngs of Internatonal Jont Conference on Neural Networks, Atlanta, Georga, USA, June 49, 29 Tghter Perceptron wth Improved Dual Use of Cached Data for Model Representaton and Valdaton Zhuang Wang
More informationUnderstanding K-Means Non-hierarchical Clustering
SUNY Albany - Techncal Report 0- Understandng K-Means Non-herarchcal Clusterng Ian Davdson State Unversty of New York, 1400 Washngton Ave., Albany, 105. DAVIDSON@CS.ALBANY.EDU Abstract The K-means algorthm
More informationProper Choice of Data Used for the Estimation of Datum Transformation Parameters
Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and
More informationEXTENDED BIC CRITERION FOR MODEL SELECTION
IDIAP RESEARCH REPORT EXTEDED BIC CRITERIO FOR ODEL SELECTIO Itshak Lapdot Andrew orrs IDIAP-RR-0-4 Dalle olle Insttute for Perceptual Artfcal Intellgence P.O.Box 59 artgny Valas Swtzerland phone +4 7
More informationAssociative Based Classification Algorithm For Diabetes Disease Prediction
Internatonal Journal of Engneerng Trends and Technology (IJETT) Volume-41 Number-3 - November 016 Assocatve Based Classfcaton Algorthm For Dabetes Dsease Predcton 1 N. Gnana Deepka, Y.surekha, 3 G.Laltha
More informationImpact of a New Attribute Extraction Algorithm on Web Page Classification
Impact of a New Attrbute Extracton Algorthm on Web Page Classfcaton Gösel Brc, Banu Dr, Yldz Techncal Unversty, Computer Engneerng Department Abstract Ths paper ntroduces a new algorthm for dmensonalty
More informationFast Feature Value Searching for Face Detection
Vol., No. 2 Computer and Informaton Scence Fast Feature Value Searchng for Face Detecton Yunyang Yan Department of Computer Engneerng Huayn Insttute of Technology Hua an 22300, Chna E-mal: areyyyke@63.com
More informationA MODIFIED K-NEAREST NEIGHBOR CLASSIFIER TO DEAL WITH UNBALANCED CLASSES
A MODIFIED K-NEAREST NEIGHBOR CLASSIFIER TO DEAL WITH UNBALANCED CLASSES Aram AlSuer, Ahmed Al-An and Amr Atya 2 Faculty of Engneerng and Informaton Technology, Unversty of Technology, Sydney, Australa
More informationClassification Based Mode Decisions for Video over Networks
Classfcaton Based Mode Decsons for Vdeo over Networks Deepak S. Turaga and Tsuhan Chen Advanced Multmeda Processng Lab Tranng data for Inter-Intra Decson Inter-Intra Decson Regons pdf 6 5 6 5 Energy 4
More informationUSING LINEAR REGRESSION FOR THE AUTOMATION OF SUPERVISED CLASSIFICATION IN MULTITEMPORAL IMAGES
USING LINEAR REGRESSION FOR THE AUTOMATION OF SUPERVISED CLASSIFICATION IN MULTITEMPORAL IMAGES 1 Fetosa, R.Q., 2 Merelles, M.S.P., 3 Blos, P. A. 1,3 Dept. of Electrcal Engneerng ; Catholc Unversty of
More information