Ensemble-based Feature Selection Criteria


Terry Windeatt 1, Matthew Prior 1, Niv Effron 2, Nathan Intrator 2

1 Centre for Vision, Speech and Signal Proc (CVSSP), University of Surrey, Guildford, Surrey, United Kingdom GU2 7XH
2 School of Computer Science, Tel-Aviv University, Ramat-Aviv 69978, Israel
[t.windeatt,m.prior]@surrey.ac.uk, nivefron@gmail.com, nin@post.tau.ac.il

Abstract. Recursive Feature Elimination (RFE) combined with feature ranking is an effective technique for eliminating irrelevant features when the feature dimension is large, but it is difficult to distinguish between relevant and redundant features. The usual method of determining when to stop eliminating features is based on either a validation set or cross-validation techniques. In this paper, we present feature selection criteria based on out-of-bootstrap (OOB) and class separability, both computed on the training set, thereby obviating the need for validation. The RFE method described in this paper uses a two-class neural network classifier, and the ranking of features is based on the magnitude of neural network weights. This approach is compared experimentally with a noisy bootstrapped version of Fisher's Linear Discriminant (FLD) to rank features. The techniques are extended to multi-class problems using the Error-Correcting Output Coding (ECOC) method. Experimental investigation on artificial and natural benchmark data demonstrates the effectiveness of these criteria in selecting the optimal number of features and classifier complexity. Furthermore, the known location of influential features in the simulated data permits the use of ROC (Receiver Operating Characteristic) curves to demonstrate the performance of RFE.

Keywords: RFE, ECOC, Multiple Classifiers, feature selection.

1 Introduction

Consider a supervised learning problem, in which training patterns consist of a large number of features, many of which are suspected to be irrelevant to the classification problem at hand. To reduce dimensionality, a decision needs to be taken whether to select or extract features.
One of the most popular general purpose feature extraction techniques is Principal Component Analysis (PCA), which is a mapping or projection on to the principal directions and is an effective method of feature space reduction. It is particularly important to reduce the number of features for small sample size problems (Section 3). In general, feature extraction techniques make use of all the original features when mapping to new features. However, over-fitting may result if the dimension space is high. Furthermore, it may not be successful due to complex class distributions [1]. Finally, feature extraction methods are difficult to interpret in terms of the importance of original features. This loss of interpretability is one of the key reasons why feature selection is preferred in many data mining and bio-informatics applications. In this paper, only feature selection methods are considered.

Feature selection has received attention for many years from researchers in the fields of pattern recognition, machine learning and statistics. Exhaustive enumeration of all subsets

of features is impractical except for a few features. Various greedy algorithms have been developed to find the best subset of features. A popular approach is to rank features according to a suitable criterion, and various strategies are possible once the ranked feature set is obtained. For example, a fixed number of features may be selected to design a classifier, or alternatively a threshold may be set on the ranking criterion to determine the number of features. Selecting the optimal number of features with respect to generalization performance normally requires a validation set or cross-validation techniques. In this paper, we present feature selection criteria based on out-of-bootstrap (OOB) and class separability. They are computed on the training set, so that validation is not required. The separability measure defined in equation (4) was proposed in [2] for selecting optimal base classifier complexity. The main contribution of this paper is to propose stopping criteria based on class separability and OOB estimate when RFE is applied to an ensemble.

The paper is organized as follows. Section 2 describes the relevant concepts for ensemble methods, including the OOB estimate and the Error-Correcting Output Coding (ECOC) method for solving multi-class problems. Section 3 discusses feature ranking methods and Recursive Feature Elimination (RFE). Section 4 describes the datasets used for experimentation as well as the ROC curve for characterizing the performance on simulated data, for which the location of influential features is known. Experimental results are given in Section 5, showing that the optimal number of features and base classifier complexity may be selected without need for validation.

2 Ensemble Methods

The Multiple Classifier System (MCS) or committee/ensemble approach has emerged over recent years to address the practical problem of designing classification systems with improved accuracy and efficiency. The aim is to design a composite system that outperforms individual classifiers by pooling together the decisions of all classifiers.
The rationale is that it may be more difficult to optimise the design of a single complex classifier than to optimise the design of a combination of relatively simple (base) classifiers. In this paper, we assume a simple parallel MCS architecture with homogeneous base classifiers. For two-class classifiers the combining rule is majority vote, while for multi-class the decision rule is defined in equation (7).

Injecting randomness into the MCS framework has been found to be a good strategy for improving generalisation performance. Random perturbations have been shown to be useful in pattern space (Bootstrapping), feature space (Random Subspace Method RSM [3]), Class Labels (Error-Correcting Output Coding: ECOC) as well as in base classifiers themselves. Of these four types of random perturbation methods, all are used in this paper except RSM. ECOC is described in Section 2.2, and random weights are used to initialize neural network base classifiers in Section 5.

Bootstrapping [4] is a popular ensemble technique and implies that if $\mu$ training patterns are randomly sampled with replacement, $(1 - 1/\mu)^{\mu} \approx 37\%$ are removed, with remaining patterns occurring one or more times. The out-of-bootstrap (OOB) estimate uses the patterns left out. The individual base classifier OOB should be distinguished from the ensemble OOB. For the ensemble OOB, all training patterns contribute to the estimate, but the only participating classifiers for each pattern are those that have not been used with that pattern for training (that is, approximately thirty-seven percent of classifiers). Note that OOB gives a biased estimate of the absolute value

of generalization error [5], but in this paper, the estimate of the absolute value is not important. The OOB estimate for the ECOC ensemble is given in Section 2.2.

Selecting parameters for MCS design should ideally be carried out using only the training set, but this is usually difficult and results in a biased choice. Model selection from training data is known to require a built-in assumption, since realistic learning problems are in general ill-posed [6]. The assumption in this paper is that base classifier complexity and number of features may be selected using a bootstrap estimate, and that patterns left out of the bootstrap may be used to determine optimal values with respect to generalization error. A potential problem with the bootstrap is that each base classifier sees only approximately sixty-three percent of the training set. It is shown experimentally in Section 5 that the reduced number of training patterns does not lead to an inaccurate estimate of the optimal values but may lead to an inaccurate estimate of the absolute value of generalization error. Note that the bootstrap estimate does not require any assumptions regarding underlying probability distributions.

2.1 Diversity and Class Separability

Attempts to understand the effectiveness of the MCS framework have prompted the development of various measures. The Margin concept was used originally to help explain Boosting and Support Vector Machines. Bias and Variance are concepts from regression theory that have motivated modified definitions for 0/1 loss function for characterising Bagging and other ensemble techniques. Various diversity measures have been studied with the intention of determining whether they correlate with ensemble accuracy. However, the question of whether the information available from any of these measures can be used to assist MCS design is open. Most commonly, MCS parameters are set with the help of either a validation set or cross-validation techniques [7].
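The bootstrap sampling and the ensemble OOB estimate described in Section 2 can be illustrated with a short sketch. This is not the authors' code: the function names `bootstrap_indices` and `ensemble_oob_error`, the array layout, and the tie-breaking rule (ties go to class 0) are assumptions made for illustration, and the two-class majority vote is taken only over classifiers for which a pattern is out-of-bootstrap.

```python
import numpy as np

def bootstrap_indices(n_patterns, rng):
    """Sample n_patterns indices with replacement.

    Returns the in-bag indices and a boolean mask of the patterns left out (OOB).
    """
    in_bag = rng.integers(0, n_patterns, size=n_patterns)
    oob_mask = np.ones(n_patterns, dtype=bool)
    oob_mask[in_bag] = False
    return in_bag, oob_mask

def ensemble_oob_error(decisions, oob_masks, labels):
    """Ensemble OOB error for a two-class majority-vote ensemble.

    decisions: (B, n) array of 0/1 base classifier decisions
    oob_masks: (B, n) boolean array, True where pattern n is OOB for classifier b
    labels:    (n,) array of 0/1 target labels
    """
    decisions = np.asarray(decisions)
    oob_masks = np.asarray(oob_masks, dtype=bool)
    labels = np.asarray(labels)
    votes_for_one = np.sum(decisions * oob_masks, axis=0)
    n_voters = np.sum(oob_masks, axis=0)
    covered = n_voters > 0  # patterns that are OOB for at least one classifier
    # Assumed tie-breaking: a tie is decided in favour of class 0.
    predicted = (2 * votes_for_one[covered] > n_voters[covered]).astype(int)
    return float(np.mean(predicted != labels[covered]))
```

With a few thousand patterns, the fraction of `True` entries in `oob_mask` should be close to the 37% quoted in the text; the base classifier OOB would instead score each classifier on its own OOB patterns only.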
Diversity measures have received much attention recently, since it is recognized that diversity among base classifiers is a necessary condition for improvement in ensemble performance. However, there is no general agreement about how to quantify the notion of diversity among a set of classifiers. Diversity measures can be categorised into two types [8], pair-wise and non-pair-wise. In order to apply pair-wise measures to finding overall diversity of a set of classifiers it is necessary to average over the set. Non-pair-wise measures attempt to measure diversity of a set of classifiers directly, based for example on variance, entropy or proportion of classifiers that fail on randomly selected patterns. The main difficulty with diversity measures is the so-called accuracy-diversity dilemma. As explained in [9], as base classifiers approach the highest levels of accuracy, diversity must decrease, so that it is expected that there will be a trade-off between diversity and accuracy. The Diversity/Accuracy Dilemma leads us to expect that ensemble performance may not be optimized when each individual classifier is optimized [10]. There has been no convincing theory or experimental study to suggest that there exists any measure that can reliably predict generalisation error of an ensemble. However, in [11] the OOB estimate was used to tune diversity via early-stopping of a neural network ensemble.

Classical class separability measures refer to the ability to predict separation of patterns into classes using original features and rely on a Gaussian assumption [12]. In [2] a class separability measure is proposed for MCS that is based on a binary feature representation, in which each pattern is represented by its binary MCS classifier decisions. It is restricted to two-class problems and results in a binary-to-binary mapping. The problem with

applying classical class separability measures is that the implicit Gaussian assumption is not appropriate for this mapping [13].

Let there be $\mu$ patterns with the label $\omega_m$ given to each pattern $x_m$, where $m = 1, \ldots, \mu$. In a MCS framework, the mth pattern may be represented by the B-dimensional vector formed from the B base classifier decisions given by

$$x_m = (\xi_{m1}, \xi_{m2}, \ldots, \xi_{mB}), \quad \xi_{mi}, \omega_m \in \{0,1\}, \quad i = 1, \ldots, B \qquad (1)$$

In equation (1), $\omega_m = f(x_m)$ where $f$ is the unknown binary-to-binary mapping from classifier decisions to target label. Following [8], the notation in equation (1) is modified so that the classifier decision is 1 if it agrees with the target label and 0 otherwise

$$x_m = (y_{m1}, y_{m2}, \ldots, y_{mB}), \quad y_{mi}, \omega_m \in \{0,1\}, \quad y_{mi} = 1 \text{ iff } \xi_{mi} = \omega_m \qquad (2)$$

Pairwise diversity measures, such as Q statistic, Correlation Coefficient, Double Fault and Disagreement measures [8], take no account of the class assigned to a pattern. In contrast, class separability [14] is computed between classifier decisions (equation (2)) over pairs of patterns of opposite class, using four counts defined by the logical AND ($\wedge$) operator

$$N_n^{ab} = \sum_{j=1}^{B} \psi_{mj}^{a} \wedge \psi_{nj}^{b}, \quad \omega_m \neq \omega_n, \quad a, b \in \{0,1\}, \quad \psi^{1} = y, \ \psi^{0} = \bar{y} \qquad (3)$$

The nth pattern for a two-class problem is assigned

$$\sigma_n = \frac{1}{K_\sigma} \left( \frac{N_n^{11}}{\sum_{m=1}^{\mu} N_m^{11}} - \frac{N_n^{00}}{\sum_{m=1}^{\mu} N_m^{00}} \right) \qquad (4)$$

where

$$K_\sigma = \frac{N_n^{11}}{\sum_{m=1}^{\mu} N_m^{11}} + \frac{N_n^{00}}{\sum_{m=1}^{\mu} N_m^{00}}$$

The motivation for $\sigma_n$ comes from estimation of the first order spectral coefficients [2] of the binary-to-binary mapping defined in equation (1). Each pattern is compared with all patterns of the other class, and the number of jointly correctly ($N_n^{11}$) and incorrectly ($N_n^{00}$) classified patterns are counted. Note that a classifier that correctly classifies one pattern but incorrectly classifies the other does not contribute. The two terms in equation (4) represent the relative positive and negative evidence that the pattern comes from the target class.
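The per-pattern coefficient of equation (4) can be sketched as follows. This is one possible reading of equations (2) to (4), not the authors' implementation: it assumes that the counts for a pattern are accumulated over all patterns of the opposite class (as the text describes), that a coefficient is set to zero when both of its terms vanish, and the function name `separability_coefficients` is hypothetical.

```python
import numpy as np

def _normalise(counts):
    """Divide counts by their total; return zeros if the total is zero."""
    total = counts.sum()
    return counts / total if total > 0 else np.zeros_like(counts)

def separability_coefficients(Y, labels):
    """Per-pattern separability coefficients for a two-class problem.

    Y:      (mu, B) array with Y[m, j] = 1 if classifier j is correct on
            pattern m (the y_mj of equation (2)), else 0
    labels: (mu,) array of 0/1 class labels
    """
    Y = np.asarray(Y, dtype=float)
    labels = np.asarray(labels)
    # opposite[n, m] is 1 when patterns n and m belong to different classes.
    opposite = (labels[:, None] != labels[None, :]).astype(float)
    joint_correct = Y @ Y.T            # classifiers correct on both patterns
    joint_wrong = (1 - Y) @ (1 - Y).T  # classifiers wrong on both patterns
    n11 = np.sum(opposite * joint_correct, axis=1)
    n00 = np.sum(opposite * joint_wrong, axis=1)
    positive = _normalise(n11)  # relative positive evidence
    negative = _normalise(n00)  # relative negative evidence
    k = positive + negative
    return np.divide(positive - negative, k, out=np.zeros_like(k), where=k > 0)
```

Under this reading each coefficient lies between -1 and +1: it equals +1 when every classifier is correct on a pattern and on all opposite-class patterns, and -1 when every classifier is wrong on them.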
We sum over patterns with positive coefficient to produce a single number between $-1$ and $+1$ that represents the separability of a set of patterns

$$\sigma = \sum_{n=1}^{\mu} \sigma_n, \quad \sigma_n > 0 \qquad (5)$$

In our experiments in Section 4 we use the Q diversity measure, as recommended in [8]. Diversity Q between ith and jth classifiers is defined as

$$Q_{ij} = \frac{N^{11} N^{00} - N^{01} N^{10}}{N^{11} N^{00} + N^{01} N^{10}} \qquad (6)$$

where $N^{ab} = \sum_{m=1}^{\mu} \psi_{mi}^{a} \wedge \psi_{mj}^{b}$ with $a$, $b$, $\psi$ defined in equation (3). The mean is taken over the B base classifiers

$$\bar{Q} = \frac{2}{B(B-1)} \sum_{i=1}^{B-1} \sum_{j=i+1}^{B} Q_{ij}$$

2.2 Error-Correcting Output Coding ECOC

There are several motivations for decomposing a multi-class problem into complementary two-class problems. The decomposition means that attention can be focused on developing an effective technique for the two-class classifier, without having to consider explicitly the design of the multi-class case. This is useful, for example with MLPs, when two-class classifiers do not naturally scale up to multi-class. Also, it is hoped that the parameters of a base classifier run several times may be easier to determine than a complex classifier run once, and perhaps facilitate faster and more efficient solutions. Finally, solving different two-class sub-problems repeatedly with random perturbation may help to reduce error in the original problem.

The ECOC method [15] is an example of distributed output coding [16], in which a pattern is assigned to the class that is closest to a corresponding code word. Rows of the ECOC matrix act as the code words, and are designed using error-correcting principles to provide some error insensitivity with respect to individual classification errors. The original motivation for encoding multiple classifiers using an error-correcting code was based on the idea of modelling the prediction task as a communication problem, in which class information is transmitted over a channel. Errors introduced into the process arise from various stages of the learning algorithm, including features selected and finite training sample. From error-correcting theory, we know that a matrix designed to have d bits error-correcting capability implies that there is a minimum Hamming Distance 2d+1 between any pair of code words. Assuming each bit is transmitted independently, it is then possible to correct a received pattern having d or fewer bits in error, by assigning the pattern to the code word closest in Hamming distance.
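The relation between minimum Hamming distance and error-correcting capability can be checked numerically. The sketch below is illustrative only; the function names `min_hamming_distance` and `correctable_bits` and the small example matrix in the usage note are assumptions, not part of the original paper.

```python
from itertools import combinations

import numpy as np

def min_hamming_distance(Z):
    """Smallest Hamming distance between any pair of rows (code words) of Z."""
    Z = np.asarray(Z)
    return min(int(np.sum(Z[a] != Z[b])) for a, b in combinations(range(len(Z)), 2))

def correctable_bits(Z):
    """Largest d such that the minimum Hamming distance is at least 2d + 1."""
    return (min_hamming_distance(Z) - 1) // 2
```

For the three code words `[0,0,0,0,0]`, `[1,1,1,0,0]` and `[0,0,1,1,1]`, the pairwise distances are 3, 3 and 4, so the minimum distance is 3 and a single bit error can be corrected.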
Clearly, from this perspective it is desirable to use a matrix containing code words having high minimum Hamming distance between any pair.

To solve a multi-class problem in the ECOC framework we need a set of codes to decompose the original problem, a suitable two-class base classifier, and a decision-making framework. For a K-class problem, each row of the K x B binary ECOC matrix Z acts as a code word for each class. Each of the B columns of Z partitions the training data into two super-classes according to the value of the corresponding binary element. To classify pattern $x_m$, it is applied to the B trained base classifiers forming vector $[x_{m1}, x_{m2}, \ldots, x_{mB}]$ where $x_{mj}$ is the output of the jth base classifier. The $L_1$ norm distance $L_i$ (where $i = 1, \ldots, K$) between output vector and code word for each class is computed

$$L_i = \sum_{j=1}^{B} \left| Z_{ij} - x_{mj} \right| \qquad (7)$$

and $x_m$ is assigned to the class $\omega_m$ corresponding to the closest code word. Pattern $x_m$ is classified using only those classifiers that are in the set $OOB_m$, defined as the set of classifiers for which $x_m$ is OOB. For the OOB estimate, the summation in equation (7) is therefore modified to $j \in OOB_m$.

In [17] it is shown that any variation in Hamming distance between pairs of code words will reduce the effectiveness of the combining strategy. In [18] it is shown that maximising the minimum Hamming Distance between code words implies minimising upper bounds on generalisation error. In classical coding theory, theorems on error-correcting codes guarantee a reduction in the noise in a communication channel, but the assumption is that errors are independent. When applied to machine learning the situation is more complex, in that error correlation depends on the data set and base classifier as well as the code matrix Z. In the original ECOC approach [15], heuristics were employed to maximise the distance between the columns of Z to reduce error correlation. Random codes, provided that they are long enough, have frequently been employed with almost as good performance [17]. It would seem to be a matter of individual interpretation whether long random codes may be considered to approximate required error-correcting properties. In this paper, a random code matrix with near equal split of classes (approximately equal number of 1's in each column) is chosen, as proposed in [19].

3 Feature Ranking

The aim of feature selection is to find a feature subset from the original set of features such that an induction algorithm that is run on data containing only those features generates a classifier that has the highest possible accuracy [20]. Typically with tens of features in the original set, an exhaustive search is computationally prohibitive. Indeed the problem is known to be NP-hard [], and a greedy search scheme is required. However, some recent problems such as those in gene selection and text categorization require feature selection to be applied to hundreds and thousands of features.
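To make the ECOC decoding rule of Section 2.2 concrete, the sketch below builds a random code matrix with a near-equal split of classes per column and decodes a vector of base classifier outputs by the $L_1$ distance of equation (7). It is an illustration under stated assumptions: the function names, the choice of exactly `K // 2` ones per column, and the optional `use` mask (standing in for the set $OOB_m$) are not taken from the paper.

```python
import numpy as np

def random_code_matrix(n_classes, n_columns, rng):
    """Random binary K x B code matrix with n_classes // 2 ones in every column."""
    Z = np.zeros((n_classes, n_columns), dtype=int)
    n_ones = n_classes // 2
    for j in range(n_columns):
        Z[rng.permutation(n_classes)[:n_ones], j] = 1
    return Z

def ecoc_decode(Z, outputs, use=None):
    """Assign a pattern to the class whose code word is closest in L1 distance.

    Z:       (K, B) binary code matrix
    outputs: (B,) base classifier outputs in [0, 1] for one pattern
    use:     optional (B,) boolean mask of participating classifiers,
             for example those for which the pattern is out-of-bootstrap
    """
    Z = np.asarray(Z, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    if use is None:
        use = np.ones(Z.shape[1], dtype=bool)
    use = np.asarray(use, dtype=bool)
    distances = np.sum(np.abs(Z[:, use] - outputs[use]), axis=1)
    return int(np.argmin(distances)), distances
```

Passing the OOB mask of a training pattern as `use` restricts the sum to $j \in OOB_m$, which gives the ECOC form of the ensemble OOB estimate; ties between code words are resolved here in favour of the lowest class index, which is an arbitrary choice.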
For these problems, classical feature selection schemes are not greedy enough, and filter, wrapper and embedded approaches have been developed [21].

One-dimensional feature ranking methods consider each feature in isolation and rank the features according to a scoring function, but are disadvantaged by implicit orthogonality assumptions [21]. They are very efficient but in general have been shown to be inferior to multi-dimensional methods [21] that consider all features simultaneously. A feature scoring method is a function Score(j), where $j = 1, \ldots, p$ is a feature, for which higher scores usually indicate more influential features. One-dimensional functions ignore all p-1 remaining features, whereas a multi-dimensional scoring function considers correlations with remaining features. Four one-dimensional scoring functions are described in [22] and compared with the noisy bootstrap (Section 3) and other regularization techniques.

The issue of feature relevance, redundancy and irrelevance has been explicitly addressed in many papers. As noted in [23] it is possible to think up examples for which

two features may appear irrelevant by themselves but be relevant when considered together. Also adding redundant features can provide the desirable effect of noise reduction. It thus appears necessary to do more than just consider individual features by themselves as with one-dimensional methods.

The most important problem arises from the relatively small number of patterns relative to the number of features. In Pattern Recognition this is known as the small sample size problem, that is, when the number of patterns is less than or of comparable size to the number of features [1]. It means that there is a risk of the classifier over-fitting the data, and thereby capturing unwanted idiosyncrasies. A popular way to avoid this is to utilize simple, for example linear, classifiers.

Feature ranking problems have received much attention in the literature. However, there has been relatively little work devoted to handling feature ranking explicitly in the context of MCS. Most previous work has focused on determining feature subsets to combine, but differs in the way the subsets are chosen. The Random Subspace Method (RSM) [3] is the best known method, and it was shown that a random choice of feature subset (allowing a single feature to be in more than one subset) improves performance for high-dimensional problems. In [1], forward feature and random (without replacement) selection methods are used to sequentially determine disjoint optimal subsets. In [24], feature subsets are chosen based on how well a feature correlates with a particular class. Ranking subsets of randomly chosen features before combining was reported in [25].

3.1 Ranking by MLP weights

The equation for the output O of a single output single hidden-layer MLP, assuming sigmoid activation function S, is given by

$$O = \sum_{j} S\left( \sum_{i} x_i W^{1}_{ij} \right) W^{2}_{j} \qquad (8)$$

where i, j are the input and hidden node indices, $x_i$ is the input feature, $W^1$ is the first layer weight matrix and $W^2$ is the output weight vector.
In [26], a local feature selection gain $w_i$ is derived from equation (8)

$$w_i = \sum_{j} \left| W^{1}_{ij} W^{2}_{j} \right| \qquad (9)$$

The feature ranking strategy that uses equation (9) will subsequently be referred to as mod-. This product of weights strategy has been found in general not to give a reliable feature ranking [27]. However, when used with RFE it is only required to find the least relevant features. We have not experimented with any more sophisticated strategies based on sensitivity analysis [28].

3.2 Ranking by Noisy Bootstrap

Fisher's criterion measures the separation between two sets of patterns in a direction w, and is defined for the projected patterns as the difference in means normalized by the averaged variance

$$J(w) = \frac{\left| m_1 - m_2 \right|^2}{\sigma_1^2 + \sigma_2^2} \qquad (10)$$
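The two scoring functions introduced so far, the product-of-weights gain of equation (9) and Fisher's criterion of equation (10) applied to each feature in isolation, can be sketched as below. The weight layout (`W1` with one row per input feature, `W2` with one entry per hidden node), the 0/1 class labels and the function names are assumptions made for this illustration; how the weights are obtained from a trained MLP is outside the sketch.

```python
import numpy as np

def weight_product_gain(W1, W2):
    """Equation (9): w_i = sum_j |W1[i, j] * W2[j]|.

    W1: (p, h) first layer weights, one row per input feature
    W2: (h,) output weight vector
    """
    W1 = np.asarray(W1, dtype=float)
    W2 = np.asarray(W2, dtype=float)
    return np.sum(np.abs(W1 * W2[None, :]), axis=1)

def fisher_scores(X, labels):
    """One-dimensional Fisher criterion for each feature of a two-class problem.

    X:      (n, p) array of patterns
    labels: (n,) array of 0/1 class labels
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    X0, X1 = X[labels == 0], X[labels == 1]
    numerator = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    denominator = X0.var(axis=0) + X1.var(axis=0)
    # Features with zero variance in both classes receive a score of zero.
    return np.divide(numerator, denominator,
                     out=np.zeros_like(numerator), where=denominator > 0)
```

In both cases a higher score is taken to indicate a more influential feature, so sorting the scores in descending order gives a feature ranking.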

FLD is defined as the linear function for which J(w) is maximized. It is convenient to re-write J(w) as

$$J(w) = \frac{w^{T} S_B w}{w^{T} S_W w} \qquad (11)$$

where $S_B$ is the between-class scatter matrix and $S_W$ is the within-class scatter matrix. The objective of FLD is to find the transformation matrix w* that maximises J(w) in equation (10), and w* is known to be the solution of the following eigenvalue problem $S_B - S_W \Lambda = 0$, where $\Lambda$ is a diagonal matrix whose elements are the eigenvalues of matrix $S_W^{-1} S_B$. Since in practice $S_W$ is nearly always singular, dimensionality reduction is required. Typically this is performed by Principal Components Analysis (PCA) before solving the eigenvalue problem, but as noted in Section 1, that is not appropriate for our intended application.

The idea behind the noisy bootstrap is to estimate the noise in the data and extend the training set by re-sampling with simulated noise. Therefore, the number of patterns may be increased by using a re-sampling rate greater than 100 percent, thereby solving the small sample size problem. The noise model assumes a multi-variate Gaussian distribution with zero mean and diagonal covariance matrix. The reason for assuming a diagonal matrix is that there are generally insufficient patterns to make a reliable estimate of any correlations between features. For each class, the standard deviation of each feature is used for the diagonal entry. Two parameters to tune are the noise added $\gamma$ and the sample to feature ratio s2f. Following [22] we set for our experiments $\gamma = 0.25$ and s2f = 10.

Originally, the noisy bootstrap was combined with Fisher's criterion to produce a 1-dimensional feature score [29], and then subsequently with the modulus of the FLD weights. In Section 5 we will refer to the feature ranking strategy as the noisy bootstrap and assume that it incorporates the weight ranking defined by w* in equation (10).
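The re-sampling step of the noisy bootstrap can be sketched as follows. Several details are assumptions, since the text does not fix them: the noise standard deviation is taken to be $\gamma$ times the per-class standard deviation of each feature, the total number of re-sampled patterns is taken to be s2f times the number of features and is split equally between classes, and the function name `noisy_bootstrap` is hypothetical.

```python
import numpy as np

def noisy_bootstrap(X, labels, gamma=0.25, s2f=10, rng=None):
    """Re-sample each class with replacement and add zero-mean Gaussian noise.

    X:      (n, p) array of training patterns
    labels: (n,) array of class labels
    gamma:  assumed scale factor applied to the per-class feature std
    s2f:    assumed ratio of re-sampled patterns to features
    """
    rng = np.random.default_rng() if rng is None else rng
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n_features = X.shape[1]
    classes = np.unique(labels)
    per_class = int(np.ceil(s2f * n_features / len(classes)))
    X_out, y_out = [], []
    for c in classes:
        Xc = X[labels == c]
        picks = rng.integers(0, len(Xc), size=per_class)
        # Diagonal covariance: independent noise per feature, scaled by class std.
        noise = rng.normal(0.0, 1.0, size=(per_class, n_features)) * (gamma * Xc.std(axis=0))
        X_out.append(Xc[picks] + noise)
        y_out.append(np.full(per_class, c))
    return np.vstack(X_out), np.concatenate(y_out)
```

Because `per_class` may exceed the number of original patterns in a class, the re-sampling rate can be greater than 100 percent, which is what allows the within-class scatter matrix to be estimated when there are fewer patterns than features.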
3.3 Recursive Feature Elimination (RFE)

RFE is a simple algorithm [] and operates recursively as follows:

1) Rank the features according to a suitable feature ranking method
2) Identify and remove the r least ranked features

If $r \geq 2$, which is usually desirable from an efficiency viewpoint, this produces a feature subset ranking. The main advantage of RFE is that the only requirement to be successful is that at each recursion the least ranked subset does not contain a strongly relevant feature [23]. However, the choice of when to stop eliminating features is difficult and normally requires a validation set or cross-validation techniques.

4 Datasets

The artificial data is two-class according to [22], which is intended to simulate a problem in gene selection. One thousand patterns are generated per class using a diagonal p x p covariance matrix that is estimated from the colon data [31]. The difference between

the two classes is in the first $2 p_d$ features. Class $\omega_1$ has all zero mean features whereas $\omega_2$ has the first $p_d$ features set to $c > 0$ and the next $p_d$ features set to $-c$, with remaining features zero mean. Therefore $2 p_d$ are influential features, with remaining features for both classes zero mean. For our experiments, number of training patterns = 50 (2.5%), 40, 30, 20, dimension p = 100, 500, $p_d$ = 25, c = .

The main advantage of this simulated data is that we know the potentially influential features, so that it is possible to rate the feature rankings. If we assume that features are ordered with highest score indicating more influential features, consider what happens as we reduce the score threshold to include more features. Each included feature can be labeled as a true positive or a false positive, and we can plot a ROC (Receiver Operating Characteristic) curve, that is, true positives versus false positives as the threshold is changed. The area under the ROC curve is a single number used to indicate the tradeoff between sensitivity and specificity. The assumption for plotting the area under the ROC curve is that, at each recursive step, we reduce the threshold just enough to include the next subset of features. In our context, higher area indicates a better feature ranking with respect to the location of the influential features.

Natural two-class and multi-class benchmark problems have been selected from [32] and [33] and are shown in Table 1. For datasets with missing values the scheme suggested in [32] is used. For RFE testing in Section 5 the original features are normalized to mean 0 std 1 and the number of features increased to one hundred by adding noisy features (Gaussian std 0.25).

Table 1: Benchmark datasets showing numbers of patterns, classes, continuous and discrete features

DATASET #pat #class #con #dis
cancer, card, credita, dermatology, diabetes, ecoli, glass, heart, iris, ion, segment, soybean, vehicle, vote, vowel, wave, yeast
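The ROC-based rating of a feature ranking described in Section 4 can be sketched as follows. The sketch assumes that the threshold is lowered one feature at a time (the paper lowers it one eliminated subset at a time), that the area is computed with the trapezoidal rule, and that at least one influential and one non-influential feature are present; the function name `ranking_roc_auc` is hypothetical.

```python
import numpy as np

def ranking_roc_auc(scores, influential):
    """ROC curve and area for a feature ranking with known influential features.

    scores:      (p,) feature scores, higher meaning more influential
    influential: (p,) boolean array, True for the truly influential features
    """
    scores = np.asarray(scores, dtype=float)
    influential = np.asarray(influential, dtype=bool)
    order = np.argsort(-scores, kind="stable")  # best-ranked feature first
    hits = influential[order]
    # Each step lowers the threshold enough to include one more feature.
    tpr = np.concatenate(([0.0], np.cumsum(hits) / hits.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(~hits) / (~hits).sum()))
    auc = float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))
    return fpr, tpr, auc
```

A ranking that places all influential features ahead of the others gives an area of 1, a ranking that places them all last gives 0, and a random ranking is expected to give about 0.5.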

5 Experimental Evidence

All experiments use random training/testing splits, and the results are reported as mean over ten runs. Two-class problems are split 20/80 (20% training) and use 100 base classifiers. Multi-class problems are also split 20/80 but use 0 base classifiers, one for each two-class decomposition, described in Section 2.2.

The purpose of the initial experiment is to determine generalization performance as the number of hidden nodes and number of training epochs of multi-layer perceptron (MLP) base classifiers are systematically varied. Each node-epoch combination is repeated ten times with the same number of nodes and epochs used for each MLP. All other parameters of the base classifier MLPs are fixed at the same values over all runs. The number of hidden nodes is varied over 2-16 and number of training epochs over 1-69 (log scale). Random perturbation of the MLP base classifiers is caused by different starting weights on each run, combined with one hundred percent bootstrapped training patterns. The experiment is performed with one hundred single hidden-layer MLP base classifiers, using the Levenberg-Marquardt training algorithm with default parameters ($\mu_{init} = 0.001$, $\mu_{dec} = 0.1$, $\mu_{inc} = 10$).

Figure 1: Mean test error rates, OOB estimates, measures σ, Q for Diabetes 20/80 with [2,4,8,16] nodes. Panels, each plotted against number of epochs: (a) Base Test, (b) Ensemble Test, (c) Base OOB, (d) Ensemble OOB (error rates %), (e) σ, (f) Q (coefficient).

Figure 1 shows Diabetes 20/80, a dataset that is known to over-fit with Boosting and other methods. Figure 1 (a) (b) shows base classifier and ensemble test error rates, (c) (d) the base classifier and ensemble OOB estimates and (e) (f) the measures σ, Q defined in

equations (5) and (6) for various node-epoch combinations. It may be seen that σ and base classifier OOB are good predictors of base classifier test error rates as base classifier complexity is varied. The correlation between σ and test error was thoroughly investigated in [10], showing high values of correlation that were significant (95 % confidence when compared with random chance). In [10] it was also shown that bootstrapping did not significantly change the ensemble error rates, actually improving them slightly on average. The class separability measure σ shows that the base classifier test error rates are optimized when the number of epochs is chosen to maximize class separability. Furthermore, at the optimal number of epochs Q shows that diversity is minimized. It appears that base classifiers starting from random weights increase correlation (reduce diversity) as complexity is increased, and that correlation peaks as the classifier starts to over-fit the data. A possible explanation of the over-fitting behavior is that classifiers produce different fits of those patterns in the region where classes are overlapped [10].

Note from Figure 1 that the ensemble is more resistant to over-fitting than the base classifier for epochs greater than 7, and the ensemble OOB accurately predicts this trend. This experiment was performed for all the datasets, and in general the ensemble test error was found to be more resistant to over-fitting for both two-class and multi-class datasets. Figure 2 shows similar curves to Figure 1 averaged over all multi-class datasets. Based on these results 8 hidden nodes was chosen, with 7 epochs for two-class, epochs for multiclass and 10 epochs for artificial data.

Figure 2: Mean test error rates, OOB estimates, measures σ, Q over ten multiclass 20/80 datasets with [2,4,8,16] nodes. Panels, each plotted against number of epochs: (a) Base Test, (b) Ensemble Test, (c) Base OOB, (d) Ensemble OOB (error rates %), (e) σ, (f) Q (coefficient).

Figure 3 shows RFE with noisy bootstrap feature ranking for two-class artificial data with one hundred features. The recursive step size is chosen using a logarithmic scale to start at 100 and finish at 2 features with minimum step size of 1. Both base classifier OOB and σ are seen to correlate well with base classifier test error. Similarly, ensemble OOB achieves minimum error at the same number of features as ensemble test error, with the exception of 1% sample size (20 patterns). We have not investigated whether increasing the number of classifiers improves the estimate for small sample size. Note from Figure 3 (a) (b) (c) (d) that the OOB estimate is generally a poor indicator of absolute generalization error.

Figure 4 (a) shows a typical ROC curve defined in Section 4 at 100 features. For RFE, it is difficult to compare feature subsets when there is a different number of features. Therefore, we keep a list of sorted features, putting the feature subset that has been eliminated on each recursion at the end of the list. At each recursive step, we then have 100 sorted features. The result is shown in Figure 4 (b), indicating that RFE consistently improves the feature ranking. The experiment was repeated without applying RFE, that is, the feature ordering obtained at 100 features is used for each feature reduction. The difference in test error rates between the two is shown in Figure 5, demonstrating that RFE makes a large difference to error rates below features.

To determine the effect of RFE on a range of two-class and multi-class problems, RFE was applied to the datasets shown in Table 1. For each dataset, the number of features is increased to 100 by adding noisy features, as explained in Section 4. The RFE curves (not shown) appeared similar to Figure 3, achieving a minimum at the number of features predicted by OOB. For two-class problems, there was no significant difference between noisy bootstrap and mod-. The mean over all features and all datasets is shown in Table 2.
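The RFE procedure with a logarithmic step schedule and the sorted feature list described above can be sketched as follows. This is a minimal illustration, not the experimental code: `rank_fn` is a placeholder for any scoring function (in the paper it would be the MLP weight ranking or the noisy bootstrap ranking computed from a trained ensemble), and the function names and the number of schedule steps are assumptions.

```python
import numpy as np

def log_sizes(start, stop, n_steps):
    """Decreasing feature counts on a logarithmic scale, with duplicates removed."""
    sizes = np.unique(np.round(np.geomspace(start, stop, n_steps)).astype(int))
    return sizes[::-1].tolist()

def rfe_sorted_features(X, y, rank_fn, sizes):
    """Recursive Feature Elimination that returns all features in sorted order.

    X:       (n, p) array of patterns; sizes[0] is assumed to equal p
    y:       (n,) array of labels, passed through to rank_fn
    rank_fn: callable (X_subset, y) -> scores, higher meaning more influential
    sizes:   decreasing list of feature counts, e.g. from log_sizes
    """
    active = np.arange(X.shape[1])
    eliminated = []  # features removed at later recursions are placed first
    for target in sizes[1:]:
        scores = np.asarray(rank_fn(X[:, active], y))
        order = np.argsort(-scores, kind="stable")
        keep, drop = order[:target], order[target:]
        # The subset removed at this recursion goes ahead of earlier removals,
        # so the earliest eliminated features end up at the end of the list.
        eliminated = active[drop].tolist() + eliminated
        active = active[keep]
    final_scores = np.asarray(rank_fn(X[:, active], y))
    active = active[np.argsort(-final_scores, kind="stable")]
    return active.tolist() + eliminated
```

The returned list always contains every original feature, so rankings obtained at different recursion depths can be compared directly, for example through the area under the ROC curve when the influential features are known.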
For comparison, using original features (Table 1) the mean error rate over all 20/80 problems was 14.1 % for two-class and 17.8 % for multi-class.

A potential problem with bootstrapping is that each base classifier sees only approximately 63% of the training patterns. To determine the effect of the reduced sample size, the RFE experiment for artificial data was repeated without bootstrapping. The minimum error achieved was 0.5 % compared with 1.2 % with bootstrapping. However, the number of features at which the OOB and the test error started to rise did not change. For seven two-class problems with 100 features, the mean best error rate was 13.4 % compared with 13.7 % with bootstrapping.

6 Conclusion

It is shown in this paper that classifier complexity and number of features may be selected using an out-of-bootstrap (OOB) error estimate. The base classifier OOB estimate achieves

a minimum when the estimate of class separability reaches a maximum. The method is extended to multi-class problems using ECOC, and is seen to be less sensitive to over-fitting when the number of features is reduced below the optimal number. Both noisy bootstrap with Linear Discriminant and the modulus of neural network weights provide a good feature ranking criterion. However, for a large number of features it is better to combine with RFE to recursively remove irrelevant features.

Table 2: Mean best error rates (%) for artificial data (2.5/97.5), seven two-class problems (20/80), ten multi-class problems (20/80). Columns: artificial 100 feats, artificial 500 feats, two-class 100 feats, multi-class 100 feats. Rows: mod- RFE, noisyboot RFE.

Figure 3: Mean test error rates, OOB estimates, measures σ, Q for RFE with noisy bootstrap feature ranking, artificial data and [1,1.5,2,2.5] % training patterns. Panels, each plotted against number of features: (a) Base Test, (b) Ensemble Test, (c) Base OOB, (d) Ensemble OOB (error rates %), (e) σ, (f) Q (coefficient).

Figure 4: (a) Typical ROC curve at 100 features (axes: False Positive, True Positive); (b) area under ROC curve for 100 sorted features using RFE (axes: Number of Features, Coefficient).

Figure 5: Test error rates with RFE minus test error rates without RFE, artificial data, [1, 1.5, 2, 2.5] % training patterns. Panels: (a) Base Test, (b) Ensemble Test (axes: number of features, Error Rates %).


More information

Performance Plus Software Parameter Definitions

Performance Plus Software Parameter Definitions Performace Plus+ Software Parameter Defiitios/ Performace Plus Software Parameter Defiitios Chapma Techical Note-TG-5 paramete.doc ev-0-03 Performace Plus+ Software Parameter Defiitios/2 Backgroud ad Defiitios

More information

. Written in factored form it is easy to see that the roots are 2, 2, i,

. Written in factored form it is easy to see that the roots are 2, 2, i, CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or

More information

The Counterchanged Crossed Cube Interconnection Network and Its Topology Properties

The Counterchanged Crossed Cube Interconnection Network and Its Topology Properties WSEAS TRANSACTIONS o COMMUNICATIONS Wag Xiyag The Couterchaged Crossed Cube Itercoectio Network ad Its Topology Properties WANG XINYANG School of Computer Sciece ad Egieerig South Chia Uiversity of Techology

More information

Lecture 1: Introduction and Strassen s Algorithm

Lecture 1: Introduction and Strassen s Algorithm 5-750: Graduate Algorithms Jauary 7, 08 Lecture : Itroductio ad Strasse s Algorithm Lecturer: Gary Miller Scribe: Robert Parker Itroductio Machie models I this class, we will primarily use the Radom Access

More information

VALIDATING DIRECTIONAL EDGE-BASED IMAGE FEATURE REPRESENTATIONS IN FACE RECOGNITION BY SPATIAL CORRELATION-BASED CLUSTERING

VALIDATING DIRECTIONAL EDGE-BASED IMAGE FEATURE REPRESENTATIONS IN FACE RECOGNITION BY SPATIAL CORRELATION-BASED CLUSTERING VALIDATING DIRECTIONAL EDGE-BASED IMAGE FEATURE REPRESENTATIONS IN FACE RECOGNITION BY SPATIAL CORRELATION-BASED CLUSTERING Yasufumi Suzuki ad Tadashi Shibata Departmet of Frotier Iformatics, School of

More information

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein 068.670 Subliear Time Algorithms November, 0 Lecture 6 Lecturer: Roitt Rubifeld Scribes: Che Ziv, Eliav Buchik, Ophir Arie, Joatha Gradstei Lesso overview. Usig the oracle reductio framework for approximatig

More information

Text Feature Selection based on Feature Dispersion Degree and Feature Concentration Degree

Text Feature Selection based on Feature Dispersion Degree and Feature Concentration Degree Available olie at www.ijpe-olie.com vol. 13, o. 7, November 017, pp. 1159-1164 DOI: 10.3940/ijpe.17.07.p19.11591164 Text Feature Selectio based o Feature Dispersio Degree ad Feature Cocetratio Degree Zhifeg

More information

Chapter 2 and 3, Data Pre-processing

Chapter 2 and 3, Data Pre-processing CSI 4352, Itroductio to Data Miig Chapter 2 ad 3, Data Pre-processig Youg-Rae Cho Associate Professor Departmet of Computer Sciece Baylor Uiversity Why Need Data Pre-processig? Icomplete Data Missig values,

More information

Analysis of Documents Clustering Using Sampled Agglomerative Technique

Analysis of Documents Clustering Using Sampled Agglomerative Technique Aalysis of Documets Clusterig Usig Sampled Agglomerative Techique Omar H. Karam, Ahmed M. Hamad, ad Sheri M. Moussa Abstract I this paper a clusterig algorithm for documets is proposed that adapts a samplig-based

More information

CIS 121 Data Structures and Algorithms with Java Fall Big-Oh Notation Tuesday, September 5 (Make-up Friday, September 8)

CIS 121 Data Structures and Algorithms with Java Fall Big-Oh Notation Tuesday, September 5 (Make-up Friday, September 8) CIS 11 Data Structures ad Algorithms with Java Fall 017 Big-Oh Notatio Tuesday, September 5 (Make-up Friday, September 8) Learig Goals Review Big-Oh ad lear big/small omega/theta otatios Practice solvig

More information

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb Chapter 3 Descriptive Measures Measures of Ceter (Cetral Tedecy) These measures will tell us where is the ceter of our data or where most typical value of a data set lies Mode the value that occurs most

More information

EM375 STATISTICS AND MEASUREMENT UNCERTAINTY LEAST SQUARES LINEAR REGRESSION ANALYSIS

EM375 STATISTICS AND MEASUREMENT UNCERTAINTY LEAST SQUARES LINEAR REGRESSION ANALYSIS EM375 STATISTICS AND MEASUREMENT UNCERTAINTY LEAST SQUARES LINEAR REGRESSION ANALYSIS I this uit of the course we ivestigate fittig a straight lie to measured (x, y) data pairs. The equatio we wat to fit

More information

Announcements. Recognition III. A Rough Recognition Spectrum. Projection, and reconstruction. Face detection using distance to face space

Announcements. Recognition III. A Rough Recognition Spectrum. Projection, and reconstruction. Face detection using distance to face space Aoucemets Assigmet 5: Due Friday, 4:00 III Itroductio to Computer Visio CSE 52 Lecture 20 Fial Exam: ed, 6/9/04, :30-2:30, LH 2207 (here I ll discuss briefly today, ad will be at discussio sectio tomorrow

More information

On the Accuracy of Vector Metrics for Quality Assessment in Image Filtering

On the Accuracy of Vector Metrics for Quality Assessment in Image Filtering 0th IMEKO TC4 Iteratioal Symposium ad 8th Iteratioal Workshop o ADC Modellig ad Testig Research o Electric ad Electroic Measuremet for the Ecoomic Uptur Beeveto, Italy, September 5-7, 04 O the Accuracy

More information

Arithmetic Sequences

Arithmetic Sequences . Arithmetic Sequeces COMMON CORE Learig Stadards HSF-IF.A. HSF-BF.A.1a HSF-BF.A. HSF-LE.A. Essetial Questio How ca you use a arithmetic sequece to describe a patter? A arithmetic sequece is a ordered

More information

Lecture 28: Data Link Layer

Lecture 28: Data Link Layer Automatic Repeat Request (ARQ) 2. Go ack N ARQ Although the Stop ad Wait ARQ is very simple, you ca easily show that it has very the low efficiecy. The low efficiecy comes from the fact that the trasmittig

More information

Probabilistic Fuzzy Time Series Method Based on Artificial Neural Network

Probabilistic Fuzzy Time Series Method Based on Artificial Neural Network America Joural of Itelliget Systems 206, 6(2): 42-47 DOI: 0.5923/j.ajis.2060602.02 Probabilistic Fuzzy Time Series Method Based o Artificial Neural Network Erol Egrioglu,*, Ere Bas, Cagdas Haka Aladag

More information

Primitive polynomials selection method for pseudo-random number generator

Primitive polynomials selection method for pseudo-random number generator Joural of hysics: Coferece Series AER OEN ACCESS rimitive polyomials selectio method for pseudo-radom umber geerator To cite this article: I V Aiki ad Kh Alajjar 08 J. hys.: Cof. Ser. 944 0003 View the

More information