Additive Groves of Regression Trees


Daria Sorokina, Rich Caruana, and Mirek Riedewald
Department of Computer Science, Cornell University, Ithaca, NY, USA

Abstract. We present a new regression algorithm called Groves of trees and show empirically that it is superior in performance to a number of other established regression methods. A Grove is an additive model usually containing a small number of large trees. Trees added to the Grove are trained on the residual error of the other trees already in the Grove. We begin the training process with a single small tree in the Grove and gradually increase both the number of trees in the Grove and their size. This procedure ensures that the resulting model captures the additive structure of the response. A single Grove may still overfit to the training set, so we further decrease the variance of the final predictions with bagging. We show that in addition to exhibiting superior performance on a suite of regression test problems, bagged Groves of trees are very resistant to overfitting.

1 Introduction

We present a new regression algorithm called Grove, an ensemble of additive regression trees. We initialize a Grove with a single small tree. The Grove is then gradually expanded: on every iteration either a new tree is added, or the trees that are already in the Grove are made larger. This process is designed to try to find the simplest model (a Grove with the fewest number of small trees) that captures the underlying additive structure of the target function. As training progresses, this algorithm yields a sequence of Groves of slowly increasing complexity. Eventually, the largest Groves may begin to overfit the training set even as they continue to learn important additive structure. This overfitting is reduced by applying bagging on top of the Grove learning process.

In Section 2 we describe the Grove algorithm step by step, beginning with the classical way of training additive models and incrementally making this process more complicated and better performing at each step. In Section 3 we compare bagged Groves with two other regression ensembles: bagged regression trees and stochastic gradient boosting. The results show that bagged Groves outperform these other methods and work especially well on highly non-linear data sets. In Section 4 we show that bagged Groves are resistant to overfitting. We conclude and discuss future work in Section 5.

J.N. Kok et al. (Eds.): ECML 2007, LNAI 4701, pp. 323-334, 2007. © Springer-Verlag Berlin Heidelberg 2007

2 Algorithm

Bagged Groves of Trees, or bagged Groves for short, is an ensemble of regression trees. Specifically, it is a bagged additive model of regression trees, where each individual additive model is trained in an adaptive way by gradually increasing both the number of trees and their complexity.

Regression Trees. The unit model in a Grove is a regression tree. Algorithms for training regression trees differ in two major aspects: (1) the criterion for choosing the best split in a node and (2) the way in which tree complexity is controlled. We use trees that optimize RMSE (root mean squared error), and we control tree complexity (size) by imposing a limit on the size (number of cases) at an internal node. If the fraction of the data points that reach a node is less than a specified threshold α, then the node is declared a leaf and is not split further. Hence the smaller α, 0 ≤ α ≤ 1, the larger the tree. (See Fig. 7.) Note that because we will later bag the tree models, the specific choice of regression tree is not particularly important. The main requirement is that the complexity of the tree should be controllable.

2.1 Additive Models: Classical Algorithm

A Grove of trees is an additive model where each additive term is represented by a regression tree. The prediction of a Grove is computed as the sum of the predictions of these trees:

    F(x) = T_1(x) + T_2(x) + ... + T_N(x).

Here each T_i(x), 1 ≤ i ≤ N, is the prediction made by the i-th tree in the Grove. The Grove model has two main parameters: N, the number of trees in the Grove, and α, which controls the size of each individual tree. We use the same value of α for all trees in a Grove.

In statistics, the basic mechanism for training an additive model with a fixed number of components is the backfitting algorithm [1]. We will refer to this as the Classical algorithm for training a Grove of regression trees (Algorithm 1). The algorithm cycles through the trees until they converge. The first tree in the Grove is trained on the original data set, a set of training points {(x, y)}. Let T̂_1 denote the function encoded by this tree. Then we train the second tree, which encodes T̂_2, on the residuals, i.e., on the set {(x, y − T̂_1(x))}. The third tree then is trained on the residuals of the first two, i.e., on {(x, y − T̂_1(x) − T̂_2(x))}, and so on. After we have trained N trees this way, we discard the first tree and retrain it on the residuals of the other N − 1 trees, i.e., on the set {(x, y − T̂_2(x) − T̂_3(x) − ... − T̂_N(x))}. Then we similarly discard and retrain the second tree, and so on. We keep cycling through the trees in this way until there is no significant improvement in the RMSE on the training set.

Algorithm 1. Classical additive model training

    function Classical(α, N, {x,y})
        for i = 1 to N do
            Tree_i^(α,N) = 0
        Converge(α, N, {x,y}, Tree_1^(α,N), ..., Tree_N^(α,N))

    function Converge(α, N, {x,y}, Tree_1^(α,N), ..., Tree_N^(α,N))
        repeat
            for i = 1 to N do
                newTrainSet = {x, y − Σ_{k≠i} Tree_k^(α,N)(x)}
                Tree_i^(α,N) = TrainTree(α, newTrainSet)
        until (change from the last iteration is small)
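To make Algorithm 1 concrete, the following is a minimal Python sketch of backfitting with scikit-learn regression trees. It is an illustration, not the authors' implementation; in particular, using min_samples_split as a stand-in for the α threshold is our assumption.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def converge(trees, X, y, alpha, max_cycles=20, tol=1e-4):
        # Backfit: retrain each tree on the residuals of all the others,
        # cycling until training RMSE stops improving noticeably.
        n = len(y)
        min_split = max(2, int(np.ceil(alpha * n)))  # stand-in for alpha
        preds = np.array([np.zeros(n) if t is None else t.predict(X)
                          for t in trees])
        prev = np.inf
        for _ in range(max_cycles):
            for i in range(len(trees)):
                residual = y - (preds.sum(axis=0) - preds[i])
                trees[i] = DecisionTreeRegressor(
                    min_samples_split=min_split).fit(X, residual)
                preds[i] = trees[i].predict(X)
            rmse = np.sqrt(np.mean((y - preds.sum(axis=0)) ** 2))
            if prev - rmse < tol:
                break
            prev = rmse
        return trees

    def train_grove_classical(X, y, alpha, n_trees):
        # Start from N zero trees, then let backfitting converge.
        return converge([None] * n_trees, X, y, alpha)

The prediction of the resulting Grove is simply the sum of the predictions of its individual trees.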

Bagging. As with single decision trees, a single Grove tends to overfit to the training set when the trees are large. Such models show a large variance with respect to specific subsamples of the training data and benefit significantly from bagging, a well-known procedure for improving model performance by reducing variance [2]. On each iteration of bagging, we draw a bootstrap sample (bag) from the training set and train the full model (in our case a Grove of additive trees) from that sample. After repeating this procedure a number of times, we end up with an ensemble of models. The final prediction of the ensemble on each test data point is the average of the predictions of all models.

Example. In this section we illustrate the effects of different methods of training bagged Groves on synthetic data. The synthetic data set was generated by a function of 10 variables that was previously used by Hooker [3]:

    F(x) = π^(x1·x2) √(2·x3) − sin⁻¹(x4) + log(x3 + x5) − (x9/x10)·√(x7/x8) − x2·x7    (1)

Variables x1, x2, x3, x6, x7, x9 are uniformly distributed between 0.0 and 1.0, and variables x4, x5, x8, and x10 are uniformly distributed between 0.6 and 1.0. (The ranges are selected to avoid extremely large or small function values.)

Figure 1 shows a contour plot of how model performance depends on both α, the size of the trees, and N, the number of trees in a Grove, for 100 bagged Groves trained with the classical method on 1000 training points from the above data set. The performance is measured as RMSE on an independent test set consisting of 25,000 points. Notice that lower RMSE implies better performance. The bottommost horizontal line for N = 1 corresponds to bagging single trees. The plot clearly indicates that by introducing additive model structure, with N > 1, performance improves significantly. We can also see that the best performance is achieved by Groves containing 5-10 relatively small trees (large α), while for larger trees performance deteriorates.

2.2 Layered Training

When individual trees in a Grove are large and complex, the Classical additive model training algorithm (Section 2.1) can overfit even if bagging is applied.
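For reference, the synthetic data set of Equation (1) is easy to reproduce. The sketch below follows the equation and the stated variable ranges; the helper names are ours.

    import numpy as np

    def hooker_response(x):
        # Response function of Equation (1); x has 10 columns, x6 is unused.
        x1, x2, x3, x4, x5, x6, x7, x8, x9, x10 = x.T
        return (np.pi ** (x1 * x2) * np.sqrt(2.0 * x3)
                - np.arcsin(x4)
                + np.log(x3 + x5)
                - (x9 / x10) * np.sqrt(x7 / x8)
                - x2 * x7)

    def sample_hooker(n, rng=None):
        rng = np.random.default_rng(0) if rng is None else rng
        x = np.empty((n, 10))
        # x1, x2, x3, x6, x7, x9 ~ U[0.0, 1.0]
        x[:, [0, 1, 2, 5, 6, 8]] = rng.uniform(0.0, 1.0, size=(n, 6))
        # x4, x5, x8, x10 ~ U[0.6, 1.0]  (avoids extreme function values)
        x[:, [3, 4, 7, 9]] = rng.uniform(0.6, 1.0, size=(n, 4))
        return x, hooker_response(x)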

Consider the extreme case α = 0, i.e., a Grove of full trees. The first tree will perfectly model the training data, leaving residuals with value 0 for the other trees in the Grove. Hence the intended Grove of several large trees will degenerate to a single tree. One could address this issue by limiting trees to very small size. However, we still would like to be able to use large trees in a Grove so that we can capture complex and non-linear functions.

To prevent the degeneration of the Grove as the trees become larger, we developed a layered training approach. In the first round we grow N small trees. Then, in later cycles of discarding and re-training the trees in the Grove, we gradually increase tree size. More precisely, no matter what the value of α, we always start the training process with small trees, typically using a start value α_0 = 0.5. Let α_j denote the value of the size parameter after j iterations of the Layered algorithm (Algorithm 2). After reaching convergence for α_{j−1}, we increase tree complexity by setting α_j to approximately half the value of α_{j−1}. We continue to cycle through the trees, re-training all trees in the Grove in the usual way, but now allow them to reach the larger size corresponding to the new α_j, and, as before, we proceed until the Grove converges on this layer. We keep gradually increasing tree size until α_j ≤ α. For a training set with 1000 data points and α = 0, we use the following sequence of values of α_j: (0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001).

Algorithm 2. Layered training

    function Layered(α, N, train)
        α_0 = 0.5, α_1 = 0.2, α_2 = 0.1, ..., α_max = α
        for j = 0 to max do
            if j = 0 then
                for i = 1 to N do
                    Tree_i^(α_0,N) = 0
            else
                for i = 1 to N do
                    Tree_i^(α_j,N) = Tree_i^(α_{j−1},N)
            Converge(α_j, N, train, Tree_1^(α_j,N), ..., Tree_N^(α_j,N))

It is worth noting that while training a Grove of large trees, we automatically obtain all Groves with the same N for all smaller tree sizes in the sequence.

Figure 2 shows how 100 bagged Groves trained by the layered approach perform on the synthetic data set. Overall performance is much better than for the classical algorithm, and bagged Groves of N large trees now perform at least as well as bagged Groves of N smaller trees.
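A compact sketch of this schedule, reusing the converge routine from the sketch in Section 2.1 (the driver itself is ours; the α values are those listed above):

    def train_grove_layered(X, y, alpha, n_trees,
                            schedule=(0.5, 0.2, 0.1, 0.05, 0.02, 0.01,
                                      0.005, 0.002, 0.001)):
        # Warm start: the same trees are re-converged at each smaller
        # alpha_j, so small trees are grown before large ones are allowed.
        trees = [None] * n_trees
        for a in [s for s in schedule if s > alpha] + [alpha]:
            trees = converge(trees, X, y, a)
        return trees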

2.3 Dynamic Programming Training

There is no reason to believe that the best (α, N) Grove should always be constructed from a (2α, N) Grove. In fact, a large number of small trees might overfit the training data and hence limit the benefit of increasing tree size in later iterations. To avoid this problem, we need to give the Grove training algorithm additional flexibility in choosing the right balance between increasing tree size and the number of trees. This is the motivation behind the Dynamic Programming Grove training algorithm. This algorithm can choose to construct a new Grove from an existing one either by adding a new tree (while keeping tree size constant) or by increasing tree size (while keeping the number of trees constant). Considering the parameter grid, the Grove for a grid point (α_j, n) can be constructed either from its left neighbor (α_{j−1}, n) or from its lower neighbor (α_j, n − 1). Pseudo-code for this approach is shown in Algorithm 3.

Algorithm 3. Dynamic Programming training

    function DP(α, N, trainSet)
        α_0 = 0.5, α_1 = 0.2, α_2 = 0.1, ..., α_max = α
        for j = 0 to max do
            for n = 1 to N do
                for i = 1 to n − 1 do
                    Tree_attempt1,i = Tree_i^(α_j,n−1)
                Tree_attempt1,n = 0
                Converge(α_j, n, train, Tree_attempt1,1, ..., Tree_attempt1,n)
                if j > 0 then
                    for i = 1 to n do
                        Tree_attempt2,i = Tree_i^(α_{j−1},n)
                    Converge(α_j, n, train, Tree_attempt2,1, ..., Tree_attempt2,n)
                winner = compare Tree_attempt1,* and Tree_attempt2,* on the validation set
                for i = 1 to n do
                    Tree_i^(α_j,n) = Tree_winner,i

We make the choice between the two options for computing each Grove (adding another tree or making the trees larger) in a greedy manner, i.e., we pick the one that results in better performance of the Grove on the validation set. We use the out-of-bag data points [4] as the validation set for choosing which of the two Groves to use at each step.

Figure 3 shows how the Dynamic Programming approach improves bagged Groves over the layered training. Figure 4 shows the choices that are made during the process: it plots the average difference between the RMSE of the Grove created from the lower neighbor (increase n) and the RMSE of the Grove created from the left neighbor (decrease α_j). A negative value means that the former is preferred, while a positive value means that the latter is preferred at that grid point. We can see that for this data set increasing the tree size is the preferred direction, except for cases with many small trees.

This dynamic programming version of the algorithm does not explore all possible sequences of steps to build a Grove of trees, because we require that every Grove built in the process contain trees of equal size. We have tested several other possible approaches that do not have this restriction, but they failed to produce any improvements and were noticeably worse in terms of running time. For these reasons we prefer the dynamic programming version over the other, less restricted options.

2.4 Randomized Dynamic Programming Training

Our bagged Grove training algorithms so far performed bagging in the usual way, i.e., create a bag of data, train all Groves for different values of (α, N) on that bag, then create the next bag, generate all models on this bag, and so on for 100 different bags. When the Dynamic Programming algorithm generates a Grove using the same bag, i.e., the same training set that was used to generate its left and lower neighbors, complex models might not be very different from their neighbors, because those neighbors already might have overfitted and there is not enough training data to learn anything new. We can address this problem by using a different bag of data on every step of the Dynamic Programming algorithm, so that every Grove has some new data to learn from. While the performance of a single Grove might become worse, the performance of bagged Groves improves due to increased variability in the models. Figure 5 shows the improved performance of this final version of our Grove training approach. The most complex Groves are now performing worse than their left neighbors with smaller trees. This happens because those models need more bagging steps to converge to their best quality. Figure 6 shows the same plot for bagging with 500 iterations, where the property "more complex models are at least as good as their less complex counterparts" is restored.
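The grid recurrence can be sketched as follows, again on top of the converge routine from Section 2.1. Here rmse_oob is a caller-supplied function estimating RMSE on the out-of-bag points; it and the dictionary bookkeeping are our placeholders, not part of the paper's pseudocode. For the randomized variant, one would additionally draw a fresh bootstrap sample of the training data before each converge call.

    import copy

    def train_grove_dp(X, y, alphas, N, rmse_oob):
        grid = {}  # (j, n) -> list of converged trees
        for j, a in enumerate(alphas):  # alphas ordered from large to small
            for n in range(1, N + 1):
                # attempt 1: lower neighbor (alpha_j, n-1) plus one zero tree
                candidates = [copy.deepcopy(grid.get((j, n - 1), [])) + [None]]
                if j > 0:
                    # attempt 2: left neighbor (alpha_{j-1}, n), trees enlarged
                    candidates.append(copy.deepcopy(grid[(j - 1, n)]))
                for trees in candidates:
                    converge(trees, X, y, a)
                # greedy choice on the out-of-bag validation points
                grid[(j, n)] = min(candidates, key=rmse_oob)
        return grid[(len(alphas) - 1, N)]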

[Figures 1-6 are contour plots over the parameter grid: the x-axis is α (size of leaf), the y-axis is the number of trees in a Grove.]

Fig. 1. RMSE of bagged Grove, Classical algorithm.
Fig. 2. RMSE of bagged Grove, Layered algorithm.
Fig. 3. RMSE of bagged Grove, Dynamic Programming algorithm.
Fig. 4. Difference in performance between horizontal and vertical steps.
Fig. 5. RMSE of bagged Grove (100 bags), Randomized Dynamic Programming algorithm.
Fig. 6. RMSE of bagged Grove (500 bags), Randomized Dynamic Programming algorithm.

3 Experiments

We evaluated bagged Groves of trees on 2 synthetic and 5 real-world data sets and compared their performance to that of two other regression tree ensemble methods that are known to perform well: stochastic gradient boosting and bagged regression trees. Bagged Groves consistently outperform both of them.

For the real data sets we performed 10-fold cross validation: for each run 8 folds were used as the training set, 1 fold as the validation set for choosing the best set of parameters, and the last fold as the test set for measuring performance. For the two synthetic data sets we generated 30 blocks of data containing 1000 points each and performed 10 runs using different blocks for the training, validation, and test sets. We report the mean and standard deviation of the RMSE on the test set. Table 1 shows the results; for comparability across data sets, all numbers are scaled by the standard deviation of the response in the data set itself.

Table 1. Performance of bagged Groves (Randomized Dynamic Programming training) compared to boosting and bagging: mean and standard deviation of the scaled RMSE on the test set over 10 runs, for bagged Groves, boosting, and bagged trees, on California Housing, Elevators, Kinematics, Computer Activity, Stock, Synthetic (no noise), and Synthetic (noise). [The numeric entries of this table did not survive transcription.]
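The scaling used in Table 1 is straightforward to reproduce (a sketch; the function name is ours):

    import numpy as np

    def scaled_rmse(y_true, y_pred, y_all):
        # RMSE divided by the standard deviation of the response in the
        # data set, so scores are comparable across data sets; a constant
        # predictor at the mean would score about 1.0.
        return np.sqrt(np.mean((y_true - y_pred) ** 2)) / np.std(y_all)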

3.1 Parameter Settings

Groves. We trained 100 bagged Groves using the Randomized Dynamic Programming technique for all combinations of the parameters N and α with 1 ≤ N ≤ 15 and α ∈ {0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005}. Notice that with these settings the resulting ensemble can consist of at most 1500 trees. From these models we selected the one that gave the best results on the validation set. The performance of the selected Grove on the test set is reported.

Stochastic Gradient Boosting. The obvious main competitor to bagged Groves is gradient boosting [5,6], a different ensemble of trees that is also based on additive models. There are two major differences between boosting and Groves. First, boosting never discards trees, i.e., every generated tree stays in the model, whereas a Grove iteratively retrains its trees. Second, all trees in a boosting ensemble are always built to a fixed size, while Groves of large trees are trained first using Groves of smaller trees. We believe that these differences allow Groves to better capture the natural additive structure of the response function.

The general gradient boosting framework supports optimizing a variety of loss functions. We selected squared-error loss because this is the loss function that the current version of the Groves algorithm optimizes. However, like gradient boosting, Groves can be modified to optimize other loss functions.

Friedman [6] recommends boosting small trees with at most 4-10 leaf nodes for best results. However, we discovered for one of our data sets that using larger trees with gradient boosting did significantly better. This is not surprising, since some real data sets contain complex interactions, which cannot be accurately modeled by small trees. For fairness we therefore also include larger boosted trees in the comparison than Friedman suggested. More precisely, we tried all α ∈ {1, 0.5, 0.2, 0.1, 0.05}. Fig. 7 shows the typical correspondence between α and the number of leaf nodes in a tree, which was very similar across the data sets. Preliminary results did not show any improvement for tree sizes beyond α = 0.05.

Stochastic gradient boosting deals with overfitting by means of two techniques: regularization and subsampling. Both techniques depend on user-set parameters. Based on recommendations in the literature and on our own evaluation, we used the following values for the final evaluation: 0.1 and 0.05 for the regularization coefficient, and 0.4, 0.6, and 0.8 for the fraction of the training set used as the subsampling set.
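With scikit-learn, a grid of this shape could look as follows. This is a sketch under our own assumptions: the paper controls tree size through α, while GradientBoostingRegressor caps it with max_leaf_nodes, so the leaf counts below are illustrative stand-ins rather than the paper's grid.

    from itertools import product
    from sklearn.ensemble import GradientBoostingRegressor

    models = [
        GradientBoostingRegressor(loss="squared_error", n_estimators=1500,
                                  max_leaf_nodes=leaves, learning_rate=lr,
                                  subsample=sub)
        for leaves, lr, sub in product([2, 4, 8, 16, 32],  # tree size
                                       [0.1, 0.05],        # regularization
                                       [0.4, 0.6, 0.8])    # subsampling
    ]
    # After fitting, staged_predict on the validation set gives predictions
    # after every boosting iteration, so the best iteration count (<= 1500)
    # can be selected together with the other three parameters.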

Fig. 7. Typical number of leaf nodes for different values of α: α = 1 yields a stump with 2 leaf nodes, and α = 0 yields a full tree. [The intermediate rows of this figure did not survive transcription.]

Fig. 8. Performance of bagged Grove for simpler and more complex models: test RMSE as a function of the number of bagging iterations, for a simpler Grove (N = 5, larger α) and a more complex Grove (N = 10, α = 0).

Boosting can also overfit if it is run for too many iterations. We tried up to 1500 iterations to make the maximum number of trees in the ensemble equal for all methods in the comparison. The actual number of iterations that performs best was determined based on the validation set, and therefore can be lower than 1500 for the best boosted model.

In summary, to evaluate stochastic gradient boosting, we tried all combinations of the values described above for the 4 parameters: size of trees, number of iterations, regularization coefficient, and subsampling size. As for Groves, we determined the best combination of values for these parameters based on a separate validation set.

Bagging. Bagging single trees is known to provide good performance by significantly decreasing the variance of the individual tree models. However, compared with Groves and boosting, which are both based on additive models, bagged trees do not explicitly model the additive structure of the response function. Increasing the number of iterations in bagging does not result in overfitting, and bagging larger trees usually produces better models than bagging smaller trees. Hence we omitted parameter tuning for bagging. Instead we simply report results for a model consisting of 1500 bagged full trees.

3.2 Datasets

Synthetic Data without Noise. This is the same data set that we used as a running example in the earlier sections. The response function is generated by Equation (1). The performance of bagged Groves on this data set is much better than the performance of the other methods.

Synthetic Data with Noise. This is the same synthetic data set, only this time Gaussian noise is added to the response function. The standard deviation σ of the noise distribution is chosen as 1/2 of the standard deviation of the response in the original data set. As expected, the performance of all methods drops. Bagged Groves still perform clearly better, but the difference is smaller.
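Reusing the generator sketched in Section 2.1, the noisy variant is a one-line change (sample_hooker is our hypothetical helper from that sketch):

    import numpy as np

    rng = np.random.default_rng(1)
    X, y = sample_hooker(1000, rng)   # clean synthetic response
    sigma = 0.5 * np.std(y)           # noise std = 1/2 of response std
    y_noisy = y + rng.normal(0.0, sigma, size=y.shape)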

We used 5 regression data sets from the collection of Luís Torgo [7] for the next set of experiments.

Kinematics. The Kinematics family of data sets originates from the Delve repository [8] and describes a simulation of robot arm movement. We used the kin8nm version of the data set: 8192 cases, 8 continuous attributes, high level of non-linearity, low level of noise. Groves show a 20% improvement over gradient boosting on this data set. It is worth noticing that boosting preferred large trees on this data set: trees with α = 0.05 showed a clear advantage over smaller trees. However, there was no further improvement for boosting even larger trees. We attribute these effects to the high non-linearity of the data.

Computer Activity. Another data set from the Delve repository; it describes the state of multiuser computer systems: 8192 cases, 22 continuous attributes. The variance of performance for all algorithms is low. Groves show a small (3%) improvement compared to boosting.

California Housing. This is a data set from the StatLib repository [9] that describes housing prices in California from the 1990 Census: 20,640 observations, 9 continuous attributes. Groves show a 6% improvement compared to boosting.

Stock. This is a relatively small (960 data points) regression data set from the StatLib repository. It describes daily stock prices for 10 aerospace companies: the task is to predict the first one from the other 9. Prediction quality from all methods is very high, so we can assume that the level of noise is small. This is another case where Groves give a significant improvement (18%) over gradient boosting.

Elevators. This data set is obtained from the task of controlling an aircraft [10]. It seems to be noisy, because the variance of performance is high although the data set is rather large: 16,559 cases with 18 continuous attributes. Here we see a 6% improvement.

3.3 Discussion

Based on the empirical results we conjecture that bagged Groves outperform the other algorithms most when the data sets are highly non-linear and not very noisy. (Noise can obscure some of the non-linearity in the response function, making the best models that can be learned from the data more linear than they would have been for models trained on the response without noise.) This can be explained as follows. Groves can capture additive structure yet at the same time use large trees. Large trees capture non-linearity and complex interactions well, and this gives Groves an advantage over gradient boosting, which relies mostly on additivity. Gradient boosting usually works best with small trees and fails to make effective use of large trees. At the same time most data sets, even non-linear ones, still have significant additive structure. The ability to detect and model this additivity gives Groves an advantage over bagging, which is effective with large trees but does not explicitly model additive structure.

Gradient boosting is a state-of-the-art ensemble tree method for regression. Chipman et al. [11] recently performed an extensive comparison of several algorithms on 42 data sets. In their experiments gradient boosting showed performance similar to or better than Random Forests and a number of other types of models. Our algorithm shows performance consistently better than gradient boosting, and for this reason we do not expect that Random Forests or other methods that are not superior to gradient boosting would outperform our bagged Groves.

In terms of computational cost, bagged Groves and boosting are comparable. In both cases a large number of tree models has to be trained (more for Groves) and there is a variety of parameter combinations that need to be examined (more for boosting).

4 Bagging Iterations and Overfitting Resistance

In our experiments we used a fixed number of bagging iterations and did not consider this a tuning parameter, because bagging rarely overfits. In bagging the number of iterations is not as crucial as it is for boosting: if we bag as long as we can afford, we will get the best value that we can achieve. In that sense the experimental results we report are conservative, and bagged Groves could potentially be improved by additional bagging iterations.

We observed a similar trend for the parameters α and N as well: more complex models (larger trees, more trees) are at least as good as their less complex counterparts, but only if they are bagged sufficiently many times. Figure 8 shows how the performance on the synthetic data set with noise depends on the number of bagging iterations for two bagged Groves. The simpler one is trained with N = 5 and a larger α, and the more complex one is trained with N = 10 and α = 0. We can see that eventually they converge to the same performance, and that the simpler model does better than the complex model only when the number of bagging iterations is small. (Note that this is only true because of the layered approach to training Groves, which trains Groves of trees of smaller size before moving on to Groves with larger trees. If one initialized a Grove with a single large tree, the performance of bagged Groves might still decrease with increasing tree size, because the ability of the Grove to learn the additive structure of the problem would be injured.) We observed similar behavior for the other data sets.

This suggests that one way to get good performance with bagged Groves might be to build the most complex Groves (large trees, many trees) that can be afforded and bag them many, many times until performance tops out. In this case we might not need a validation set to select the best parameter settings. However, in practice the most complex models can require many more iterations of bagging than simpler models that achieve almost the same level of performance much faster. Hence the approach used in our experiments can be more useful in practice: select a computationally acceptable number of bagging iterations (100 seems to work fine, but one could also use 200 or 500 to be more confident) and search for the best N and α for this number of bagging iterations on the validation set.
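Curves like those in Figure 8 come from evaluating the running average of the per-bag predictions; a small sketch (the array layout is our assumption):

    import numpy as np

    def rmse_vs_bags(grove_preds, y_test):
        # grove_preds: shape (n_bags, n_test), one row of test predictions
        # per Grove trained on its own bootstrap sample.  Returns the test
        # RMSE of the ensemble truncated to the first k bags, k = 1..n_bags.
        k = np.arange(1, len(grove_preds) + 1)[:, None]
        running_mean = np.cumsum(grove_preds, axis=0) / k
        return np.sqrt(np.mean((running_mean - y_test) ** 2, axis=1))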

5 Conclusion

We presented a new regression algorithm, bagged Groves of trees, which is an additive ensemble of regression trees. It combines the benefits of large trees that model complex interactions with the benefits of capturing additive structure by means of additive models. Because of this, bagged Groves perform especially well on complex non-linear data sets where the structure of the response function contains both additive structure (which is best modeled by additive trees) and variable interactions (which are best modeled within a tree). We have shown that on such data sets bagged Groves outperform state-of-the-art techniques such as stochastic gradient boosting and bagging. Thanks to bagging, and the layered way in which Groves are trained, bagged Groves resist overfitting: more complex Groves tend to achieve the same or better performance as simpler Groves.

Groves are good at capturing the additive structure of the response function. A future direction of our work is to develop techniques for determining properties inherent in the data using this algorithm. In particular, we believe we can use Groves to learn useful information about statistical interactions between variables in the data set.

References

1. Hastie, T., Tibshirani, R., Friedman, J.H.: The Elements of Statistical Learning. Springer, Heidelberg (2001)
2. Breiman, L.: Bagging Predictors. Machine Learning 24, 123-140 (1996)
3. Hooker, G.: Discovering Additive Structure in Black Box Functions. In: Proc. ACM SIGKDD, ACM Press, New York (2004)
4. Bylander, T.: Estimating Generalization Error on Two-Class Datasets Using Out-of-Bag Estimates. Machine Learning 48(1-3), 287-297 (2002)
5. Friedman, J.: Greedy Function Approximation: a Gradient Boosting Machine. Annals of Statistics 29, 1189-1232 (2001)
6. Friedman, J.: Stochastic Gradient Boosting. Computational Statistics and Data Analysis 38, 367-378 (2002)
7. Torgo, L.: Regression DataSets, ltorgo/regression/DataSets.html
8. Rasmussen, C.E., Neal, R.M., Hinton, G., van Camp, D., Revow, M., Ghahramani, Z., Kustra, R., Tibshirani, R.: Delve. University of Toronto, delve
9. Meyer, M., Vlachos, P.: StatLib. Department of Statistics at Carnegie Mellon University
10. Camacho, R.: Inducing Models of Human Control Skills. In: Nédellec, C., Rouveirol, C. (eds.) Machine Learning: ECML-98. LNCS, vol. 1398, Springer, Heidelberg (1998)
11. Chipman, H., George, E., McCulloch, R.: Bayesian Ensemble Learning. In: Advances in Neural Information Processing Systems 19 (2007)
