Predicting the Density of Algae Communities using Local Regression Trees
Luís Torgo
LIACC Machine Learning Group / FEP, University of Porto
R. Campo Alegre, 823
Phone: , Fax: , e-mail: ltorgo@ncc.up.pt, WWW:

ABSTRACT: This paper describes the application of local regression trees to an environmental regression task. This task was part of the 3rd International ERUDIT Competition. The technique described in this paper was announced as one of the three runner-up methods by the jury of the competition. We briefly describe RT, a system that is able to generate local regression trees. We emphasise the multi-strategy features of RT, which we claim are one of the causes of the obtained performance. We describe the pre-processing steps that were taken in order to apply RT to the competition data, highlighting the weighting schema that was used to combine the predictions of the best RT variants. We conclude by reinforcing the idea that combining features of different data analysis techniques can be useful to obtain higher predictive accuracy.

KEYWORDS: Regression trees, local modelling, hybrid methods, local regression trees.

INTRODUCTION

This paper describes an application of local regression trees (Torgo, 1997a, 1997b, 1999) to an environmental data analysis task. This application was carried out in the context of the 3rd International Competition organised by ERUDIT in conjunction with the new Computational Intelligence and Learning Cluster. This cluster is a cooperation between four EC-funded Networks of Excellence: ERUDIT, EvoNet, MLnet and NEuroNet. This paper describes the use of the system RT (Torgo, 1999) in the context of this competition. RT is a computer program that is able to obtain local regression trees. In this paper we briefly describe the main ideas behind local regression trees, and then focus on the steps taken to obtain a solution to this data analysis task.
The data analysis task concerns the environmental problem of determining the state of rivers and streams by monitoring and analysing certain measurable chemical concentrations, with the goal of inferring the biological state of the river, namely the density of algae communities. This study is motivated by the increasing concern with the impact human activities are having on the environment. Identifying the key chemical control variables that influence the biological processes associated with these algae has become a crucial task in order to reduce the impact of human activities. The data used in this 3rd ERUDIT international competition comes from such a study. Water quality samples were collected from various European rivers during one year. These samples were analysed for various chemical substances including: nitrogen in the form of nitrates, nitrites and ammonia, phosphate, pH, oxygen and chloride. At the same time, algae samples were collected to determine the distributions of the algae populations. The dynamics of algae communities is strongly influenced by the external chemical environment. Determining which chemical factors most influence these dynamics is important knowledge that can be used to control these populations. At the same time there is an economic factor that further motivates this analysis. In effect, the chemical analysis is cheap and easily automated. On the contrary, the biological part involves microscopic examination, requires trained manpower and is therefore both expensive and slow. The competition task consisted of predicting the frequency distribution of seven different algae on the basis of eight measured concentrations of chemical substances plus some additional information characterising the environment from which the sample was taken (season, river size and flow velocity).

RT is a regression analysis tool that can be seen as a kind of multi-strategy data analysis system. We will see that this characteristic is a consequence of the theory behind local regression trees. In effect, these models integrate regression trees (e.g. Breiman et al., 1984) with local modelling (e.g. Cleveland and Loader, 1995). The integration schema used in RT allows several variants to be tried out on a given data set. Due to this flexibility, RT can cope with problems having different characteristics, thanks to its ability to emulate techniques with different approximation biases. In this competition we have taken advantage of this facet of RT. We have treated the prediction of each of the seven algae frequencies as a different regression task. For each of the seven regression problems we have carried out a selection process with the aim of estimating which RT variant provided the best accuracy.

The following section provides a brief description of local regression trees. We describe the main components integrating these hybrid regression models and the method that was used to integrate them within RT. We then focus on the technical details concerning the application of RT to the task of predicting the density of algae communities.

LOCAL REGRESSION TREES

Local Regression Trees (Torgo, 1997a, 1997b, 1999) explore the possibility of improving the accuracy of regression trees by using smoother models at the tree leaves. These models can be regarded as a hybrid approach to multivariate regression, integrating different solutions to this data analysis problem. Local regression (e.g. Cleveland and Loader, 1995) is a non-parametric statistical methodology that provides smooth modelling by not assuming any particular global form of the unknown regression function. On the contrary, these models fit a functional form within the neighbourhood of the query points. These models are known to provide highly accurate predictions over a wide range of problems due to the absence of a pre-defined functional form. However, local regression techniques are also known for their computational cost, low interpretability and storage requirements. Regression trees (e.g. Breiman et al., 1984), on the other hand, obtain models through a recursive partitioning algorithm that ensures high computational efficiency. Moreover, the resulting models are usually considered highly interpretable.
By integrating regression trees with local modelling, we not only improve the accuracy of the trees, but also increase the computational efficiency and comprehensibility of local models. In this section we provide a brief description of local regression trees. We start by describing both standard regression trees and local modelling. We then address the issue of how these two methodologies are integrated in our RT regression tool.

STANDARD REGRESSION TREES

A regression tree can be seen as a kind of additive model (Hastie & Tibshirani, 1990) of the form

    m(x) = \sum_{i=1}^{l} k_i \cdot I(x \in D_i)    (1)

where the k_i are constants; I(.) is an indicator function returning 1 if its argument is true and 0 otherwise; and the D_i are disjoint partitions of the training data D such that \bigcup_{i=1}^{l} D_i = D and D_i \cap D_j = \emptyset for i \neq j.

Models of this type are sometimes called piecewise constant regression models, as they partition the predictor space¹ χ into a set of mutually exclusive regions and fit a constant value within each region. An important aspect of tree-based regression models is that they provide a propositional logic representation of these regions in the form of a tree. Each path from the root of the tree to a leaf corresponds to such a region. Each inner node² of the tree is a logical test on a predictor variable³. In the particular case of binary trees there are two possible outcomes of the test, true or false. This means that associated with each partition D_i we have a path P_i consisting of a conjunction of logical tests on the predictor variables. This symbolic representation of the regression surface is an important issue when one wants to have a better understanding of the problem under consideration. For instance, consider the following small example of a regression tree obtained by applying RT to the competition data to obtain a model for one of the algae:

    P1: Ch7 ≤ 43.8, with k1 = 39.8
    P2: Ch7 > 43.8 ∧ Ch3 ≤ 7.16, with k2 = 50.1
    P3: Ch7 > 43.8 ∧ Ch3 > 7.16 ∧ Ch6 ≤ 51.1, with k3 = 12.9
    P4: Ch7 > 43.8 ∧ Ch3 > 7.16 ∧ Ch6 > 51.1, with k4 = 3.85

Figure 1: An Example of a Regression Tree (shown as the logical description of its four leaf regions; the leaf constants were reported with standard errors, e.g. 39.8 ± 5.9 and 50.1 ± 12.2).

As there are four distinct paths from the root node to the leaves, this tree divides the input space into four different regions. The conjunction of the tests in each path can be regarded as a logical description of such regions, as shown above. This tree can be used to make predictions of the density of this alga for future water samples. Moreover, it provides information on which variables most influence this density.

Regression trees are constructed using a recursive partitioning (RP) algorithm. This algorithm builds a tree by recursively splitting the training sample into smaller subsets. We give below a high-level description of the algorithm. The RP algorithm receives as input a set of n data points, D_t = {⟨x_i, y_i⟩}_{i=1}^{n}, and if certain termination criteria are not met it generates a test node t, whose branches are obtained by applying the same algorithm to two subsets of the input data points. These subsets consist of the cases that logically entail the test s* in the node, D_{t_L} = {⟨x_i, y_i⟩ ∈ D_t : x_i → s*}, and the remaining cases, D_{t_R} = {⟨x_i, y_i⟩ ∈ D_t : x_i ↛ s*}. At each node the best split test is chosen according to some local criterion, which means that this is a greedy hill-climbing algorithm.

¹ The multidimensional space formed by all input (or predictor) variables of a multivariate regression problem.
² All nodes except the leaves.
³ Although work exists on multivariate tests (e.g. Breiman et al., 1984; Murthy et al., 1994; Brodley & Utgoff, 1995; Gama, 1997).

The Recursive Partitioning Algorithm.
Input: A set of n data points, {⟨x_i, y_i⟩}, i = 1, ..., n
Output: A regression tree

IF termination criterion THEN
    Create Leaf Node and assign it a Constant Value
    Return Leaf Node
ELSE
    Find Best Splitting Test s*
    Create Node t with s*
    Left_branch(t)  = RecursivePartitioningAlgorithm({⟨x_i, y_i⟩ : x_i → s*})
    Right_branch(t) = RecursivePartitioningAlgorithm({⟨x_i, y_i⟩ : x_i ↛ s*})
    Return Node t
ENDIF

The algorithm has three main components:
- A way to select a split test (the splitting rule).
- A rule to determine when a tree node is terminal (termination criterion).
- A rule for assigning a value to each terminal node.
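The pseudocode above can be rendered as a runnable sketch. This is an illustrative implementation, not RT itself: it uses the mean as the leaf constant, the squared-error splitting rule discussed below, and a minimum-cases termination criterion; all names (build_tree, best_split, Leaf, Node) are our own.

```python
# Minimal recursive partitioning sketch: numeric predictors only,
# mean at the leaves, splits chosen by the squared-error criterion.
from dataclasses import dataclass
from statistics import mean

@dataclass
class Leaf:
    value: float          # the constant k_i assigned to this region

@dataclass
class Node:
    var: int              # index of the predictor tested at this node
    cut: float            # cases with x[var] <= cut go to the left branch
    left: object
    right: object

def best_split(data):
    """Return (var, cut) maximising S_L/n_L + S_R/n_R, or None."""
    best, best_score = None, -1.0
    n_vars = len(data[0][0])
    for v in range(n_vars):
        data.sort(key=lambda c: c[0][v])
        ys = [y for _, y in data]
        s_left, s_total = 0.0, sum(ys)
        for i in range(len(ys) - 1):          # candidate cut after position i
            s_left += ys[i]
            s_right = s_total - s_left
            score = s_left**2 / (i + 1) + s_right**2 / (len(ys) - i - 1)
            # only split between distinct predictor values
            if score > best_score and data[i][0][v] < data[i + 1][0][v]:
                best_score = score
                best = (v, (data[i][0][v] + data[i + 1][0][v]) / 2)
    return best

def build_tree(data, min_cases=5):
    """data: list of (x, y) pairs, x a list of numeric predictor values."""
    ys = [y for _, y in data]
    if len(data) < min_cases or len(set(ys)) == 1:
        return Leaf(mean(ys))                 # termination criterion
    split = best_split(data)
    if split is None:                         # no valid split exists
        return Leaf(mean(ys))
    v, cut = split
    left = [c for c in data if c[0][v] <= cut]
    right = [c for c in data if c[0][v] > cut]
    return Node(v, cut, build_tree(left, min_cases), build_tree(right, min_cases))

def predict(tree, x):
    """Drop a case down the tree until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.left if x[tree.var] <= tree.cut else tree.right
    return tree.value
```

A call such as `build_tree(cases, min_cases=5)` returns a piecewise constant model in the sense of Equation (1): each leaf is one region D_i with constant k_i.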
The answers to these three problems are related to the error criterion that is selected to guide the growth of the tree. The most common choice is the minimisation of the mean squared error (MSE). Using this criterion, the constant that should be used in the leaves of the trees (terminal nodes) is the average goal variable value of the cases that fall in each leaf. Moreover, this criterion leads to the following rule for determining the best split of each test node (Torgo, 1999): the best split s* is the split that maximises the expression

    \frac{S_L}{n_{t_L}} + \frac{S_R}{n_{t_R}}    (2)

where S_L = \left( \sum_{\langle x_i, y_i \rangle \in D_{t_L}} y_i \right)^2, S_R = \left( \sum_{\langle x_i, y_i \rangle \in D_{t_R}} y_i \right)^2, and n_{t_L}, n_{t_R} are the numbers of cases falling in the left and right branches.

This rule leads to fast incremental updating algorithms that allow efficient search for the best test of each node (Torgo, 1999). Finally, the termination criterion is related to the issue of reliable error estimation. Methodologies like regression trees that have a rich hypothesis (model) language incur the danger of overfitting the training data. This in turn leads to unnecessarily large models that usually have poor generalisation performance (i.e. lower predictive accuracy on new samples of the same problem). Overfitting avoidance in regression trees is usually achieved through a process of pruning unreliable branches of an overly large tree (e.g. Breiman et al., 1984). Many pruning techniques exist (see Torgo, 1999 for an overview), most of them relying on reliable estimation of the predictive error of the trees (Torgo, 1998). Regression trees achieve competitive predictive accuracy in a wide range of applications. Moreover, their computational efficiency allows dealing with problems with hundreds of variables and hundreds of thousands of cases. Still, regression trees provide a highly non-smooth function approximation with strong discontinuities⁴.

LOCAL MODELLING

Local modelling⁵ (Fan, 1995) belongs to a data analytic methodology whose basic idea consists of obtaining the prediction for a data point x by fitting a parametric function in the neighbourhood of x. This means that these methods are locally parametric, as opposed to, for instance, least squares linear regression.
Moreover, these methods do not produce a visible model of the data. Instead they make predictions based on local models generated on a query point basis. According to Cleveland and Loader (1995), local regression traces back to the 19th century. These authors provide a historical survey of the work done since then. The modern work on local modelling starts in the 1950s with the kernel methods introduced within the probability density estimation setting (Rosenblatt, 1956; Parzen, 1962) and within the regression setting (Nadaraya, 1964; Watson, 1964). Local polynomial regression (Stone, 1977; Cleveland, 1979; Katkovnik, 1979) is a generalisation of this early work on kernel regression. In effect, kernel regression amounts to fitting a polynomial of degree zero (a constant) in a neighbourhood. Summarising, we can state the general goal of local regression as trying to fit a polynomial of degree p around a query point (or test case) q using the training data in its neighbourhood. This includes the various available settings like kernel regression (p = 0), local linear regression (p = 1), etc.⁶

Most studies on local modelling are done for the univariate case. Still, the framework is applicable to the multivariate case and has been used with success in some domains (Atkeson et al., 1997; Moore et al., 1997). However, several authors warn of the danger of applying these methods in higher input space dimensions⁷ (Härdle, 1990; Hastie & Tibshirani, 1990). The problem is that with a high number of variables the training cases are so sparse that the notion of local neighbourhood can hardly be seen as local. Another drawback of local modelling is the complete lack of interpretability of the models. No interpretable model of the training data is obtained. With some simulation work one may obtain a graphical picture of the approximation provided by local regression models, but this is only possible with a low number of input variables.

⁴ This latter characteristic can be seen as either advantageous or disadvantageous, depending on the application.
⁵ Also known as non-parametric smoothing and local regression.
⁶ A further generalisation of this set-up consists of using polynomial mixing (Cleveland & Loader, 1995), where p can take non-integer values.
⁷ The so-called curse of dimensionality (Bellman, 1961).
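As a concrete illustration of this setting, the p = 0 case (kernel regression) can be sketched for a single predictor as follows. The Gaussian kernel and the fixed bandwidth h are illustrative choices on our part, not the ones used in RT.

```python
# Nadaraya-Watson kernel regression sketch: the prediction at a query
# point q is a weighted average of the training responses, with weights
# given by a kernel function of the distance to q.
import math

def kernel_predict(train, q, h=1.0):
    """train: list of (x, y) pairs with a single numeric predictor x;
    q: query point; h: bandwidth of the Gaussian kernel."""
    weights = [math.exp(-((x - q) / h) ** 2) for x, _ in train]
    return sum(w * y for w, (_, y) in zip(weights, train)) / sum(weights)
```

Shrinking h makes the fit more local (and more variable); growing h towards infinity makes every training case contribute equally, which is exactly the bandwidth question discussed below.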
In spite of being considered a non-parametric regression technique, local modelling does have several parameters that must be tuned in order to obtain good predictive results. One of the most important is the notion of neighbourhood. Given a query point q, we need to decide which training cases will be used to fit a local polynomial around the query point. This involves defining a distance metric over the multidimensional space defined by the input variables. With this metric we can specify a distance function that allows finding the nearest training cases of any query point. Still, many open issues remain. Namely, the weighting of the variables within the distance calculation can be crucial in domains with less relevant variables. Moreover, we need to specify how many training cases will enter the local fit (usually known as the bandwidth selection problem). Even after having a bandwidth size specification, we need to weigh the contributions of the training cases within the bandwidth. Nearer points should contribute more to the local fit. This is usually accomplished through a weighting function that takes the distance to the query point into account (known as the kernel function). The correct tuning of all these modelling parameters can be crucial for a successful use of local modelling. Local modelling provides smooth function approximation with wide applicability through correct tuning of the distance function parameters. Still, this is a computationally intensive method that does not produce a comprehensible model of the data. Moreover, these methods have difficulties in dealing with domains with strong discontinuities in the function being approximated.

INTEGRATING REGRESSION TREES WITH LOCAL MODELLING

In this section we describe how we propose to overcome some of the limitations of both regression trees and local modelling through an integration of both methods, resulting in what we refer to as local regression trees. The main goals of this integration can be stated as follows:
- Improve standard regression trees' smoothness, leading to superior predictive accuracy.
- Improve local modelling in the following aspects: computational efficiency; generation of models that are comprehensible to human users; capability to deal with domains with strong discontinuities.

In our study of local regression trees we have considered the integration of the following local modelling techniques:
- Kernel models (Watson, 1964; Nadaraya, 1964). These models amount to fitting a polynomial of degree zero (i.e. a constant) in the local neighbourhood.
- Local linear polynomials (Stone, 1977; Cleveland, 1979; Katkovnik, 1979), where we fit a linear polynomial within the neighbourhood.
- Semi-parametric models, or partial linear models (Spiegelman, 1976; Härdle, 1990), consisting of fitting a standard least squares linear model on all data plus a kernel model component on the residuals of the linear polynomial, which acts as a local correction of the global linearity assumption of the polynomial.

The main decisions concerning the integration of local models with regression trees are how, and when, to perform it. Three main alternatives exist:
- Assume the use of local models in the leaves during all tree induction (i.e. the growth and pruning phases).
- Grow a standard regression tree and use local models only during the pruning stage.
- Grow and prune a standard regression tree; use local models only in prediction tasks.

The first of these alternatives is more consistent from a theoretical point of view, as the choice of the best split depends on the models at the leaves. This is the approach followed in RETIS (Karalic, 1992), which integrates global least squares linear polynomials in the tree leaves. However, obtaining such models is a computationally demanding task. For each trial split the left and right child models need to be obtained and their error calculated. Even with a model as simple as the average, the evaluation of all candidate splits is too heavy without an efficient incremental algorithm. This means that if more complex models are to be used, like linear polynomials, this task is practically unfeasible for large domains⁸.
The experiments described by Karalic (1992) used data sets with a few hundred cases. The author does not provide any results concerning the computational penalty of using linear models instead of averages. Still, we claim that when using more complex models, like kernel regression, this approach is not feasible if we want to achieve a reasonable computation time.

⁸ Particularly with a large number of continuous variables.

The second alternative is to introduce the more complex models only during the pruning stage. The tree is grown (i.e. the splits are chosen) assuming averages in the leaves. Only during the pruning stage do we consider that the leaves will contain more complex models, which entails obtaining them for each node of the grown tree. Notice that this is much more efficient than the first alternative mentioned before, which involved obtaining the models for each trial split considered during tree growth. This is the approach followed in M5 (Quinlan, 1992). This system also uses global least squares linear polynomials in the tree leaves, but these models are only added during the pruning stage. We have access to a version of M5⁹ and we have confirmed that this is a computationally feasible solution even for large problems.

In our integration of regression trees with local modelling we have followed the third alternative. In this approach the learning process is separated from prediction tasks. We generate the regression trees using the standard methodology described previously. If we want to use the learned tree to make predictions for a set of unseen test cases, we can choose which model should be used in the tree leaves. These can include complex models like kernels or local linear polynomials. Using this approach we only have to fit as many models as there are leaves in the final pruned tree. The main advantage of this approach is its computational efficiency. However, it also allows trying several alternative models without having to re-learn the tree. Using this approach the initial tree can be regarded as a kind of rough approximation of the regression surface which is comprehensible to the human user. On top of this rough surface we may fit smoother models for each data partition generated in the leaves of the regression tree, so as to increase the predictive accuracy.
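This third alternative can be sketched as follows. To keep the sketch self-contained, the pruned tree is reduced to a hand-written routing function with two leaves, and the leaf model is a kernel-weighted average; both are illustrative stand-ins for RT's internals, and the names (leaf_of, local_tree_predict) are our own.

```python
# Prediction-time integration sketch: the tree only routes the query to a
# leaf; a smoother local model is then fitted on that leaf's cases only.
import math

def leaf_of(x):
    # stand-in for dropping x down a pruned tree: two leaves, split at 0.5
    return 0 if x <= 0.5 else 1

def local_tree_predict(train, q, h=0.2):
    """train: list of (x, y) pairs; q: query point; h: kernel bandwidth."""
    # 1. gather the training cases that fall in the query's leaf
    leaf = leaf_of(q)
    cases = [(x, y) for x, y in train if leaf_of(x) == leaf]
    # 2. fit a local model (here a kernel average) on those cases only
    w = [math.exp(-((x - q) / h) ** 2) for x, _ in cases]
    return sum(wi * y for wi, (_, y) in zip(w, cases)) / sum(w)
```

Because the local model is fitted per query inside one leaf, changing it (kernel, k nearest neighbours, local linear polynomial) requires no re-learning of the tree, which is the flexibility exploited below.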
In Torgo (1999) a large experimental evaluation of local regression trees over a wide range of problems was carried out. This large set of experimental comparisons confirmed that local regression trees are significantly more accurate than standard regression trees. However, these new regression models are computationally more demanding and less comprehensible than standard regression trees. We have also observed that local regression trees can overcome some of the limitations of local modelling techniques, particularly their lack of comprehensibility and rather high processing time. These experiments have shown that local regression trees are significantly faster than local modelling techniques. Moreover, through the integration within a tree-based structure we obtain a comprehensible insight into the approximation of these models. Finally, we have also observed that the modelling bias resulting from combining local models and partition-based approaches improves the accuracy of local models in domains where there are strong discontinuities in the regression surface.

PREDICTING THE DENSITY OF ALGAE COMMUNITIES

In this section we describe the steps followed in the application of RT in the context of the 3rd International ERUDIT Competition. As we have mentioned, this competition consisted of a data analysis task concerning the environmental problem of determining the state of rivers and streams by monitoring and analysing certain measurable chemical concentrations. The goal of the task was to obtain a model that should allow making predictions concerning the frequency distribution of seven algae. With this purpose the organisers delivered two data files to each competitor. The first contained the training data, consisting of 200 samples described by 11 input variables plus the observed algae frequency distributions. This training file was presented as a matrix with 200 rows and 18 (11+7) columns. Of the 11 input variables (labelled A, ..., K), three were categorical, namely the season, the river size and the flow velocity. All remaining input variables were numeric.
Several river samples contained unknown input variable values (signalled by XXXX). The competitors were also given a second file containing the test data, consisting of another 140 river samples. The competitors were only given the values of the 11 input variables of these test samples. There were also unknown variable values in the test cases. Thus the test data set corresponded to a matrix of 140 rows and 11 columns of values. The goal of the competition was to build a model based on the training data that could provide predictions of the frequency distributions of the 7 algae (labelled a, ..., g) for the 140 test samples, whose true values were only known to the organisers. The competitors were to present their results as a matrix of predictions. The solutions proposed by the competitors were evaluated by the sum of the total squared errors between their predictions and the true values.

⁹ Version 5.1.
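One of the pre-processing steps described below fills in these unknown values from the most similar complete cases. A minimal sketch of such a scheme (the function name and data layout are our own; the solution below used the 10 most similar cases):

```python
# Nearest-neighbour median imputation sketch: find the k complete cases
# closest to the incomplete one (Euclidean distance over the known
# variables) and impute the median of their values for the missing variable.
import math
from statistics import median

def impute(cases, target_idx, query, k=10):
    """cases: complete numeric rows; query: a row with None at target_idx."""
    dists = []
    for row in cases:
        d = math.sqrt(sum((q - r) ** 2
                          for j, (q, r) in enumerate(zip(query, row))
                          if j != target_idx and q is not None))
        dists.append((d, row[target_idx]))
    dists.sort(key=lambda t: t[0])
    return median(v for _, v in dists[:k])
```

In practice the numeric variables would also be normalised first, so that no single variable dominates the distance; that step is omitted here for brevity.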
PROBLEM SOLUTION USING RT

This section briefly describes the steps taken to produce a matrix of predictions based on models obtained with RT. The first step that we carried out was to divide the given training data into seven different training files, one for each of the 7 different algae. This means that we dealt with this problem as seven different regression tasks. For each of these seven training sets all input variable values were the same, and only the target variable changed.

The second step of our proposed solution consisted of filling in the unknown variable values. For each training case with an unknown variable value we searched for the 10 most similar cases. This similarity was assessed using a Euclidean distance metric. The median value of the variable in these 10 cases was used to fill in the unknown variable value.

The methodology used in RT to integrate local models with regression trees enables this system to emulate a large number of regression techniques through simple parameter settings. In effect, as the predictions of the local models used in the leaves are obtained completely independently of the tree growth phase, we can very easily try different variants of local regression. Moreover, we can even use these models on all the training data, which corresponds to the original local models. This latter alternative is easily accomplished by pruning the initial tree all the way back to a single leaf. This means that RT can emulate several regression techniques like standard regression trees, local regression trees and local modelling techniques. Moreover, several of the implemented local modelling techniques can be used to emulate other regression models. For instance, a local linear polynomial can behave like a standard least squares linear regression model through appropriate parameter tuning¹⁰. In summary, RT's hybrid character can be used to obtain the following types of models:
- Standard least squares (LS) regression trees.
- Least absolute deviation (LAD) regression trees.
- Regression trees (LS or LAD) with least squares linear polynomials in the leaves.
- Regression trees (LS or LAD) with kernel models in the leaves.
- Regression trees (LS or LAD) with k nearest neighbours in the leaves.
- Regression trees (LS or LAD) with least squares local linear polynomials in the leaves.
- Regression trees (LS or LAD) with least squares partial linear polynomials (models that integrate both parametric and non-parametric components) in the leaves.
- Least squares linear polynomials (with or without backward elimination to simplify the models).
- Kernel models.
- K nearest neighbours.
- Local linear polynomials.
- Partial linear polynomials.

Having so many variants in RT, the next step of our analysis was to try to find out which were the most promising techniques for each of the seven regression tasks. With this purpose we selected a large set of candidate variants of RT and, for each of the seven training sets, carried out the following experiment with the goal of estimating the predictive accuracy (measured by the mean squared error) of each variant. The MSE was estimated using 10-fold cross-validation. Each training set was randomly divided into 10 approximately equally sized folds. For each fold, a model was built with the remaining nine and its prediction error calculated on the fold. This process was repeated for the 10 different folds and the prediction errors were averaged. This 10-fold cross-validation process was repeated 10 times with 10 different random permutations of each training set. The final score of each variant was obtained by averaging over these 10 repetitions. For each of the seven algae the most promising variants of RT were collected, together with their estimated prediction error. This model selection stage immediately revealed that the seven regression tasks posed quite different challenges. In effect, there was a large diversity in the techniques estimated as most promising, depending on the alga under consideration.
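The estimation procedure just described can be written generically as a sketch, where fit and predict stand for any candidate variant (the function and parameter names are our own):

```python
# Repeated k-fold cross-validation sketch: estimate a variant's MSE by
# averaging fold errors over several random permutations of the data.
import random

def cv_mse(data, fit, predict, k=10, repeats=10, seed=0):
    """data: list of (x, y) pairs; fit(train) -> model;
    predict(model, x) -> prediction. Returns the averaged MSE estimate."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        perm = data[:]
        rng.shuffle(perm)                       # a fresh random permutation
        folds = [perm[i::k] for i in range(k)]  # k approximately equal folds
        sq_errs = []
        for i in range(k):
            test = folds[i]
            train = [c for j in range(k) if j != i for c in folds[j]]
            model = fit(train)                  # build on the other k-1 folds
            sq_errs += [(predict(model, x) - y) ** 2 for x, y in test]
        scores.append(sum(sq_errs) / len(sq_errs))
    return sum(scores) / repeats                # average over the repetitions
```

Running this once per candidate variant and per training set yields the per-alga ranking of variants used in the selection stage.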
For each of the seven algae training sets we used the respective most promising RT variants to obtain a model. These models were then applied to the given 140 test cases and their predictions collected. The final predictions that were submitted as our solution were obtained by a weighted average of these predictions. To better illustrate this weighting schema, imagine that for alga a the most promising RT variants and their estimated predictive accuracies were:

    M_a, Err_{M_a}; M_b, Err_{M_b}; M_c, Err_{M_c}; M_d, Err_{M_d}

The final prediction for a test case x would be calculated as

    y = \frac{ M_a(x)/Err_{M_a} + M_b(x)/Err_{M_b} + M_c(x)/Err_{M_c} + M_d(x)/Err_{M_d} }{ 1/Err_{M_a} + 1/Err_{M_b} + 1/Err_{M_c} + 1/Err_{M_d} }

where x is a test case; y is the prediction for case x; and M_a(x) is the prediction of model M_a for case x.

¹⁰ In this example it is enough to use an infinite bandwidth (meaning that all training points contribute to the model) and a uniform weighting function (which gives equal weight to all training points).

DISCUSSION

The solution obtained through the process described above was declared by the jury of the ERUDIT competition to be one of the three runners-up. After the announcement of the results the organisation provided the true values of the test samples, and this enabled us to analyse in more detail the performance of our proposed data analysis technique. The total sum of squared errors of our solution corresponds to an MSE of 87.020. At the time of writing the scores of the other competitors are not known, so it is a bit hard to interpret the value of these numbers. Still, we have collected other statistics that could provide further insight into the quality of the solution. We have calculated the mean absolute deviation (MAD) of the predictions of RT¹¹. The following figure showed the minimum MAD, the maximum MAD and an interval (represented by a box) between MAD ± standard error, for all seven algae (a, ..., g).

Figure 2: The Scores of our Solution in terms of Absolute Error.

From this figure it is clear that the best results are obtained when predicting the frequency of algae c, d and g, while the worst predictions occur with algae a and f. Still, the overall mean absolute error is relatively low¹².

¹¹ This statistic is more meaningful as it is on the same scale as the goal variable.
¹² The values of the algae distributions range from 0 to 100 (although the maximum value in the training samples was 89.8, for alga a).
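A sketch of one plausible reading of this weighting schema, in which each variant's prediction is weighted by the inverse of its estimated error, so that more accurate variants contribute more (the function name is ours):

```python
# Inverse-error weighted combination sketch for the predictions of the
# selected variants on a single test case.
def combine(preds, errs):
    """preds: predictions of the selected variants for one test case;
    errs: their estimated errors (e.g. cross-validated MSEs)."""
    inv = [1.0 / e for e in errs]               # weight ~ 1 / estimated error
    return sum(p * w for p, w in zip(preds, inv)) / sum(inv)
```

With equal estimated errors this reduces to a plain average; a variant with a lower estimated error pulls the combined prediction towards its own.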
With the true goal variable values we were able to confirm that the ranking of RT variants obtained through repeated 10-fold cross-validation was correct most of the time. We have also carried out some initial experiments to try to identify the main causes of the good performance of our solution. These experiments clearly show that the availability of different variants was highly beneficial, because worse results are obtained when applying the same variant to all seven regression tasks. As an example, while our multi-strategy weighted solution achieved an overall MSE of 87.020 on all seven tasks, using a single variant, the overall best RT variant of Torgo (1999) (local regression trees with partial linear models in the leaves), led to a higher MSE. Moreover, the weighting schema also brings some accuracy gains when compared to the alternative of always selecting the prediction of the best estimated variant for each task¹³. We think that using more sophisticated model combination methodologies like boosting (Freund, 1995) or bagging (Breiman, 1996) should provide even more interesting accuracy results.

CONCLUSIONS

We have described the application of a regression analysis tool named RT to environmental data used in the 3rd International ERUDIT Competition. RT implements a new regression methodology called local regression trees. This technique can be regarded as a combination of two existing methodologies: regression trees and local modelling. We have described the integration schema used in RT to obtain local regression trees. We have shown how this schema is important in obtaining a tool that is able to approximate quite different regression surfaces. The results obtained by RT in the competition provide further confidence in the correctness of our integration schema. The methodology used to obtain a solution for the competition was based on the assumption that using different variants of RT on each of the seven regression tasks of the competition would provide better results than always using the same regression methodology. Our experiments and the results of the competition confirmed this hypothesis.
Moreover, averagng over the best varants dd also mprove the accuracy of the resultng predctons. Ths applcaton provdes further evdence towards the advantages of systems ncorporatng multple data analyss technques, and also towards the accuracy advantages of combnng predctons of dfferent models. REFERENCES Atkeson,C.G., Moore,A.W., Schaal,S., 1997, Locally Weghted Learnng, Artfcal Intellgence Revew, 11, Specal ssue on lazy learnng, Aha, D. (Ed.). Bellman,R., 1961, Adaptve Control Processes. Prnceton Unversty Press. Breman,L., Fredman,J.H., Olshen,R.A. & Stone,C.J., 1984, Classfcaton and Regresson Trees, Wadsworth Int. Group, Belmont, Calforna, USA. Breman,L., 1996, Baggng Predctors, Machne Learnng, 24, (p ). Kluwer Academc Publshers. Broadley,C., Utgoff,P., 1995, Multvarate decson trees, Machne Learnng, 19, Kluwer Academc Publshers. Cleveland, W., 1979, Robust locally weghted regresson and smoothng scatterplots, Journal of the Amercan Statstcal Assocaton, 74, Cleveland,W., Loader,C., 1995, Smoothng by Local Regresson: Prncples and Methods (wth dscusson), n Computatonal Statstcs. Fan,J., 1995, Local Modellng, n Encyclopdea of Statstcal Scence. Freund,Y., 1995, Boostng a weak learnng algorthm by majorty, n Informaton and Computaton, 121 (2), Gama,J., 1997, Probablstc lnear tree, n Proceedngs of the 14 th Internatonal Conference on Machne Learnng, Fsher,D. (ed.). Morgan Kaufmann. Hardle,W., 1990, Appled Nonparametrc Regresson. Cambrdge Unversty Press. 13 Whle the weghed average obtaned a total sum of squared error of (MSE=87.020), usng the predctons of the best varant we get a total of (MSE=89.296).
Hastie,T., Tibshirani,R., 1990, Generalized Additive Models. Chapman & Hall.
Karalic,A., 1992, Employing Linear Regression in Regression Tree Leaves, in Proceedings of ECAI-92. Wiley & Sons.
Katkovnik,V., 1979, Linear and nonlinear methods of nonparametric regression analysis, Soviet Automatic Control, 5.
Moore,A., Schneider,J., Deng,K., 1997, Efficient Locally Weighted Polynomial Regression Predictions, in Proceedings of the 14th International Conference on Machine Learning (ICML-97), Fisher,D. (ed.). Morgan Kaufmann Publishers.
Murthy,S., Kasif,S., Salzberg,S., 1994, A system for induction of oblique decision trees, in Journal of Artificial Intelligence Research, 2.
Nadaraya,E.A., 1964, On estimating regression, Theory of Probability and its Applications, 9.
Parzen,E., 1962, On estimation of a probability density function and mode, Annals of Mathematical Statistics, 33.
Quinlan,J., 1992, Learning with Continuous Classes, in Proceedings of the 5th Australian Joint Conference on Artificial Intelligence. Singapore: World Scientific.
Rosenblatt,M., 1956, Remarks on some nonparametric estimates of a density function, Annals of Mathematical Statistics, 27.
Spiegelman,C., 1976, Two techniques for estimating treatment effects in the presence of hidden variables: adaptive regression and a solution to Reiersol's problem. Ph.D. Thesis. Dept. of Mathematics, Northwestern University.
Stone,C.J., 1977, Consistent nonparametric regression, The Annals of Statistics, 5.
Torgo,L., 1997a, Functional Models for Regression Tree Leaves, in Proceedings of the International Conference on Machine Learning (ICML-97), Fisher,D. (ed.). Morgan Kaufmann Publishers.
Torgo,L., 1997b, Kernel Regression Trees, in Proceedings of the poster papers of the European Conference on Machine Learning (ECML-97), Internal Report of Faculty of Informatics and Statistics, University of Economics, Prague.
Torgo,L., 1998, A Comparative Study of Reliable Error Estimators for Pruning Regression Trees, in Proceedings of the Iberoamerican Conference on Artificial Intelligence, Coelho,H. (ed.), Springer-Verlag.
Torgo,L., 1999, Inductive Learning of Tree-based Regression Models, Ph.D. Thesis, to be presented at the Department of Computer Science, Faculty of Sciences, University of Porto.
Watson,G.S., 1964, Smooth Regression Analysis, Sankhya: The Indian Journal of Statistics, Series A, 26.