Comparison Between Linear and Non-parametric Regression Models for Genome-Enabled Prediction in Wheat

GENOMIC SELECTION

Comparison Between Linear and Non-parametric Regression Models for Genome-Enabled Prediction in Wheat

Paulino Pérez-Rodríguez,*,1 Daniel Gianola, Juan Manuel González-Camacho,* José Crossa, Yann Manès, and Susanne Dreisigacker
*Colegio de Postgraduados, Montecillo, Texcoco 56230, México; Departments of Animal Sciences, Dairy Science, and Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin 53706; and Biometrics and Statistics Unit and Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Mexico, D.F., México

ABSTRACT In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

KEYWORDS: GenPred; shared data resources

Copyright 2012 Pérez-Rodríguez et al.
Manuscript received July 9, 2012; accepted for publication October 5, 2012.
This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Supporting information is available online.
1 Corresponding author: Colegio de Postgraduados, Montecillo, Texcoco 56230, México. E-mail: perpdgo@gmail.com

Genome-enabled prediction of complex traits based on marker data is becoming important in plant and animal breeding, personalized medicine, and evolutionary biology (Meuwissen et al. 2001; Bernardo and Yu 2007; de los Campos et al. 2009, 2010; Crossa et al. 2010, 2011; Ober et al. 2012). In the standard, infinitesimal, pedigree-based model of quantitative genetics, the family structure of a population is reflected in some expected resemblance between relatives. The latter is measured as an expected covariance matrix among individuals and is used to predict genetic values (e.g., Crossa et al. 2006; Burgueño et al. 2007, 2011). Whereas pedigree-based models do not account for Mendelian segregation, and the expected covariance matrix is constructed using assumptions that do not hold (e.g., absence of selection and mutation, and random mating), marker-based models allow tracing Mendelian segregation at several positions of the genome and observing realized (as opposed to expected) covariances. This enhances the potential for improving the accuracy of estimates of genetic values, thus increasing the genetic progress attainable when these predictions are used for selection purposes in lieu of pedigree-based predictions. Recently, de los Campos et al. (2009, 2010) and Crossa et al.
(2010, 2011) used Bayesian estimates from genomic parametric and semi-parametric regressions, and they found that models that incorporate pedigree and markers simultaneously had better prediction accuracy for several traits in wheat and maize than models based only on pedigree or only on markers.

The standard linear genetic model represents the phenotypic response of the i-th individual (y_i) as the sum of a genetic value, g_i, and a model residual, e_i, such that the linear model for n individuals (i = 1, ..., n) is represented as y_i = g_i + e_i. However, building predictive models for complex traits using a large number of molecular markers (p) with a set of lines comprising n individuals, with p ≫ n, is challenging because individual marker effects are not likelihood-identified. In this case, marker effects can be estimated via penalized parametric or semi-parametric methods, or their Bayesian counterparts, rather than via ordinary least squares. This reduces

the mean-squared error of estimates; it also increases prediction accuracy of out-of-sample cases and prevents over-fitting (de los Campos et al. 2010). In addition to the well-known Bayes A and Bayes B linear regression models originally proposed by Meuwissen et al. (2001) for incorporating marker effects into g_i, there are several penalized parametric regression methods for estimating marker effects, such as ridge regression, the least absolute shrinkage and selection operator (LASSO), and the elastic net (Hastie et al. 2009). The Bayesian counterparts of these models have proved useful because appropriate priors can be assigned to the regularization parameter(s), and uncertainty in the estimates and predictions can be measured directly by applying the Bayesian paradigm.

Regression methods assume a linear relationship between phenotype and genotype, and they typically account for additive allelic effects only; however, evidence of epistatic effects on plant traits is vast and well documented (e.g., Holland 2001, 2008). In wheat, for instance, detailed analyses have revealed a complex circuitry of epistatic interactions in the regulation of heading time involving different vernalization genes, day-length sensitivity genes, and earliness per se genes, as well as the environment (Laurie et al. 1995; Cockram et al. 2007). Epistatic effects have also been found to be an important component of the genetic basis of plant height and bread-making quality traits (Zhang et al. 2008; Conti et al. 2011). It is becoming common to study gene × gene interactions by using a paradigm of networks that includes aggregating gene × gene interactions that exist even in the absence of main effects (McKinney and Pajewski 2012).

Interactions between alleles at two or more loci could theoretically be represented in a linear model via appropriate contrasts. However, this does not scale when the number of markers (p) is large, as the number of 2-locus, 3-locus, etc., interactions is mind-boggling. An alternative to the standard parametric modeling of complex interactions is provided by non-linear, semi-parametric methods, such as kernel-based models (e.g., Gianola et al. 2006; Gianola and van Kaam 2008) or artificial neural networks (NN) (Okut et al. 2011; Gianola et al. 2011), under the assumption that such procedures can capture signals from high-order interactions. The potential of these methods, however, depends on the kernel chosen and on the neural network architecture. In a recent study, Heslot et al. (2012) compared the predictive accuracy of several genome-enabled prediction models, including reproducing kernel Hilbert space (RKHS) regression and NN, using barley and wheat data; the authors found that the non-linear models gave a modest but consistent predictive superiority (as measured by correlations between predictions and realizations) over the linear models. In particular, the RKHS model had better predictive ability than the parametric regressions. The use of RKHS for predicting complex traits was first proposed by Gianola et al. (2006) and Gianola and van Kaam (2008). de los Campos et al. (2010) further developed the theoretical basis of RKHS with kernel averaging (simultaneous use of various kernels in the model) and showed its good prediction accuracy. Other empirical studies in plants have corroborated the increase in prediction accuracy of kernel methods (e.g., Crossa et al. 2010, 2011; de los Campos et al. 2010; Heslot et al. 2012). Recently, Long et al. (2010), using chicken data, and González-Camacho et al. (2012), using maize data, showed that NN methods provided prediction accuracy comparable to that obtained using the RKHS method.
In NN, the basis functions ("adaptive covariates") are inferred from the data, which gives the NN great potential and flexibility for capturing complex interactions between input variables (Hastie et al. 2009). In particular, Bayesian regularized neural networks (BRNN) and radial basis function neural networks (RBFNN) have features that make them attractive for use in genomic selection (GS).

In this study, we examined the predictive ability of various linear and non-linear models, including the Bayes A and Bayes B linear regression models of Meuwissen et al. (2001); the Bayesian LASSO, as in Park and Casella (2008) and de los Campos et al. (2009); RKHS, using the kernel averaging strategy proposed by de los Campos et al. (2010); the RBFNN, proposed and used by González-Camacho et al. (2012); and the BRNN, as described by Neal (1996) and used in the context of GS by Gianola et al. (2011). The predictive ability of these models was compared using a cross-validation scheme applied to a wheat data set from CIMMYT's Global Wheat Program.

MATERIALS AND METHODS

Experimental data
The data set included 306 elite wheat lines: 263 lines that are candidates for the 29th Semi-Arid Wheat Screening Nursery (SAWSN) and 43 lines from the 18th Semi-Arid Wheat Yield Trial (SAWYT) of CIMMYT's Global Wheat Program. These lines were genotyped with 1717 diversity array technology (DArT) markers generated by Triticarte Pty. Ltd. (Canberra, Australia). Two traits were analyzed: grain yield (GY) and days to heading (DTH) (see Supporting Information, File S1). The traits were measured in a total of 12 different environments (1-12) (Table 1): GY in environments 1-7 and DTH in environments 1-5 and 8-12 (10 in all). Different agronomic practices were used. Yield trials were planted in 2009 and 2010 using prepared beds and flat plots under controlled drought or irrigated conditions. Yield data from experiments in 2010 were replicated, whereas data from trials in 2009 were adjusted means from an alpha-lattice incomplete block design with adjustment for spatial variability in the direction of rows and columns, using an autoregressive model fitted in both directions. Data used to train the models for GY and DTH in 2009 were the best linear unbiased estimates (BLUE) after spatial analysis, whereas the BLUE data for 2010 were obtained after performing analyses in each of the 12 environments and combining them. The experimental designs in each location consisted of alpha-lattice incomplete block designs of different sizes, with two replicates each.

Broad-sense heritability at individual environments was calculated as h² = σ²_g / (σ²_g + σ²_e / nreps), where σ²_g and σ²_e are the genotype and error variance components, respectively, and nreps is the number of replicates. For the combined analyses across environments, broad-sense heritability was calculated as h² = σ²_g / (σ²_g + σ²_ge / nenv + σ²_e / (nenv × nreps)), where σ²_ge is the genotype × environment interaction variance component and nenv is the number of environments included in the analysis.

Statistical models
One method for incorporating markers is to define g_i as a parametric linear regression on marker covariates x_ij of the form g_i = Σ_{j=1}^p x_ij β_j, such that y_i = Σ_{j=1}^p x_ij β_j + e_i (j = 1, 2, ..., p markers); here, β_j is the partial regression of y_i on the j-th marker covariate (Meuwissen et al. 2001). Extending the model to allow for an intercept gives

  y_i = μ + Σ_{j=1}^p x_ij β_j + e_i   (1)

We adopted Gaussian assumptions for model residuals; specifically, the joint distribution of model residuals in Equation 1 was assumed normal with mean zero and variance σ²_e.
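To make Equation 1 concrete, the sketch below (not taken from the paper; all data and the penalty value are simulated or assumed) shows how marker effects could be estimated with a ridge penalty when p ≫ n, the penalized alternative to ordinary least squares mentioned above. In BRR the penalty plays the role of the variance ratio σ²_e/σ²_β.

# Illustrative R sketch (hypothetical data): ridge-penalized estimation of
# marker effects for Equation 1 when p >> n.
set.seed(1)
n <- 100; p <- 1000                      # lines and markers (assumed sizes)
X <- matrix(rbinom(n * p, 1, 0.5), n, p) # toy 0/1 marker incidence matrix
b <- rnorm(p, 0, 0.05)                   # simulated marker effects
y <- 10 + X %*% b + rnorm(n)             # phenotypes: mu + sum_j x_ij b_j + e_i

lambda <- 50                             # ridge penalty (assumed value)
Xc <- scale(X, center = TRUE, scale = FALSE)
yc <- y - mean(y)
# Closed-form ridge solution: (X'X + lambda * I)^{-1} X'y
b_hat <- solve(crossprod(Xc) + diag(lambda, p), crossprod(Xc, yc))
y_hat <- mean(y) + Xc %*% b_hat
cor(y, y_hat)                            # in-sample fit, not predictive accuracy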

Table 1 Twelve environments representing combinations of diverse agronomic management (drought or full irrigation; sowing in standard, bed, or flat systems), sites in Mexico, and years for two traits, grain yield (GY) and days to heading (DTH), with their broad-sense heritability (h²) measured in 2010.

Env  Agronomic management         Site in Mexico  Year  Trait measured
1    Drought-bed                  Cd. Obregon     2009  GY, DTH
2    Drought-bed                  Cd. Obregon     2010  GY, DTH
3    Drought-flat                 Cd. Obregon     2010  GY, DTH
4    Full irrigation-bed          Cd. Obregon     2009  GY, DTH
5    Full irrigation-bed          Cd. Obregon     2010  GY, DTH
6    Heat-bed                     Cd. Obregon     2010  GY
7    Full irrigation-flat melga   Cd. Obregon     2010  GY
8    Standard                     Toluca          2009  DTH
9    Standard                     El Batan        2009  DTH
10   Small observation plot       Cd. Obregon     2009  DTH
11   Small observation plot       Cd. Obregon     2010  DTH
12   Standard                     Agua Fria       2010  DTH

The likelihood function is

  p(y | μ, β, σ²_e) = Π_{i=1}^n N(y_i | μ + Σ_{j=1}^p x_ij β_j, σ²_e)   (2)

where N(y_i | μ + Σ_j x_ij β_j, σ²_e) is a normal density for random variable y_i centered at μ + Σ_j x_ij β_j and with variance σ²_e. Depending on how priors on the marker effects are assigned, different Bayesian linear regression models result.

Linear models: Bayesian ridge regression, Bayesian LASSO, Bayes A, and Bayes B
A standard penalized regression method is ridge regression (Hoerl and Kennard 1970); its Bayesian counterpart, Bayesian ridge regression (BRR), uses a prior density of marker effects that is Gaussian, centered at zero, and with variance common to all markers, that is, p(β_j | σ²_β) = N(β_j | 0, σ²_β), where σ²_β is the prior variance of marker effects. Marker effects are assumed independent and identically distributed a priori. We assigned scaled inverse chi-square distributions χ⁻²(df, S) to the variance parameters σ²_e and σ²_β. The prior degrees of freedom and scale parameters were set to df = 4 and S = 1. It can be shown that the posterior mean of marker effects is the best linear unbiased predictor (BLUP) of marker effects, so Bayesian ridge regression is often referred to as RR-BLUP (de los Campos et al. 2012).

The Bayesian LASSO, Bayes A, and Bayes B relax the assumption of a common prior variance for all marker effects. The relationship among these three models is as follows: Bayes B can be considered the most general of the three, in the sense that Bayes A and Bayesian ridge regression can be viewed as special cases of Bayes B. This is because Bayes A is obtained from Bayes B by setting π = 0 (the proportion of markers with null effects), and Bayesian ridge regression is obtained from Bayes B by setting π = 0 and assuming that all markers have the same variance. Bayes B uses a mixture distribution with a mass at zero, such that the (conditional) prior distribution of marker effects is

  β_j | σ²_j, π = 0 with probability π, and β_j | σ²_j, π ~ N(0, σ²_j) with probability 1 − π   (3)

The prior assigned to σ²_j (j = 1, ..., p) is the same for all markers, i.e., a scaled inverted chi-square distribution χ⁻²(df_β, S_β), where df_β are the degrees of freedom and S_β is a scaling parameter. Bayes B becomes Bayes A by setting π = 0. In the case of Bayes B, we took π = 0.95, df_β = 4, and S_β = σ̃²_a (df_β − 2)/df_β, with σ̃²_a = σ̃²_S / [(1 − π) Σ_{j=1}^p 2 q_j (1 − q_j)], where q_j is the allele frequency of marker j and σ̃²_S is the additive genetic variance explained by the markers [see Habier et al. (2011) and Resende et al. (2012) for more details]. In the case of σ²_e, we assigned a flat prior, as in Wang et al. (1994).
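As a hedged illustration of these Bayes B settings, the following R sketch computes the prior scale S_β from allele frequencies and an assumed value of the additive variance explained by markers; the function name, the allele frequencies, and σ̃²_S = 1 are hypothetical, not values from the paper.

# Illustrative sketch: Bayes B prior scale S_beta = sigma2_a * (df_beta - 2) / df_beta,
# with sigma2_a = sigma2_S / ((1 - pi) * sum_j 2 * q_j * (1 - q_j)).
bayesB_scale <- function(q, sigma2_S, pi = 0.95, df_beta = 4) {
  sigma2_a <- sigma2_S / ((1 - pi) * sum(2 * q * (1 - q)))  # per-marker variance
  sigma2_a * (df_beta - 2) / df_beta                        # scale of the chi^-2 prior
}

set.seed(1)
q <- runif(1717, 0.05, 0.95)   # hypothetical allele frequencies for 1717 markers
bayesB_scale(q, sigma2_S = 1)  # assumed marker-explained additive variance of 1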
The Bayesian LASSO assigns a double exponential (DE) distribution to all marker effects (conditionally on a regularization parameter λ), centered at zero and with marker-specific variance, that is, p(β_j | λ, σ_e) = DE(β_j | 0, λ). The DE distribution is not conjugate with the Gaussian likelihood, but it can be represented as a mixture of scaled normal densities, which allows easy implementation of the model (Park and Casella 2008; de los Campos et al. 2009). The priors used were exactly the same as those used in González-Camacho et al. (2012).

The models used in this study (Bayesian ridge regression, the Bayesian LASSO (BL), Bayes A, and Bayes B) are explained in detail in several articles; for example, Bayes A and Bayes B are described in Meuwissen et al. (2001), Habier et al. (2011), and Resende et al. (2012), and an account of the BL is given in de los Campos et al. (2009, 2012), Crossa et al. (2010, 2011), Perez et al. (2010), and González-Camacho et al. (2012).

Non-linear models: RBFNN, BRNN, and RKHS
In this section, we describe the basic structure of the non-linear single hidden layer feed-forward neural network (SLNN) and two of its variants, the radial basis function neural network and the Bayesian regularized neural network. We also give a brief explanation of RKHS with the kernel averaging method at the end of this section.

Single hidden layer feed-forward neural network: In a single-layer feed-forward neural network (SLNN), the non-linear activation functions in the hidden layer give the NN universal approximation ability, and hence great potential and flexibility for capturing complex patterns. The structure of the SLNN is depicted in Figure 1, which illustrates the method for a continuous phenotypic response. This NN can be thought of as a two-step regression (e.g., Hastie et al. 2009). In the first step, in the non-linear hidden layer, S data-derived basis functions {z_i^[k]} (k = 1, 2, ..., S neurons) are inferred; in the second step, in the linear output layer, the response is regressed on the basis functions inferred in the hidden layer.

Figure 1 Structure of a single-layer feedforward neural network (SLNN), adapted from González-Camacho et al. (2012). In the hidden layer, input variables x_i = (x_i1, ..., x_ip) (j = 1, ..., p markers) are combined for each neuron (k = 1, ..., S neurons) using a linear function, u_i^[k] = b_k + Σ_j x_ij β_j^[k], and subsequently transformed using a non-linear activation function, yielding a set of inferred scores, z_i^[k] = g_k(u_i^[k]). These scores are used in the output layer as basis functions to regress the response on the data-derived predictors: y_i = μ + Σ_{k=1}^S w_k z_i^[k] + e_i.

The inner product between the input vector and the weight vector β^[k] of each neuron of the hidden layer, plus a bias (intercept b_k), is computed, that is, u_i^[k] = b_k + Σ_{j=1}^p x_ij β_j^[k] (j = 1, ..., p markers); this is then transformed using a non-linear activation function g_k(u_i^[k]). One obtains z_i^[k] = g_k(b_k + Σ_j x_ij β_j^[k]), where b_k is an intercept and (β_1^[1], ..., β_p^[1]; ...; β_1^[S], ..., β_p^[S])' is a vector of regression coefficients or weights of each neuron k in the hidden layer. The g_k(·) is the activation function, which maps the inputs into the closed interval [−1, 1]; for example, g_k(x) = [exp(2x) − 1]/[exp(2x) + 1] is known as the hyperbolic tangent function. Finally, in the linear output layer, phenotypes are regressed on the data-derived features {z_i^[k]} according to

  y_i = μ + Σ_{k=1}^S w_k z_i^[k] + e_i = μ + Σ_{k=1}^S w_k g_k(b_k + Σ_{j=1}^p x_ij β_j^[k]) + e_i   (4)

Radial basis function neural network: The RBFNN was first proposed by Broomhead and Lowe (1988) and Poggio and Girosi (1990). Figure 2 shows the architecture of a single hidden layer RBFNN with S non-linear neurons. Each non-linear neuron in the hidden layer has a Gaussian radial basis function (RBF) defined as z_i^[k] = exp(−h_k ||x_i − c_k||²), where ||x_i − c_k|| is the Euclidean norm between the input vector x_i and the center vector c_k, and h_k is the bandwidth of the Gaussian RBF. Subsequently, in the linear output layer, phenotypes are regressed on the data-derived features {z_i^[k]} according to y_i = μ + Σ_{k=1}^S w_k z_i^[k] + e_i, where e_i is a model residual.

Estimating the parameters of the RBFNN: The vector of weights w = (w_1, ..., w_S) of the linear output layer is obtained by the ordinary least-squares fit that minimizes the mean squared differences between the fitted values ŷ_i (from the RBFNN) and the observed responses y_i in the training set, provided that the Gaussian RBF centers c_k and bandwidths h_k of the hidden layer are defined. The centers are selected using an orthogonal least-squares learning algorithm, as described by Chen et al. (1991) and implemented in Matlab 2010b. The centers are added iteratively such that each newly selected center is orthogonal to the others. The selected centers maximize the decrease in the mean-squared error of the RBFNN, and the algorithm stops when the number of centers (neurons) added to the RBFNN attains a desired precision (goal error) or when the number of centers equals the number of input vectors, that is, when S = n. The bandwidth h_k of each Gaussian RBF of the hidden layer is defined in terms of a design parameter of the net, the spread, with h_k inversely proportional to the spread. To select the best RBFNN, a grid for training the net was generated, containing different values of the spread and different precision values (goal error). The initial value of the spread was the median of the Euclidean distances between each pair of input vectors (x_i), and an initial value of 0.02 for the goal error was considered.
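The following R sketch illustrates the two RBFNN steps just described: Gaussian RBF scores in the hidden layer, then ordinary least squares in the output layer. It is only a sketch under stated assumptions: centers are taken as a random subset of lines rather than by the orthogonal least-squares algorithm, and the bandwidth is an assumed stand-in for the spread-based choice used in the Matlab implementation.

# Illustrative sketch (not the paper's Matlab code): Gaussian RBF hidden layer
# z[i,k] = exp(-h * ||x_i - c_k||^2) followed by an OLS fit of the output layer.
set.seed(2)
n <- 150; p <- 200
X <- matrix(rbinom(n * p, 1, 0.5), n, p)      # toy DArT-like 0/1 markers
y <- rnorm(n)                                  # toy phenotypes

S <- 20                                        # number of neurons (assumed)
centers <- X[sample(n, S), , drop = FALSE]     # naive center selection
h <- 1 / median(as.vector(dist(X)))^2          # assumed bandwidth choice

# Hidden layer: n x S matrix of squared distances, then RBF scores
D2 <- outer(rowSums(X^2), rowSums(centers^2), "+") - 2 * X %*% t(centers)
Z <- exp(-h * D2)

# Linear output layer: y = mu + Z w + e, fitted by ordinary least squares
fit <- lm(y ~ Z)
head(fitted(fit))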
The spread parameter adjusts the shape of the Gaussian RBF: it should be large enough that neurons respond to overlapping regions of the input space, but not so large that all Gaussian RBFs give essentially the same response.

Figure 2 Structure of a radial basis function neural network, adapted from González-Camacho et al. (2012). In the hidden layer, information from the input variables (x_i1, ..., x_ip) (j = 1, ..., p markers) is first summarized by means of the Euclidean distance between each input vector x_i and S data-inferred centers c_k (k = 1, ..., S neurons), that is, u_i^[k] = h_k ||x_i − c_k||². These distances are then transformed using the Gaussian function z_i^[k] = exp(−u_i^[k]). These scores are used in the output layer as basis functions for the linear regression y_i = μ + Σ_{k=1}^S w_k z_i^[k] + e_i.

Bayesian regularized neural networks: The difference between the SLNN and the BRNN lies in the function to be minimized (see the penalized function below); therefore, the basic structure of a BRNN can also be represented by Figure 1. The SLNN described above is flexible enough to approximate any non-linear function; this great flexibility allows the NN to capture complex interactions among predictor

variables (Hastie et al. 2009). However, this flexibility also leads to two important issues: (1) as the number of neurons increases, the number of parameters to be estimated also increases; and (2) as the number of parameters rises, the risk of over-fitting also increases. It is common practice to use penalized methods, via Bayesian approaches, to prevent or palliate over-fitting. MacKay (1992, 1994) developed a framework for obtaining estimates of all the parameters in a feed-forward single-layer neural network by using an empirical Bayes approach. Let u = (w_1, ..., w_S; b_1, ..., b_S; β_1^[1], ..., β_p^[1]; ...; β_1^[S], ..., β_p^[S]; μ)' be the vector containing all the weights, biases, and connection strengths. The author showed that the estimation problem can be solved in two steps, followed by iteration:

(1) Obtain the conditional posterior modes of the elements of u, assuming that the variance components σ²_e and σ²_u are known and that the prior distribution for all the elements of u is p(u | σ²_u) = MN(0, σ²_u I). It is important to note that this approach assigns the same prior to all elements of u, even though this may not always be the best choice. The density of the conditional (given the variance parameters) posterior distribution of the elements of u, according to Bayes' theorem, is

  p(u | y, σ²_e, σ²_u) = p(y | u, σ²_e) p(u | σ²_u) / p(y | σ²_e, σ²_u)   (5)

The conditional modes can be obtained by maximizing Equation 5 over u. However, the problem is equivalent to minimizing the following penalized sum of squares [see Gianola et al. (2011) for more details]:

  F(u) = β Σ_{i=1}^n e_i² + α Σ_{j=1}^m u_j²

where β = 1/(2σ²_e), α = 1/(2σ²_u), e_i is the difference between observed and predicted phenotypes for the fitted model, and u_j (j = 1, ..., m) is the j-th element of the vector u.

(2) Update σ²_e and σ²_u. The updating formulas are obtained by maximizing an approximation to the marginal likelihood of the data, p(y | σ²_e, σ²_u) (the "evidence"), given by the denominator of Equation 5.

(3) Iterate between (1) and (2) until convergence.

The original algorithm developed by MacKay was further improved by Foresee and Hagan (1997) and adopted by Gianola et al. (2011) in the context of genome- and pedigree-enabled prediction. The algorithm is equivalent to estimation via maximum penalized likelihood when weight decay is used, but it has the advantage of providing a way of setting the extent of weight decay through the variance component σ²_u. Neal (1996) pointed out that the procedure of MacKay (1992, 1994) can be further generalized. For example, there is no need to approximate probabilities via Gaussian assumptions; furthermore, it is possible to estimate the entire posterior distributions of all the elements of u, not only their (conditional) posterior modes. Next, we briefly review Neal's approach to solving the problem; a comprehensive review can be found in Lampinen and Vehtari (2001).
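As a brief aside before the priors, step (1) above amounts to minimizing F(u); the R sketch below writes that objective for a tiny SLNN with a tanh activation. All sizes and the values of α and β are hypothetical, and a generic optimizer stands in for the Gauss-Newton approximation of Foresee and Hagan (1997); this is not the FBM software.

# Illustrative sketch only: the penalized objective F(u) = beta*sum(e^2) + alpha*sum(u^2)
# from step (1), for a tiny SLNN with tanh activation and assumed alpha, beta.
set.seed(3)
n <- 50; p <- 30; S <- 3
X <- matrix(rbinom(n * p, 1, 0.5), n, p)
y <- rnorm(n)

n_par <- S + S + S * p + 1                   # output weights, hidden biases, strengths, mu
F_obj <- function(u, alpha = 0.1, beta = 1) {
  w  <- u[1:S]
  b  <- u[(S + 1):(2 * S)]
  B  <- matrix(u[(2 * S + 1):(2 * S + S * p)], p, S)  # p x S connection strengths
  mu <- u[n_par]
  Z  <- tanh(sweep(X %*% B, 2, b, "+"))      # hidden-layer scores z[i,k]
  e  <- y - (mu + Z %*% w)                   # residuals of Equation 4
  beta * sum(e^2) + alpha * sum(u^2)         # penalized sum of squares F(u)
}

fit <- optim(rnorm(n_par, sd = 0.1), F_obj, method = "BFGS")  # generic minimizer
fit$value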
Prior distributions:
a) Variance component of the residuals: Neal (1996) used a conjugate inverse Gamma distribution as a prior for the variance associated with the residual e_i in Equation 4, that is, σ²_e ~ Inv-Gamma(s_e, df_e), where s_e and df_e are the scale and degrees-of-freedom parameters, respectively. These parameters can be set to the default values given by Neal (1996), s_e = 0.05 and df_e = 0.5. These values were also used by Lampinen and Vehtari (2001).
b) Connection strengths, weights, and biases: Neal (1996) suggested dividing the network parameters in u into groups and then using hierarchical models for each group of parameters; for example, the connection strengths (β_1^[1], ..., β_p^[1]; ...; β_1^[S], ..., β_p^[S]) and biases (b_1, ..., b_S) of the hidden layer, and the output weights (w_1, ..., w_S) and general mean or bias (μ) of the linear output layer. Suppose that u_1, ..., u_k are the parameters of a given group; then assume

  p(u_1, ..., u_k | σ²_u) = (2π)^(−k/2) σ_u^(−k) exp[−(1/(2σ²_u)) Σ_{l=1}^k u_l²]

and, at the last stage of the model, assign the prior σ²_u ~ Inv-Gamma(s_u, df_u). The scale parameter of the distribution associated with the group of parameters containing the connection strengths (β_1^[1], ..., β_p^[1]; ...; β_1^[S], ..., β_p^[S]) changes according to the number of inputs; in this case, s_u = (0.05 / p^(1/df_u))², with df_u = 0.5, where p is the number of markers in the data set. By using Markov chain Monte Carlo (MCMC) techniques, through an algorithm called hybrid Monte Carlo, Neal (1996) developed software termed flexible Bayesian modeling (FBM), capable of obtaining samples from the posterior distributions of all unknowns in a neural network (as in Figure 1).

Reproducing kernel Hilbert spaces regression: RKHS models have been suggested as an alternative to multiple linear regression for capturing complex interaction patterns that may be difficult to account for in a linear model framework (Gianola et al. 2006). In the RKHS model, the regression function takes the form

  f(x_i) = μ + Σ_{i'=1}^n α_{i'} K(x_i, x_{i'})   (6)

where x_i = (x_i1, ..., x_ip)' and x_{i'} = (x_{i'1}, ..., x_{i'p})' are input vectors of marker genotypes of individuals i and i', the α_{i'} are regression coefficients, and K(x_i, x_{i'}) = exp(−h ||x_i − x_{i'}||²) is the reproducing kernel, defined here with a Gaussian RBF, where h is a bandwidth parameter and ||x_i − x_{i'}|| is the Euclidean norm between each pair of input vectors. The strategy termed kernel averaging, for selecting optimal values of h within a set of candidate values, was implemented using the Bayesian approach described in de los Campos et al. (2010). Similarities and connections between the RKHS and the RBFNN are given in González-Camacho et al. (2012).

Assessment of the models' predictive ability
The predictive ability of the models given above was compared using Pearson's correlation and the predictive mean-squared error (PMSE) between predicted and realized values. A total of 50 random partitions were generated for each of the data sets; each partition randomly assigned 90% of the lines to the training set and the remaining 10% to the validation set. The partition scheme was similar to that in Gianola et al. (2011) and González-Camacho et al. (2012). All scripts were run on a Linux workstation; for Bayesian ridge regression and the Bayesian LASSO, we used the R package BLR (de los Campos and Perez 2010), whereas for RKHS, we used the R implementation described in de los Campos et al. (2010), which was kindly provided by the authors. In the case of Bayes A and Bayes B, we used a freely available program described by Hickey and Tier (2009). For the BRNN, we used the FBM software available at toronto.edu/~radford/fbm.software.html. Because the computational time required to evaluate the predictive ability of the BRNN was great, we used the Condor high-throughput computing system at the University of Wisconsin-Madison. The RBFNN model was run using Matlab 2010b for Linux.

The differences in computing times between the models were large. The computing times for evaluating the prediction ability of the 50 partitions for each trait were as follows: 10 min for RBFNN, 1.5 hr for RKHS, 3 hr for BRR, 3.5 hr for BL, 4.5 hr for Bayes B, 5.5 hr for Bayes A, and 30 days for BRNN. In the case of RKHS, BRR, BL, Bayes A, and Bayes B, inferences were based on 35,000 MCMC samples, and on 10,000 samples for BRNN. The estimated computing times were obtained using, as reference, a single Intel Xeon CPU and 8 GB of RAM. A significant reduction in computing time was achieved by parallelizing the tasks.
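A minimal sketch of this assessment scheme is given below, assuming simulated data: 50 random 90/10 partitions scored by Pearson's correlation and PMSE, with a Gaussian-kernel ridge regression (single assumed bandwidth and ridge parameter) standing in for the Bayesian RKHS with kernel averaging. Averaging the two criteria over partitions is what produces summaries of the kind reported in Tables 2 and 3.

# Illustrative sketch: 50 random 90/10 partitions scored by correlation and PMSE.
# A Gaussian-kernel ridge regression stands in for the Bayesian RKHS fit; data,
# bandwidth, and ridge parameter are hypothetical.
set.seed(4)
n <- 306; p <- 1717
X <- matrix(rbinom(n * p, 1, 0.5), n, p)      # toy stand-in for the DArT markers
y <- rnorm(n)                                 # toy stand-in for a trait

D2 <- as.matrix(dist(X))^2                    # squared Euclidean distances
h  <- 1 / median(D2[upper.tri(D2)])           # assumed bandwidth
K  <- exp(-h * D2)                            # Gaussian reproducing kernel
lambda <- 1                                   # assumed ridge parameter

res <- t(sapply(1:50, function(r) {
  tst <- sample(n, round(0.1 * n))            # 10% validation lines
  trn <- setdiff(1:n, tst)
  # Kernel ridge fit on training lines: alpha = (K_trn + lambda*I)^{-1}(y - mean)
  alpha <- solve(K[trn, trn] + diag(lambda, length(trn)), y[trn] - mean(y[trn]))
  yhat  <- mean(y[trn]) + K[tst, trn] %*% alpha
  c(cor = cor(y[tst], yhat), pmse = mean((y[tst] - yhat)^2))
}))
colMeans(res); apply(res, 2, sd)              # average correlation and PMSE, with SD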
RESULTS

Data from the replicated experiments in 2010 were used to calculate the broad-sense heritability of each trait in each environment (Table 1). Broad-sense heritability across locations for the 2010 data was 0.67 for GY and 0.92 for DTH. These high estimates can be explained, at least in part, by the strict environmental control of trials conducted at CIMMYT's experiment station at Ciudad Obregon. The heritability of the two traits for 2009 was not estimated because the only available phenotypic data were adjusted means for each environment.

Predictive assessment of the models
The predictive ability of the different models for GY and DTH varied among the 12 environments. The model deemed best using correlations (Table 2) tended to be the one with the smallest average PMSE (Table 3). The three non-parametric models had higher predictive correlations and smaller PMSE than the linear models for both GY and DTH. Within the linear models, the results were mixed, and all models gave similar predictions. Within the non-parametric models, RBFNN and RKHS always gave higher correlations between predicted values and realized phenotypes, and a smaller average PMSE, than the BRNN.

The means of the correlations and the associated standard errors can be used to test for statistically significant improvements in the predictive ability of the non-linear models vs. the linear models. The t-test (with α = 0.05) showed that RKHS gave significant improvements in prediction in 13/19 cases (Table 3) compared with the BL, whereas RBFNN was significantly better than the BL in 10/19 cases. Similar results were obtained when comparing RKHS and RBFNN with Bayes A and Bayes B. Correlations between observed and predicted values for DTH were lowest overall in environments 4 and 8, in Cd. Obregon, 2009, and in Toluca, 2009. Average PMSE was in agreement with the findings based on correlations. Although accuracies in environment 4 were much lower than in other environments, the higher accuracy of the non-parametric models (RKHS, RBFNN, and BRNN) over that of the linear models (BL, BRR, Bayes A, and Bayes B) was consistent with what was observed in the other environments.

Figures 3 and 4 give scatter plots of the correlations obtained with the three non-parametric models vs. the BL for DTH and GY, respectively; each circle represents the estimated correlations for the two models included in the plot. In Figure 3, A-C, DTH had a total of 500 points (10 environments and 50 random training-testing partitions). In Figure 4, A-C, GY had a total of 350 points (7 environments and 50 random partitions in each environment). A point above the 45-degree line represents an analysis where the method whose predictive correlation is given on the vertical axis (RKHS, RBFNN, BRNN) outperformed the one whose correlation is given on the horizontal axis (BL). Both figures show that, although there is a great deal of variability due to partition, for both DTH and GY the overall superiority of RKHS and RBFNN over the linear model BL is clear. For both traits, BL had slightly better prediction accuracy than the BRNN in terms of the number of individual correlation points. It is interesting to note that some cross-validation partitions picked subsets of training data that had negative, zero, or very low correlations with the observed values in

the validation set. These results indicate that lines in the training set are not necessarily related to those in the validation set.

Table 2 Average correlation (SE in parentheses) between observed and predicted values for grain yield (GY) and days to heading (DTH) in 12 environments for seven models. Fitted models were the Bayesian LASSO (BL), RR-BLUP (BRR), Bayes A, Bayes B, reproducing kernel Hilbert spaces regression (RKHS), radial basis function neural networks (RBFNN), and Bayesian regularized neural networks (BRNN), across 50 random partitions of the data with 90% in the training set and 10% in the validation set. The models with the highest correlations are underlined.

DISCUSSION AND CONCLUSIONS

Understanding the impact of epistasis on quantitative traits remains a major challenge. In wheat, several studies have reported significant epistasis for grain yield and heading or flowering time (Goldringer et al. 1997). Detailed analyses have shown that vernalization, day-length sensitivity, and earliness per se genes are mainly responsible for regulating heading time. The vernalization requirement relates to the sensitivity of the plant to cold temperatures, which causes it to accelerate spike primordial formation. Transgenic and mutant analyses, for example, have suggested a pathway involving epistatic interactions that combines environment-induced suppression and upregulation of several genes, leading to the final floral transition (Shimada et al. 2009).
There is evidence that the aggregation of multiple gene × gene interactions (epistasis) with small effects into small epistatic networks is important for explaining the heritability of complex traits in genome-wide association studies (McKinney and Pajewski 2012).

Table 3 Predictive mean-squared error (PMSE) between observed and predicted values for grain yield (GY) and days to heading (DTH) in 12 environments for seven models. Fitted models were the Bayesian LASSO (BL), RR-BLUP (BRR), Bayes A, Bayes B, reproducing kernel Hilbert space regression (RKHS), radial basis function neural networks (RBFNN), and Bayesian regularized neural networks (BRNN), across 50 random partitions of the data with 90% in the training set and 10% in the validation set. The models with the lowest PMSE are underlined.

Figure 3 Plots of the predictive correlations for each of 50 cross-validation partitions and 10 environments for days to heading (DTH) for different combinations of models. (A) When the best non-parametric model is RKHS, this is represented by an open circle; when the best linear model is BL, this is represented by a filled circle. (B) When the best non-parametric model is RBFNN, this is represented by an open circle; when the best linear model is BL, this is represented by a filled circle. (C) When the best non-parametric model is BRNN, this is represented by an open circle; when the best linear model is BL, this is represented by a filled circle. The histograms depict the distribution of the correlations in the testing set obtained from the 50 partitions for the different models. The horizontal (vertical) dashed line represents the average of the correlations for the testing set in the 50 partitions for the model shown on the Y (X) axis. The solid line represents Y = X, i.e., both models have the same prediction ability.

Epistatic networks and gene × gene interactions can also be exploited for GS via suitable statistical-genetic models that incorporate network complexities. Evidence from this study, as well as from other research involving other plant and animal species, suggests that models that are non-linear in input variables (e.g., SNPs) predict outcomes in testing sets better than standard linear regression models for genome-enabled prediction. However, it should be pointed out that better predictive ability can have several causes, one of them being the ability of some non-linear models to capture epistatic effects. Furthermore, the random cross-validation scheme used in this study was not designed to

specifically assess epistasis but rather to compare the models' predictive ability.

Figure 4 Plots of the correlations for each of 50 cross-validation partitions and seven environments for grain yield (GY) for different combinations of models. (A) When the best model is RKHS, this is represented by an open circle; when the best model is BL, this is represented by a filled circle. (B) When the best model is RBFNN, this is represented by an open circle; when the best model is BL, this is represented by a filled circle. (C) When the best model is BRNN, this is represented by an open circle; when the best model is BL, this is represented by a filled circle. The histograms depict the distribution of the correlations in the testing set obtained from the 50 partitions for the different models. The horizontal (vertical) dashed line represents the average of the correlations for the testing set in the 50 partitions for the model shown on the Y (X) axis. The solid line represents Y = X, i.e., both models have the same prediction ability.

It is interesting to compare results from different predictive machineries when applied to either maize or wheat. Differences in the prediction accuracy of non-parametric and linear models (at least for the data sets included in this and other studies) seem to be more pronounced in wheat than in maize. Although differences depend, among other factors, on the trait-environment combination and the number of markers, it is clear from González-Camacho et al. (2012) that, for flowering traits (highly additive) and traits such as grain yield (additive and epistatic) in maize, the BL model performed very similarly to the RKHS and RBFNN. On the other hand, in the present study, which involves wheat, the RKHS, RBFNN, and BRNN models clearly had markedly better predictive accuracy than BL, BRR, Bayes A, or Bayes B. This may be due to the fact that, in wheat, additive × additive epistasis plays an important role in grain yield, as found by

Crossa et al. (2006) and Burgueño et al. (2007, 2011) when assessing additive, additive × additive, additive × environment, and additive × additive × environment interactions using a pedigree-based model with the relationship matrix A. As pointed out first by Gianola et al. (2006) and subsequently by Long et al. (2010), non-parametric models do not impose strong assumptions on the phenotype-genotype relationship, and they have the potential of capturing interactions among loci. Our results with real wheat data sets agreed with previous findings in animal and plant breeding and with simulated experiments, in that a non-parametric treatment of markers may account for epistatic effects that are not captured by linear additive regression models. Using extensive maize data sets, González-Camacho et al. (2012) found that RBFNN and RKHS had some similarities and seemed to be useful for predicting quantitative traits with different complex underlying gene action under varying types of interaction in different environmental conditions. These authors suggested that it is possible to make further improvements in the accuracy of the RKHS and RBFNN models by introducing differential weights on SNPs, as shown by Long et al. (2010) for RBFs.

The training population used here was not developed specifically for this study; it was made up of a set of elite lines from the CIMMYT rain-fed spring wheat breeding program. Our results show that it is possible to achieve good predictions of line performance by combining phenotypic and genotypic data generated on elite lines. As genotyping costs decrease, breeding programs could use genome-enabled prediction models to predict the values of new breeding lines generated from crosses between elite lines in the training set before they reach the yield-testing stage. Lines with the highest estimated breeding values could be intercrossed before being phenotyped. Such a rapid cycling scheme would accelerate the fixation rate of favorable alleles in elite materials and should increase the genetic gain per unit of time, as described by Heffner et al. (2009).

It is important to point out that proof-of-concept experiments are required before genome-enabled selection can be implemented successfully in plant breeding programs. It is necessary to test genomic predictions on breeding materials derived from crosses between lines of the training population. If predictions are reliable enough, an experiment using the same set of parental materials could be carried out to compare the field performance of lines coming from a genomic-assisted recurrent selection scheme vs. lines coming from a conventional breeding scheme.

The accuracies reported in this study represent prediction of wheat lines using a training set comprising lines with some degree of relatedness to lines in the validation set. When the validation and training sets are not genetically related (unrelated families) or represent populations with different genetic structures and different linkage disequilibrium patterns, negligible accuracies are to be expected. It seems that successful application of genomic selection in plant breeding requires some genetic relatedness between individuals in the training and validation sets, and that linkage disequilibrium information per se does not suffice (e.g., Makowsky et al. 2011).

ACKNOWLEDGMENTS
Financial support by the Wisconsin Agriculture Experiment Station and AVIAGEN, Ltd. (Newbridge, Scotland) to Paulino Pérez and Daniel Gianola is acknowledged. We thank the Centro Internacional de Mejoramiento de Maíz y Trigo (CIMMYT) researchers who carried out the wheat trials and provided the phenotypic data analyzed in this article.

LITERATURE CITED
Bernardo, R., and J. M.
Yu, 2007 Prospects for genomewide selection for quantitative traits in maize. Crop Sci. 47(3).
Broomhead, D. S., and D. Lowe, 1988 Multivariable functional interpolation and adaptive networks. Complex Systems 2.
Burgueño, J., J. Crossa, P. L. Cornelius, R. Trethowan, G. McLaren et al., 2007 Modeling additive × environment and additive × additive × environment using genetic covariances of relatives of wheat genotypes. Crop Sci. 47(1).
Burgueño, J., J. Crossa, J. M. Cotes, F. San Vicente, and B. Das, 2011 Prediction assessment of linear mixed models for multienvironment trials. Crop Sci. 51(3).
Chen, S., C. F. N. Cowan, and P. M. Grant, 1991 Orthogonal least squares learning algorithm for radial basis function networks. IEEE Transactions on Neural Networks 2(2).
Cockram, J., H. Jones, F. J. Leigh, D. O'Sullivan, W. Powell et al., 2007 Control of flowering time in temperate cereals: genes, domestication, and sustainable productivity. J. Exp. Bot. 58(6).
Conti, V., P. F. Roncallo, V. Beaufort, G. L. Cervigni, R. Miranda et al., 2011 Mapping of main and epistatic effect QTLs associated to grain protein and gluten strength using a RIL population of durum wheat. J. Appl. Genet. 52(3).
Crossa, J., J. Burgueño, P. L. Cornelius, G. McLaren, R. Trethowan et al., 2006 Modeling genotype × environment interaction using additive genetic covariances of relatives for predicting breeding values of wheat genotypes. Crop Sci. 46(4).
Crossa, J., G. de los Campos, P. Perez, D. Gianola, J. Burgueño et al., 2010 Prediction of genetic values of quantitative traits in plant breeding using pedigree and molecular markers. Genetics 186(2).
Crossa, J., P. Perez, G. de los Campos, G. Mahuku, S. Dreisigacker et al., 2011 Genomic selection and prediction in plant breeding. J. Crop Improv. 25(3).
de los Campos, G., and P. Perez, 2010 BLR: Bayesian Linear Regression R package, version 1.2.
de los Campos, G., H. Naya, D. Gianola, J. Crossa, A. Legarra et al., 2009 Predicting quantitative traits with regression models for dense molecular markers and pedigree. Genetics 182(1).
de los Campos, G., D. Gianola, G. J. M. Rosa, K. A. Weigel, and J. Crossa, 2010 Semi-parametric genomic-enabled prediction of genetic values using reproducing kernel Hilbert spaces methods. Genet. Res. 92(4).
de los Campos, G., J. M. Hickey, R. Pong-Wong, H. D. Daetwyler, and M. P. L. Calus, 2012 Whole-genome regression and prediction methods applied to plant and animal breeding. Genetics.
Foresee, D., and M. T. Hagan, 1997 Gauss-Newton approximation to Bayesian learning. International Conference on Neural Networks, June 9-12, Houston, TX.
Gianola, D., and J. B. C. H. M. van Kaam, 2008 Reproducing kernel Hilbert spaces regression methods for genomic-assisted prediction of quantitative traits. Genetics 178(4).
Gianola, D., R. L. Fernando, and A. Stella, 2006 Genomic-assisted prediction of genetic value with semiparametric procedures. Genetics 173(3).
Gianola, D., H. Okut, K. A. Weigel, and G. J. M. Rosa, 2011 Predicting complex quantitative traits with Bayesian neural networks: a case study with Jersey cows and wheat. BMC Genet. 12: 87.
Goldringer, I., P. Brabant, and A. Gallais, 1997 Estimation of additive and epistatic genetic variances for agronomic traits in a population of doubled-haploid lines of wheat. Heredity 79.
González-Camacho, J. M., G. de los Campos, P. Perez, D. Gianola, J. Cairns et al., 2012 Genome-enabled prediction of genetic values using radial basis function neural networks. Theor. Appl. Genet. 125.
Habier, D., R. L. Fernando, K. Kizilkaya, and D. J. Garrick, 2011 Extension of the Bayesian alphabet for genomic selection. BMC Bioinformatics 12: 186.
Hastie, T., R. Tibshirani, and J.
Friedman, 2009 The Elements of Statistical Learning: Data Mining, Inference and Prediction, Ed. 2. Springer, New York.

Heffner, E. L., M. E. Sorrells, and J. L. Jannink, 2009 Genomic selection for crop improvement. Crop Sci. 49(1).
Heslot, N., H. P. Yang, M. E. Sorrells, and J. L. Jannink, 2012 Genomic selection in plant breeding: a comparison of models. Crop Sci. 52(1).
Hickey, J. M., and B. Tier, 2009 AlphaBayes (Beta): Software for Polygenic and Whole Genome Analysis. User Manual. University of New England, Armidale, Australia.
Hoerl, A. E., and R. W. Kennard, 1970 Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12(1).
Holland, J. B., 2001 Epistasis and plant breeding. Plant Breeding Reviews 21.
Holland, J. B., 2008 Theoretical and biological foundations of plant breeding, in Plant Breeding: The Arnel R. Hallauer International Symposium, edited by K. R. Lamkey and M. Lee. Blackwell Publishing, Ames, IA.
Lampinen, J., and A. Vehtari, 2001 Bayesian approach for neural networks: review and case studies. Neural Netw. 14(3).
Laurie, D. A., N. Pratchett, J. W. Snape, and J. H. Bezant, 1995 RFLP mapping of five major genes and eight quantitative trait loci controlling flowering time in a winter × spring barley (Hordeum vulgare L.) cross. Genome 38(3).
Long, N. Y., D. Gianola, G. J. M. Rosa, K. A. Weigel, A. Kranis et al., 2010 Radial basis function regression methods for predicting quantitative traits using SNP markers. Genet. Res. 92(3).
MacKay, D. J. C., 1992 A practical Bayesian framework for backpropagation networks. Neural Comput. 4(3).
MacKay, D. J. C., 1994 Bayesian non-linear modelling for the prediction competition. ASHRAE Transactions 100(Pt. 2).
Makowsky, R., N. M. Pajewski, Y. C. Klimentidis, A. I. Vazquez, C. W. Duarte et al., 2011 Beyond missing heritability: prediction of complex traits. PLoS Genet. 7(4).
McKinney, B. A., and N. M. Pajewski, 2012 Six degrees of epistasis: statistical network models for GWAS. Front. Genet. 2: 109.
Meuwissen, T. H. E., B. J. Hayes, and M. E. Goddard, 2001 Prediction of total genetic value using genome-wide dense marker maps. Genetics 157(4).
Neal, R. M., 1996 Bayesian Learning for Neural Networks (Lecture Notes in Statistics). Springer-Verlag, New York.
Ober, U., J. F. Ayroles, E. A. Stone, S. Richards, D. Zhu et al., 2012 Using whole-genome sequence data to predict quantitative trait phenotypes in Drosophila melanogaster. PLoS Genet. 8(5).
Okut, H., D. Gianola, G. J. Rosa, and K. A. Weigel, 2011 Prediction of body mass index in mice using dense molecular markers and a regularized neural network. Genet. Res. (Camb.) 93.
Park, T., and G. Casella, 2008 The Bayesian LASSO. J. Am. Stat. Assoc. 103.
Perez, P., G. de los Campos, J. Crossa, and D. Gianola, 2010 Genomic-enabled prediction based on molecular markers and pedigree using the Bayesian linear regression package in R. Plant Genome 3(2).
Poggio, T., and F. Girosi, 1990 Networks for approximation and learning. Proc. IEEE 78(9).
Resende, M. F. R., P. Muñoz, M. D. V. Resende, D. J. Garrick, R. L. Fernando et al., 2012 Accuracy of genomic selection methods in a standard data set of loblolly pine (Pinus taeda L.). Genetics.
Shimada, S., T. Ogawa, and S. Kitagawa, 2009 A genetic network of flowering-time genes in wheat leaves, in which an APETALA1/FRUITFULL-like gene, VRN-1, is upstream of FLOWERING LOCUS T. Plant J. 58.
Wang, C. S., J. J. Rutledge, and D. Gianola, 1994 Bayesian analysis of mixed linear models via Gibbs sampling with an application to litter size in Iberian pigs. Genet. Sel. Evol. 26.
Zhang, K., J. Tian, L. Zhao, and S. Wang, 2008 Mapping QTLs with epistatic effects and QTL × environment interactions for plant height using a doubled haploid population in cultivated wheat. J. Genet. Genomics 35(2).

Communicating editor: J. B.
Holland


More information

A Robust Method for Estimating the Fundamental Matrix

A Robust Method for Estimating the Fundamental Matrix Proc. VIIth Dgtal Image Computng: Technques and Applcatons, Sun C., Talbot H., Ourseln S. and Adraansen T. (Eds.), 0- Dec. 003, Sydney A Robust Method for Estmatng the Fundamental Matrx C.L. Feng and Y.S.

More information

Lecture #15 Lecture Notes

Lecture #15 Lecture Notes Lecture #15 Lecture Notes The ocean water column s very much a 3-D spatal entt and we need to represent that structure n an economcal way to deal wth t n calculatons. We wll dscuss one way to do so, emprcal

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Machine Learning 9. week

Machine Learning 9. week Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below

More information

Modeling Local Uncertainty accounting for Uncertainty in the Data

Modeling Local Uncertainty accounting for Uncertainty in the Data Modelng Local Uncertanty accontng for Uncertanty n the Data Olena Babak and Clayton V Detsch Consder the problem of estmaton at an nsampled locaton sng srrondng samples The standard approach to ths problem

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007 Syntheszer 1.0 A Varyng Coeffcent Meta Meta-Analytc nalytc Tool Employng Mcrosoft Excel 007.38.17.5 User s Gude Z. Krzan 009 Table of Contents 1. Introducton and Acknowledgments 3. Operatonal Functons

More information

Analysis of Malaysian Wind Direction Data Using ORIANA

Analysis of Malaysian Wind Direction Data Using ORIANA Modern Appled Scence March, 29 Analyss of Malaysan Wnd Drecton Data Usng ORIANA St Fatmah Hassan (Correspondng author) Centre for Foundaton Studes n Scence Unversty of Malaya, 63 Kuala Lumpur, Malaysa

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

A Statistical Model Selection Strategy Applied to Neural Networks

A Statistical Model Selection Strategy Applied to Neural Networks A Statstcal Model Selecton Strategy Appled to Neural Networks Joaquín Pzarro Elsa Guerrero Pedro L. Galndo joaqun.pzarro@uca.es elsa.guerrero@uca.es pedro.galndo@uca.es Dpto Lenguajes y Sstemas Informátcos

More information

C2 Training: June 8 9, Combining effect sizes across studies. Create a set of independent effect sizes. Introduction to meta-analysis

C2 Training: June 8 9, Combining effect sizes across studies. Create a set of independent effect sizes. Introduction to meta-analysis C2 Tranng: June 8 9, 2010 Introducton to meta-analyss The Campbell Collaboraton www.campbellcollaboraton.org Combnng effect szes across studes Compute effect szes wthn each study Create a set of ndependent

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Exercises (Part 4) Introduction to R UCLA/CCPR. John Fox, February 2005

Exercises (Part 4) Introduction to R UCLA/CCPR. John Fox, February 2005 Exercses (Part 4) Introducton to R UCLA/CCPR John Fox, February 2005 1. A challengng problem: Iterated weghted least squares (IWLS) s a standard method of fttng generalzed lnear models to data. As descrbed

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

Data Mining: Model Evaluation

Data Mining: Model Evaluation Data Mnng: Model Evaluaton Aprl 16, 2013 1 Issues: Evaluatng Classfcaton Methods Accurac classfer accurac: predctng class label predctor accurac: guessng value of predcted attrbutes Speed tme to construct

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

An Ensemble Learning algorithm for Blind Signal Separation Problem

An Ensemble Learning algorithm for Blind Signal Separation Problem An Ensemble Learnng algorthm for Blnd Sgnal Separaton Problem Yan L 1 and Peng Wen 1 Department of Mathematcs and Computng, Faculty of Engneerng and Surveyng The Unversty of Southern Queensland, Queensland,

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Why consder unlabeled samples?. Collectng and labelng large set of samples s costly Gettng recorded speech s free, labelng s tme consumng 2. Classfer could be desgned

More information

A Multivariate Analysis of Static Code Attributes for Defect Prediction

A Multivariate Analysis of Static Code Attributes for Defect Prediction Research Paper) A Multvarate Analyss of Statc Code Attrbutes for Defect Predcton Burak Turhan, Ayşe Bener Department of Computer Engneerng, Bogazc Unversty 3434, Bebek, Istanbul, Turkey {turhanb, bener}@boun.edu.tr

More information

Fast Sparse Gaussian Processes Learning for Man-Made Structure Classification

Fast Sparse Gaussian Processes Learning for Man-Made Structure Classification Fast Sparse Gaussan Processes Learnng for Man-Made Structure Classfcaton Hang Zhou Insttute for Vson Systems Engneerng, Dept Elec. & Comp. Syst. Eng. PO Box 35, Monash Unversty, Clayton, VIC 3800, Australa

More information

Biostatistics 615/815

Biostatistics 615/815 The E-M Algorthm Bostatstcs 615/815 Lecture 17 Last Lecture: The Smplex Method General method for optmzaton Makes few assumptons about functon Crawls towards mnmum Some recommendatons Multple startng ponts

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

Intelligent Information Acquisition for Improved Clustering

Intelligent Information Acquisition for Improved Clustering Intellgent Informaton Acquston for Improved Clusterng Duy Vu Unversty of Texas at Austn duyvu@cs.utexas.edu Mkhal Blenko Mcrosoft Research mblenko@mcrosoft.com Prem Melvlle IBM T.J. Watson Research Center

More information

Nondestructive and intuitive determination of circadian chlorophyll rhythms in soybean leaves using multispectral imaging

Nondestructive and intuitive determination of circadian chlorophyll rhythms in soybean leaves using multispectral imaging Supportng nformaton for Nondestructve and ntutve determnaton of crcadan chlorophyll rhythms n soybean leaves usng multspectral magng Wen-Juan Pan 1, Xa Wang 2, Yong-Ren Deng 3, Ja-Hang L 3, We Chen 1,

More information

USING LINEAR REGRESSION FOR THE AUTOMATION OF SUPERVISED CLASSIFICATION IN MULTITEMPORAL IMAGES

USING LINEAR REGRESSION FOR THE AUTOMATION OF SUPERVISED CLASSIFICATION IN MULTITEMPORAL IMAGES USING LINEAR REGRESSION FOR THE AUTOMATION OF SUPERVISED CLASSIFICATION IN MULTITEMPORAL IMAGES 1 Fetosa, R.Q., 2 Merelles, M.S.P., 3 Blos, P. A. 1,3 Dept. of Electrcal Engneerng ; Catholc Unversty of

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches Proceedngs of the Internatonal Conference on Cognton and Recognton Fuzzy Flterng Algorthms for Image Processng: Performance Evaluaton of Varous Approaches Rajoo Pandey and Umesh Ghanekar Department of

More information

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1 4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:

More information

APPLICATION OF A COMPUTATIONALLY EFFICIENT GEOSTATISTICAL APPROACH TO CHARACTERIZING VARIABLY SPACED WATER-TABLE DATA

APPLICATION OF A COMPUTATIONALLY EFFICIENT GEOSTATISTICAL APPROACH TO CHARACTERIZING VARIABLY SPACED WATER-TABLE DATA RFr"W/FZD JAN 2 4 1995 OST control # 1385 John J Q U ~ M Argonne Natonal Laboratory Argonne, L 60439 Tel: 708-252-5357, Fax: 708-252-3 611 APPLCATON OF A COMPUTATONALLY EFFCENT GEOSTATSTCAL APPROACH TO

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments Comparson of Heurstcs for Schedulng Independent Tasks on Heterogeneous Dstrbuted Envronments Hesam Izakan¹, Ath Abraham², Senor Member, IEEE, Václav Snášel³ ¹ Islamc Azad Unversty, Ramsar Branch, Ramsar,

More information

Machine Learning. K-means Algorithm

Machine Learning. K-means Algorithm Macne Learnng CS 6375 --- Sprng 2015 Gaussan Mture Model GMM pectaton Mamzaton M Acknowledgement: some sldes adopted from Crstoper Bsop Vncent Ng. 1 K-means Algortm Specal case of M Goal: represent a data

More information

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification Introducton to Artfcal Intellgence V22.0472-001 Fall 2009 Lecture 24: Nearest-Neghbors & Support Vector Machnes Rob Fergus Dept of Computer Scence, Courant Insttute, NYU Sldes from Danel Yeung, John DeNero

More information

APPLICATION OF PREDICTION-BASED PARTICLE FILTERS FOR TELEOPERATIONS OVER THE INTERNET

APPLICATION OF PREDICTION-BASED PARTICLE FILTERS FOR TELEOPERATIONS OVER THE INTERNET APPLICATION OF PREDICTION-BASED PARTICLE FILTERS FOR TELEOPERATIONS OVER THE INTERNET Jae-young Lee, Shahram Payandeh, and Ljljana Trajovć School of Engneerng Scence Smon Fraser Unversty 8888 Unversty

More information

Mixed Linear System Estimation and Identification

Mixed Linear System Estimation and Identification 48th IEEE Conference on Decson and Control, Shangha, Chna, December 2009 Mxed Lnear System Estmaton and Identfcaton A. Zymns S. Boyd D. Gornevsky Abstract We consder a mxed lnear system model, wth both

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Evolutionary Wavelet Neural Network for Large Scale Function Estimation in Optimization

Evolutionary Wavelet Neural Network for Large Scale Function Estimation in Optimization AIAA Paper AIAA-006-6955, th AIAA/ISSMO Multdscplnary Analyss and Optmzaton Conference, Portsmouth, VA, September 6-8, 006. Evolutonary Wavelet Neural Network for Large Scale Functon Estmaton n Optmzaton

More information

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT 3. - 5. 5., Brno, Czech Republc, EU APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT Abstract Josef TOŠENOVSKÝ ) Lenka MONSPORTOVÁ ) Flp TOŠENOVSKÝ

More information

Anonymisation of Public Use Data Sets

Anonymisation of Public Use Data Sets Anonymsaton of Publc Use Data Sets Methods for Reducng Dsclosure Rsk and the Analyss of Perturbed Data Harvey Goldsten Unversty of Brstol and Unversty College London and Natale Shlomo Unversty of Manchester

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

APPLICATION OF PREDICTION-BASED PARTICLE FILTERS FOR TELEOPERATIONS OVER THE INTERNET

APPLICATION OF PREDICTION-BASED PARTICLE FILTERS FOR TELEOPERATIONS OVER THE INTERNET APPLICATION OF PREDICTION-BASED PARTICLE FILTERS FOR TELEOPERATIONS OVER THE INTERNET Jae-young Lee, Shahram Payandeh, and Ljljana Trajovć School of Engneerng Scence Smon Fraser Unversty 8888 Unversty

More information

EXTENDED BIC CRITERION FOR MODEL SELECTION

EXTENDED BIC CRITERION FOR MODEL SELECTION IDIAP RESEARCH REPORT EXTEDED BIC CRITERIO FOR ODEL SELECTIO Itshak Lapdot Andrew orrs IDIAP-RR-0-4 Dalle olle Insttute for Perceptual Artfcal Intellgence P.O.Box 59 artgny Valas Swtzerland phone +4 7

More information

Adaptive Transfer Learning

Adaptive Transfer Learning Adaptve Transfer Learnng Bn Cao, Snno Jaln Pan, Yu Zhang, Dt-Yan Yeung, Qang Yang Hong Kong Unversty of Scence and Technology Clear Water Bay, Kowloon, Hong Kong {caobn,snnopan,zhangyu,dyyeung,qyang}@cse.ust.hk

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introducton to Bonformatcs Sequence Algnment Luke Huan Electrcal Engneerng and Computer Scence http://people.eecs.ku.edu/~huan/ HMM Π s a set of states Transton Probabltes a kl Pr( l 1 k Probablty

More information

Research on Categorization of Animation Effect Based on Data Mining

Research on Categorization of Animation Effect Based on Data Mining MATEC Web of Conferences 22, 0102 0 ( 2015) DOI: 10.1051/ matecconf/ 2015220102 0 C Owned by the authors, publshed by EDP Scences, 2015 Research on Categorzaton of Anmaton Effect Based on Data Mnng Na

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated.

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated. Some Advanced SP Tools 1. umulatve Sum ontrol (usum) hart For the data shown n Table 9-1, the x chart can be generated. However, the shft taken place at sample #21 s not apparent. 92 For ths set samples,

More information

Context-Specific Bayesian Clustering for Gene Expression Data

Context-Specific Bayesian Clustering for Gene Expression Data Context-Specfc Bayesan Clusterng for Gene Expresson Data Yoseph Barash School of Computer Scence & Engneerng Hebrew Unversty, Jerusalem, 91904, Israel hoan@cs.huj.ac.l Nr Fredman School of Computer Scence

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

Model Selection with Cross-Validations and Bootstraps Application to Time Series Prediction with RBFN Models

Model Selection with Cross-Validations and Bootstraps Application to Time Series Prediction with RBFN Models Model Selecton wth Cross-Valdatons and Bootstraps Applcaton to Tme Seres Predcton wth RBF Models Amaury Lendasse Vncent Wertz and Mchel Verleysen Unversté catholque de Louvan CESAME av. G. Lemaître 3 B-348

More information

Fusion Performance Model for Distributed Tracking and Classification

Fusion Performance Model for Distributed Tracking and Classification Fuson Performance Model for Dstrbuted rackng and Classfcaton K.C. Chang and Yng Song Dept. of SEOR, School of I&E George Mason Unversty FAIRFAX, VA kchang@gmu.edu Martn Lggns Verdan Systems Dvson, Inc.

More information

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(6):2860-2866 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A selectve ensemble classfcaton method on mcroarray

More information

REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES

REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES Laser dffracton s one of the most wdely used methods for partcle sze analyss of mcron and submcron sze powders and dspersons. It s quck and easy and provdes

More information

Some variations on the standard theoretical models for the h-index: A comparative analysis. C. Malesios 1

Some variations on the standard theoretical models for the h-index: A comparative analysis. C. Malesios 1 *********Ths s the fnal draft for the paper submtted to the Journal of the Assocaton for Informaton Scence and Technology. For the offcal verson please refer to DOI: 10.1002/as.23410********* Some varatons

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Overview. Basic Setup [9] Motivation and Tasks. Modularization 2008/2/20 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION

Overview. Basic Setup [9] Motivation and Tasks. Modularization 2008/2/20 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION Overvew 2 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION Introducton Mult- Smulator MASIM Theoretcal Work and Smulaton Results Concluson Jay Wagenpfel, Adran Trachte Motvaton and Tasks Basc Setup

More information