Comparison Between Linear and Non-parametric Regression Models for Genome-Enabled Prediction in Wheat

GENOMIC SELECTION

Comparison Between Linear and Non-parametric Regression Models for Genome-Enabled Prediction in Wheat

Paulino Pérez-Rodríguez,*,1 Daniel Gianola, Juan Manuel González-Camacho,* José Crossa, Yann Manès, and Susanne Dreisigacker
*Colegio de Postgraduados, Montecillo, Texcoco 56230, México; Departments of Animal Sciences, Dairy Science, and Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin 53706; and Biometrics and Statistics Unit and Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Mexico, D.F., México

ABSTRACT In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

KEYWORDS: GenPred; shared data resources

Copyright 2012 Pérez-Rodríguez et al.
Manuscript received July 9, 2012; accepted for publication October 5, 2012.
This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Supporting information is available online.
1 Corresponding author: Colegio de Postgraduados, Montecillo, Texcoco 56230, México. E-mail: perpdgo@gmail.com

Genome-enabled prediction of complex traits based on marker data is becoming important in plant and animal breeding, personalized medicine, and evolutionary biology (Meuwissen et al. 2001; Bernardo and Yu 2007; de los Campos et al. 2009, 2010; Crossa et al. 2010, 2011; Ober et al. 2012). In the standard, infinitesimal, pedigree-based model of quantitative genetics, the family structure of a population is reflected in some expected resemblance between relatives. The latter is measured as an expected covariance matrix among individuals and is used to predict genetic values (e.g., Crossa et al. 2006; Burgueño et al. 2007, 2011). Whereas pedigree-based models do not account for Mendelian segregation, and the expected covariance matrix is constructed using assumptions that do not hold (e.g., absence of selection and mutation, and random mating), marker-based models allow tracing Mendelian segregation at several positions of the genome and observing realized (as opposed to expected) covariances. This enhances the potential for improving the accuracy of estimates of genetic values, thus increasing the genetic progress attainable when these predictions are used for selection purposes in lieu of pedigree-based predictions. Recently, de los Campos et al. (2009, 2010) and Crossa et al.
(2010, 2011) used Bayesian estimates from genomic parametric and semi-parametric regressions, and they found that models that incorporate pedigree and markers simultaneously had better prediction accuracy for several traits in wheat and maize than models based only on pedigree or only on markers.

The standard linear genetic model represents the phenotypic response of the i-th individual (y_i) as the sum of a genetic value, g_i, and a model residual, e_i, such that the linear model for n individuals (i = 1, ..., n) is represented as y_i = g_i + e_i. However, building predictive models for complex traits using a large number of molecular markers (p) with a set of lines comprising n individuals, with p ≫ n, is challenging because individual marker effects are not likelihood-identified. In this case, marker effects can be estimated via penalized parametric or semi-parametric methods, or their Bayesian counterparts, rather than via ordinary least squares. This reduces

the mean-squared error of estimates; it also increases prediction accuracy of out-of-sample cases and prevents over-fitting (de los Campos et al. 2010). In addition to the well-known Bayes A and Bayes B linear regression models originally proposed by Meuwissen et al. (2001) for incorporating marker effects into g_i, there are several penalized parametric regression methods for estimating marker effects, such as ridge regression, the least absolute shrinkage and selection operator (LASSO), and the elastic net (Hastie et al. 2009). The Bayesian counterparts of these models have proved useful because appropriate priors can be assigned to the regularization parameter(s), and uncertainty in the estimates and predictions can be measured directly by applying the Bayesian paradigm.

Regression methods assume a linear relationship between phenotype and genotype, and they typically account for additive allelic effects only; however, evidence of epistatic effects on plant traits is vast and well documented (e.g., Holland 2001, 2008). In wheat, for instance, detailed analyses have revealed a complex circuitry of epistatic interactions in the regulation of heading time involving different vernalization genes, day-length sensitivity genes, and earliness per se genes, as well as the environment (Laurie et al. 1995; Cockram et al. 2007). Epistatic effects have also been found to be an important component of the genetic basis of plant height and bread-making quality traits (Zhang et al. 2008; Conti et al. 2011). It is becoming common to study gene × gene interactions by using a paradigm of networks that includes aggregating gene × gene interactions that exist even in the absence of main effects (McKinney and Pajewski 2012).

Interactions between alleles at two or more loci could theoretically be represented in a linear model via appropriate contrasts. However, this does not scale when the number of markers (p) is large, as the number of 2-locus, 3-locus, etc., interactions is mind-boggling. An alternative to the standard parametric modeling of complex interactions is provided by non-linear, semi-parametric methods, such as kernel-based models (e.g., Gianola et al. 2006; Gianola and van Kaam 2008) or artificial neural networks (NN) (Okut et al. 2011; Gianola et al. 2011), under the assumption that such procedures can capture signals from high-order interactions. The potential of these methods, however, depends on the kernel chosen and on the neural network architecture. In a recent study, Heslot et al. (2012) compared the predictive accuracy of several genome-enabled prediction models, including reproducing kernel Hilbert space (RKHS) regression and NN, using barley and wheat data; the authors found that the non-linear models gave a modest but consistent predictive superiority (as measured by correlations between predictions and realizations) over the linear models. In particular, the RKHS model had better predictive ability than the parametric regressions. The use of RKHS for predicting complex traits was first proposed by Gianola et al. (2006) and Gianola and van Kaam (2008). de los Campos et al. (2010) further developed the theoretical basis of RKHS with kernel averaging (simultaneous use of various kernels in the model) and showed its good prediction accuracy. Other empirical studies in plants have corroborated the increase in prediction accuracy of kernel methods (e.g., Crossa et al. 2010, 2011; de los Campos et al. 2010; Heslot et al. 2012). Recently, Long et al. (2010), using chicken data, and González-Camacho et al. (2012), using maize data, showed that NN methods provided prediction accuracy comparable to that obtained using the RKHS method.
In NN, the basis functions ("adaptive covariates") are inferred from the data, which gives the NN great potential and flexibility for capturing complex interactions between input variables (Hastie et al. 2009). In particular, Bayesian regularized neural networks (BRNN) and radial basis function neural networks (RBFNN) have features that make them attractive for use in genomic selection (GS).

In this study, we examined the predictive ability of various linear and non-linear models, including the Bayes A and Bayes B linear regression models of Meuwissen et al. (2001); the Bayesian LASSO, as in Park and Casella (2008) and de los Campos et al. (2009); RKHS, using the kernel averaging strategy proposed by de los Campos et al. (2010); the RBFNN, proposed and used by González-Camacho et al. (2012); and the BRNN, as described by Neal (1996) and used in the context of GS by Gianola et al. (2011). The predictive ability of these models was compared using a cross-validation scheme applied to a wheat data set from CIMMYT's Global Wheat Program.

MATERIALS AND METHODS

Experimental data
The data set included 306 elite wheat lines: 263 lines that are candidates for the 29th Semi-Arid Wheat Screening Nursery (SAWSN) and 43 lines from the 18th Semi-Arid Wheat Yield Trial (SAWYT) of CIMMYT's Global Wheat Program. These lines were genotyped with 1717 diversity array technology (DArT) markers generated by Triticarte Pty. Ltd. (Canberra, Australia). Two traits were analyzed: grain yield (GY) and days to heading (DTH) (see Supporting Information, File S1). The traits were measured in a total of 12 different environments (1-12) (Table 1): GY in environments 1-7 and DTH in environments 1-5 and 8-12 (10 in all). Different agronomic practices were used. Yield trials were planted in 2009 and 2010 using prepared beds and flat plots under controlled drought or irrigated conditions. Yield data from experiments in 2010 were replicated, whereas data from trials in 2009 were adjusted means from an alpha-lattice incomplete block design with adjustment for spatial variability in the direction of rows and columns, using an autoregressive model fitted in both directions. Data used to train the models for GY and DTH in 2009 were the best linear unbiased estimates (BLUE) after spatial analysis, whereas the BLUE data for 2010 were obtained after performing analyses in each of the 12 environments and combining them. The experimental designs in each location consisted of alpha-lattice incomplete block designs of different sizes, with two replicates each.

Broad-sense heritability at individual environments was calculated as h² = σ²_g / (σ²_g + σ²_e / nreps), where σ²_g and σ²_e are the genotype and error variance components, respectively, and nreps is the number of replicates. For the combined analyses across environments, broad-sense heritability was calculated as h² = σ²_g / (σ²_g + σ²_ge / nenv + σ²_e / (nenv × nreps)), where σ²_ge is the genotype × environment interaction variance component and nenv is the number of environments included in the analysis.

Statistical models
One method for incorporating markers is to define g_i as a parametric linear regression on marker covariates x_ij of the form g_i = Σ_{j=1}^p x_ij β_j, such that y_i = Σ_{j=1}^p x_ij β_j + e_i (j = 1, 2, ..., p markers); here, β_j is the partial regression of y_i on the j-th marker covariate (Meuwissen et al. 2001). Extending the model to allow for an intercept gives

  y_i = μ + Σ_{j=1}^p x_ij β_j + e_i   (1)

We adopted Gaussian assumptions for model residuals; specifically, the joint distribution of model residuals in Equation 1 was assumed normal with mean zero and variance σ²_e.
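To make Equation 1 concrete, the sketch below (not taken from the paper; all data and the penalty value are simulated or assumed) shows how marker effects could be estimated with a ridge penalty when p ≫ n, the penalized alternative to ordinary least squares mentioned above. In BRR the penalty plays the role of the variance ratio σ²_e/σ²_β.

# Illustrative R sketch (hypothetical data): ridge-penalized estimation of
# marker effects for Equation 1 when p >> n.
set.seed(1)
n <- 100; p <- 1000                      # lines and markers (assumed sizes)
X <- matrix(rbinom(n * p, 1, 0.5), n, p) # toy 0/1 marker incidence matrix
b <- rnorm(p, 0, 0.05)                   # simulated marker effects
y <- 10 + X %*% b + rnorm(n)             # phenotypes: mu + sum_j x_ij b_j + e_i

lambda <- 50                             # ridge penalty (assumed value)
Xc <- scale(X, center = TRUE, scale = FALSE)
yc <- y - mean(y)
# Closed-form ridge solution: (X'X + lambda * I)^{-1} X'y
b_hat <- solve(crossprod(Xc) + diag(lambda, p), crossprod(Xc, yc))
y_hat <- mean(y) + Xc %*% b_hat
cor(y, y_hat)                            # in-sample fit, not predictive accuracy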

Table 1 Twelve environments representing combinations of diverse agronomic management (drought or full irrigation; sowing in standard, bed, or flat systems), sites in Mexico, and years for two traits, grain yield (GY) and days to heading (DTH), with their broad-sense heritability (h²) measured in 2010.

Env  Agronomic management         Site in Mexico  Year  Trait measured
1    Drought-bed                  Cd. Obregon     2009  GY, DTH
2    Drought-bed                  Cd. Obregon     2010  GY, DTH
3    Drought-flat                 Cd. Obregon     2010  GY, DTH
4    Full irrigation-bed          Cd. Obregon     2009  GY, DTH
5    Full irrigation-bed          Cd. Obregon     2010  GY, DTH
6    Heat-bed                     Cd. Obregon     2010  GY
7    Full irrigation-flat melga   Cd. Obregon     2010  GY
8    Standard                     Toluca          2009  DTH
9    Standard                     El Batan        2009  DTH
10   Small observation plot       Cd. Obregon     2009  DTH
11   Small observation plot       Cd. Obregon     2010  DTH
12   Standard                     Agua Fria       2010  DTH

The likelihood function is

  p(y | μ, β, σ²_e) = Π_{i=1}^n N(y_i | μ + Σ_{j=1}^p x_ij β_j, σ²_e)   (2)

where N(y_i | μ + Σ_j x_ij β_j, σ²_e) is a normal density for random variable y_i centered at μ + Σ_j x_ij β_j and with variance σ²_e. Depending on how priors on the marker effects are assigned, different Bayesian linear regression models result.

Linear models: Bayesian ridge regression, Bayesian LASSO, Bayes A, and Bayes B
A standard penalized regression method is ridge regression (Hoerl and Kennard 1970); its Bayesian counterpart, Bayesian ridge regression (BRR), uses a prior density of marker effects that is Gaussian, centered at zero, and with variance common to all markers, that is, p(β_j | σ²_β) = N(β_j | 0, σ²_β), where σ²_β is the prior variance of marker effects. Marker effects are assumed independent and identically distributed a priori. We assigned scaled inverse chi-square distributions χ⁻²(df, S) to the variance parameters σ²_e and σ²_β. The prior degrees of freedom and scale parameters were set to df = 4 and S = 1. It can be shown that the posterior mean of marker effects is the best linear unbiased predictor (BLUP) of marker effects, so Bayesian ridge regression is often referred to as RR-BLUP (de los Campos et al. 2012).

The Bayesian LASSO, Bayes A, and Bayes B relax the assumption of a common prior variance for all marker effects. The relationship among these three models is as follows: Bayes B can be considered the most general of the three, in the sense that Bayes A and Bayesian ridge regression can be viewed as special cases of Bayes B. This is because Bayes A is obtained from Bayes B by setting π = 0 (the proportion of markers with null effects), and Bayesian ridge regression is obtained from Bayes B by setting π = 0 and assuming that all markers have the same variance. Bayes B uses a mixture distribution with a mass at zero, such that the (conditional) prior distribution of marker effects is

  β_j | σ²_j, π = 0 with probability π, and β_j | σ²_j, π ~ N(0, σ²_j) with probability 1 − π   (3)

The prior assigned to σ²_j (j = 1, ..., p) is the same for all markers, i.e., a scaled inverted chi-square distribution χ⁻²(df_β, S_β), where df_β are the degrees of freedom and S_β is a scaling parameter. Bayes B becomes Bayes A by setting π = 0. In the case of Bayes B, we took π = 0.95, df_β = 4, and S_β = σ̃²_a (df_β − 2)/df_β, with σ̃²_a = σ̃²_S / [(1 − π) Σ_{j=1}^p 2 q_j (1 − q_j)], where q_j is the allele frequency of marker j and σ̃²_S is the additive genetic variance explained by the markers [see Habier et al. (2011) and Resende et al. (2012) for more details]. In the case of σ²_e, we assigned a flat prior, as in Wang et al. (1994).
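As a hedged illustration of these Bayes B settings, the following R sketch computes the prior scale S_β from allele frequencies and an assumed value of the additive variance explained by markers; the function name, the allele frequencies, and σ̃²_S = 1 are hypothetical, not values from the paper.

# Illustrative sketch: Bayes B prior scale S_beta = sigma2_a * (df_beta - 2) / df_beta,
# with sigma2_a = sigma2_S / ((1 - pi) * sum_j 2 * q_j * (1 - q_j)).
bayesB_scale <- function(q, sigma2_S, pi = 0.95, df_beta = 4) {
  sigma2_a <- sigma2_S / ((1 - pi) * sum(2 * q * (1 - q)))  # per-marker variance
  sigma2_a * (df_beta - 2) / df_beta                        # scale of the chi^-2 prior
}

set.seed(1)
q <- runif(1717, 0.05, 0.95)   # hypothetical allele frequencies for 1717 markers
bayesB_scale(q, sigma2_S = 1)  # assumed marker-explained additive variance of 1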
The Bayesian LASSO assigns a double exponential (DE) distribution to all marker effects (conditionally on a regularization parameter λ), centered at zero and with marker-specific variance, that is, p(β_j | λ, σ_e) = DE(β_j | 0, λ). The DE distribution is not conjugate with the Gaussian likelihood, but it can be represented as a mixture of scaled normal densities, which allows easy implementation of the model (Park and Casella 2008; de los Campos et al. 2009). The priors used were exactly the same as those used in González-Camacho et al. (2012).

The models used in this study (Bayesian ridge regression, the Bayesian LASSO (BL), Bayes A, and Bayes B) are explained in detail in several articles; for example, Bayes A and Bayes B are described in Meuwissen et al. (2001), Habier et al. (2011), and Resende et al. (2012), and an account of the BL is given in de los Campos et al. (2009, 2012), Crossa et al. (2010, 2011), Perez et al. (2010), and González-Camacho et al. (2012).

Non-linear models: RBFNN, BRNN, and RKHS
In this section, we describe the basic structure of the non-linear single hidden layer feed-forward neural network (SLNN) and two of its variants, the radial basis function neural network and the Bayesian regularized neural network. We also give a brief explanation of RKHS with the kernel averaging method at the end of this section.

Single hidden layer feed-forward neural network: In a single-layer feed-forward neural network (SLNN), the non-linear activation functions in the hidden layer give the NN universal approximation ability, and hence great potential and flexibility for capturing complex patterns. The structure of the SLNN is depicted in Figure 1, which illustrates the method for a continuous phenotypic response. This NN can be thought of as a two-step regression (e.g., Hastie et al. 2009). In the first step, in the non-linear hidden layer, S data-derived basis functions {z_i^[k]} (k = 1, 2, ..., S neurons) are inferred; in the second step, in the linear output layer, the response is regressed on the basis functions inferred in the hidden layer.

Figure 1 Structure of a single-layer feedforward neural network (SLNN), adapted from González-Camacho et al. (2012). In the hidden layer, input variables x_i = (x_i1, ..., x_ip) (j = 1, ..., p markers) are combined for each neuron (k = 1, ..., S neurons) using a linear function, u_i^[k] = b_k + Σ_j x_ij β_j^[k], and subsequently transformed using a non-linear activation function, yielding a set of inferred scores, z_i^[k] = g_k(u_i^[k]). These scores are used in the output layer as basis functions to regress the response on the data-derived predictors: y_i = μ + Σ_{k=1}^S w_k z_i^[k] + e_i.

The inner product between the input vector and the weight vector β^[k] of each neuron of the hidden layer, plus a bias (intercept b_k), is computed, that is, u_i^[k] = b_k + Σ_{j=1}^p x_ij β_j^[k] (j = 1, ..., p markers); this is then transformed using a non-linear activation function g_k(u_i^[k]). One obtains z_i^[k] = g_k(b_k + Σ_j x_ij β_j^[k]), where b_k is an intercept and (β_1^[1], ..., β_p^[1]; ...; β_1^[S], ..., β_p^[S])' is a vector of regression coefficients or weights of each neuron k in the hidden layer. The g_k(·) is the activation function, which maps the inputs into the closed interval [−1, 1]; for example, g_k(x) = [exp(2x) − 1]/[exp(2x) + 1] is known as the hyperbolic tangent function. Finally, in the linear output layer, phenotypes are regressed on the data-derived features {z_i^[k]} according to

  y_i = μ + Σ_{k=1}^S w_k z_i^[k] + e_i = μ + Σ_{k=1}^S w_k g_k(b_k + Σ_{j=1}^p x_ij β_j^[k]) + e_i   (4)

Radial basis function neural network: The RBFNN was first proposed by Broomhead and Lowe (1988) and Poggio and Girosi (1990). Figure 2 shows the architecture of a single hidden layer RBFNN with S non-linear neurons. Each non-linear neuron in the hidden layer has a Gaussian radial basis function (RBF) defined as z_i^[k] = exp(−h_k ||x_i − c_k||²), where ||x_i − c_k|| is the Euclidean norm between the input vector x_i and the center vector c_k, and h_k is the bandwidth of the Gaussian RBF. Subsequently, in the linear output layer, phenotypes are regressed on the data-derived features {z_i^[k]} according to y_i = μ + Σ_{k=1}^S w_k z_i^[k] + e_i, where e_i is a model residual.

Estimating the parameters of the RBFNN: The vector of weights w = (w_1, ..., w_S) of the linear output layer is obtained by the ordinary least-squares fit that minimizes the mean squared differences between the fitted values ŷ_i (from the RBFNN) and the observed responses y_i in the training set, provided that the Gaussian RBF centers c_k and bandwidths h_k of the hidden layer are defined. The centers are selected using an orthogonal least-squares learning algorithm, as described by Chen et al. (1991) and implemented in Matlab 2010b. The centers are added iteratively such that each newly selected center is orthogonal to the others. The selected centers maximize the decrease in the mean-squared error of the RBFNN, and the algorithm stops when the number of centers (neurons) added to the RBFNN attains a desired precision (goal error) or when the number of centers equals the number of input vectors, that is, when S = n. The bandwidth h_k of each Gaussian RBF of the hidden layer is defined in terms of a design parameter of the net, the spread, with h_k inversely proportional to the spread. To select the best RBFNN, a grid for training the net was generated, containing different values of the spread and different precision values (goal error). The initial value of the spread was the median of the Euclidean distances between each pair of input vectors (x_i), and an initial value of 0.02 for the goal error was considered.
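The following R sketch illustrates the two RBFNN steps just described: Gaussian RBF scores in the hidden layer, then ordinary least squares in the output layer. It is only a sketch under stated assumptions: centers are taken as a random subset of lines rather than by the orthogonal least-squares algorithm, and the bandwidth is an assumed stand-in for the spread-based choice used in the Matlab implementation.

# Illustrative sketch (not the paper's Matlab code): Gaussian RBF hidden layer
# z[i,k] = exp(-h * ||x_i - c_k||^2) followed by an OLS fit of the output layer.
set.seed(2)
n <- 150; p <- 200
X <- matrix(rbinom(n * p, 1, 0.5), n, p)      # toy DArT-like 0/1 markers
y <- rnorm(n)                                  # toy phenotypes

S <- 20                                        # number of neurons (assumed)
centers <- X[sample(n, S), , drop = FALSE]     # naive center selection
h <- 1 / median(as.vector(dist(X)))^2          # assumed bandwidth choice

# Hidden layer: n x S matrix of squared distances, then RBF scores
D2 <- outer(rowSums(X^2), rowSums(centers^2), "+") - 2 * X %*% t(centers)
Z <- exp(-h * D2)

# Linear output layer: y = mu + Z w + e, fitted by ordinary least squares
fit <- lm(y ~ Z)
head(fitted(fit))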
The spread parameter adjusts the shape of the Gaussian RBF: it should be large enough that neurons respond to overlapping regions of the input space, but not so large that all Gaussian RBFs give essentially the same response.

Figure 2 Structure of a radial basis function neural network, adapted from González-Camacho et al. (2012). In the hidden layer, information from the input variables (x_i1, ..., x_ip) (j = 1, ..., p markers) is first summarized by means of the Euclidean distance between each input vector x_i and S data-inferred centers c_k (k = 1, ..., S neurons), that is, u_i^[k] = h_k ||x_i − c_k||². These distances are then transformed using the Gaussian function z_i^[k] = exp(−u_i^[k]). These scores are used in the output layer as basis functions for the linear regression y_i = μ + Σ_{k=1}^S w_k z_i^[k] + e_i.

Bayesian regularized neural networks: The difference between the SLNN and the BRNN lies in the function to be minimized (see the penalized function below); therefore, the basic structure of a BRNN can also be represented by Figure 1. The SLNN described above is flexible enough to approximate any non-linear function; this great flexibility allows the NN to capture complex interactions among predictor

variables (Hastie et al. 2009). However, this flexibility also leads to two important issues: (1) as the number of neurons increases, the number of parameters to be estimated also increases; and (2) as the number of parameters rises, the risk of over-fitting also increases. It is common practice to use penalized methods, via Bayesian approaches, to prevent or palliate over-fitting. MacKay (1992, 1994) developed a framework for obtaining estimates of all the parameters in a feed-forward single-layer neural network by using an empirical Bayes approach. Let u = (w_1, ..., w_S; b_1, ..., b_S; β_1^[1], ..., β_p^[1]; ...; β_1^[S], ..., β_p^[S]; μ)' be the vector containing all the weights, biases, and connection strengths. The author showed that the estimation problem can be solved in two steps, followed by iteration:

(1) Obtain the conditional posterior modes of the elements of u, assuming that the variance components σ²_e and σ²_u are known and that the prior distribution for all the elements of u is p(u | σ²_u) = MN(0, σ²_u I). It is important to note that this approach assigns the same prior to all elements of u, even though this may not always be the best choice. The density of the conditional (given the variance parameters) posterior distribution of the elements of u, according to Bayes' theorem, is

  p(u | y, σ²_e, σ²_u) = p(y | u, σ²_e) p(u | σ²_u) / p(y | σ²_e, σ²_u)   (5)

The conditional modes can be obtained by maximizing Equation 5 over u. However, the problem is equivalent to minimizing the following penalized sum of squares [see Gianola et al. (2011) for more details]:

  F(u) = β Σ_{i=1}^n e_i² + α Σ_{j=1}^m u_j²

where β = 1/(2σ²_e), α = 1/(2σ²_u), e_i is the difference between observed and predicted phenotypes for the fitted model, and u_j (j = 1, ..., m) is the j-th element of the vector u.

(2) Update σ²_e and σ²_u. The updating formulas are obtained by maximizing an approximation to the marginal likelihood of the data, p(y | σ²_e, σ²_u) (the "evidence"), given by the denominator of Equation 5.

(3) Iterate between (1) and (2) until convergence.

The original algorithm developed by MacKay was further improved by Foresee and Hagan (1997) and adopted by Gianola et al. (2011) in the context of genome- and pedigree-enabled prediction. The algorithm is equivalent to estimation via maximum penalized likelihood when weight decay is used, but it has the advantage of providing a way of setting the extent of weight decay through the variance component σ²_u. Neal (1996) pointed out that the procedure of MacKay (1992, 1994) can be further generalized. For example, there is no need to approximate probabilities via Gaussian assumptions; furthermore, it is possible to estimate the entire posterior distributions of all the elements of u, not only their (conditional) posterior modes. Next, we briefly review Neal's approach to solving the problem; a comprehensive review can be found in Lampinen and Vehtari (2001).
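As a brief aside before the priors, step (1) above amounts to minimizing F(u); the R sketch below writes that objective for a tiny SLNN with a tanh activation. All sizes and the values of α and β are hypothetical, and a generic optimizer stands in for the Gauss-Newton approximation of Foresee and Hagan (1997); this is not the FBM software.

# Illustrative sketch only: the penalized objective F(u) = beta*sum(e^2) + alpha*sum(u^2)
# from step (1), for a tiny SLNN with tanh activation and assumed alpha, beta.
set.seed(3)
n <- 50; p <- 30; S <- 3
X <- matrix(rbinom(n * p, 1, 0.5), n, p)
y <- rnorm(n)

n_par <- S + S + S * p + 1                   # output weights, hidden biases, strengths, mu
F_obj <- function(u, alpha = 0.1, beta = 1) {
  w  <- u[1:S]
  b  <- u[(S + 1):(2 * S)]
  B  <- matrix(u[(2 * S + 1):(2 * S + S * p)], p, S)  # p x S connection strengths
  mu <- u[n_par]
  Z  <- tanh(sweep(X %*% B, 2, b, "+"))      # hidden-layer scores z[i,k]
  e  <- y - (mu + Z %*% w)                   # residuals of Equation 4
  beta * sum(e^2) + alpha * sum(u^2)         # penalized sum of squares F(u)
}

fit <- optim(rnorm(n_par, sd = 0.1), F_obj, method = "BFGS")  # generic minimizer
fit$value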
Prior distributions:
a) Variance component of the residuals: Neal (1996) used a conjugate inverse Gamma distribution as a prior for the variance associated with the residual e_i in Equation 4, that is, σ²_e ~ Inv-Gamma(s_e, df_e), where s_e and df_e are the scale and degrees-of-freedom parameters, respectively. These parameters can be set to the default values given by Neal (1996), s_e = 0.05 and df_e = 0.5. These values were also used by Lampinen and Vehtari (2001).
b) Connection strengths, weights, and biases: Neal (1996) suggested dividing the network parameters in u into groups and then using hierarchical models for each group of parameters; for example, the connection strengths (β_1^[1], ..., β_p^[1]; ...; β_1^[S], ..., β_p^[S]) and biases (b_1, ..., b_S) of the hidden layer, and the output weights (w_1, ..., w_S) and general mean or bias (μ) of the linear output layer. Suppose that u_1, ..., u_k are the parameters of a given group; then assume

  p(u_1, ..., u_k | σ²_u) = (2π)^(−k/2) σ_u^(−k) exp[−(1/(2σ²_u)) Σ_{l=1}^k u_l²]

and, at the last stage of the model, assign the prior σ²_u ~ Inv-Gamma(s_u, df_u). The scale parameter of the distribution associated with the group of parameters containing the connection strengths (β_1^[1], ..., β_p^[1]; ...; β_1^[S], ..., β_p^[S]) changes according to the number of inputs; in this case, s_u = (0.05 / p^(1/df_u))², with df_u = 0.5, where p is the number of markers in the data set. By using Markov chain Monte Carlo (MCMC) techniques, through an algorithm called hybrid Monte Carlo, Neal (1996) developed software termed flexible Bayesian modeling (FBM), capable of obtaining samples from the posterior distributions of all unknowns in a neural network (as in Figure 1).

Reproducing kernel Hilbert spaces regression: RKHS models have been suggested as an alternative to multiple linear regression for capturing complex interaction patterns that may be difficult to account for in a linear model framework (Gianola et al. 2006). In the RKHS model, the regression function takes the form

  f(x_i) = μ + Σ_{i'=1}^n α_{i'} K(x_i, x_{i'})   (6)

where x_i = (x_i1, ..., x_ip)' and x_{i'} = (x_{i'1}, ..., x_{i'p})' are input vectors of marker genotypes of individuals i and i', the α_{i'} are regression coefficients, and K(x_i, x_{i'}) = exp(−h ||x_i − x_{i'}||²) is the reproducing kernel, defined here with a Gaussian RBF, where h is a bandwidth parameter and ||x_i − x_{i'}|| is the Euclidean norm between each pair of input vectors. The strategy termed kernel averaging, for selecting optimal values of h within a set of candidate values, was implemented using the Bayesian approach described in de los Campos et al. (2010). Similarities and connections between the RKHS and the RBFNN are given in González-Camacho et al. (2012).

Assessment of the models' predictive ability
The predictive ability of the models given above was compared using Pearson's correlation and the predictive mean-squared error (PMSE) between predicted and realized values. A total of 50 random partitions were generated for each of the data sets; each partition randomly assigned 90% of the lines to the training set and the remaining 10% to the validation set. The partition scheme was similar to that in Gianola et al. (2011) and González-Camacho et al. (2012). All scripts were run on a Linux workstation; for Bayesian ridge regression and the Bayesian LASSO, we used the R package BLR (de los Campos and Perez 2010), whereas for RKHS, we used the R implementation described in de los Campos et al. (2010), which was kindly provided by the authors. In the case of Bayes A and Bayes B, we used a freely available program described by Hickey and Tier (2009). For the BRNN, we used the FBM software available at toronto.edu/~radford/fbm.software.html. Because the computational time required to evaluate the predictive ability of the BRNN was great, we used the Condor high-throughput computing system at the University of Wisconsin-Madison. The RBFNN model was run using Matlab 2010b for Linux.

The differences in computing times between the models were large. The computing times for evaluating the prediction ability of the 50 partitions for each trait were as follows: 10 min for RBFNN, 1.5 hr for RKHS, 3 hr for BRR, 3.5 hr for BL, 4.5 hr for Bayes B, 5.5 hr for Bayes A, and 30 days for BRNN. In the case of RKHS, BRR, BL, Bayes A, and Bayes B, inferences were based on 35,000 MCMC samples, and on 10,000 samples for BRNN. The estimated computing times were obtained using, as reference, a single Intel Xeon CPU and 8 GB of RAM. A significant reduction in computing time was achieved by parallelizing the tasks.
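A minimal sketch of this assessment scheme is given below, assuming simulated data: 50 random 90/10 partitions scored by Pearson's correlation and PMSE, with a Gaussian-kernel ridge regression (single assumed bandwidth and ridge parameter) standing in for the Bayesian RKHS with kernel averaging. Averaging the two criteria over partitions is what produces summaries of the kind reported in Tables 2 and 3.

# Illustrative sketch: 50 random 90/10 partitions scored by correlation and PMSE.
# A Gaussian-kernel ridge regression stands in for the Bayesian RKHS fit; data,
# bandwidth, and ridge parameter are hypothetical.
set.seed(4)
n <- 306; p <- 1717
X <- matrix(rbinom(n * p, 1, 0.5), n, p)      # toy stand-in for the DArT markers
y <- rnorm(n)                                 # toy stand-in for a trait

D2 <- as.matrix(dist(X))^2                    # squared Euclidean distances
h  <- 1 / median(D2[upper.tri(D2)])           # assumed bandwidth
K  <- exp(-h * D2)                            # Gaussian reproducing kernel
lambda <- 1                                   # assumed ridge parameter

res <- t(sapply(1:50, function(r) {
  tst <- sample(n, round(0.1 * n))            # 10% validation lines
  trn <- setdiff(1:n, tst)
  # Kernel ridge fit on training lines: alpha = (K_trn + lambda*I)^{-1}(y - mean)
  alpha <- solve(K[trn, trn] + diag(lambda, length(trn)), y[trn] - mean(y[trn]))
  yhat  <- mean(y[trn]) + K[tst, trn] %*% alpha
  c(cor = cor(y[tst], yhat), pmse = mean((y[tst] - yhat)^2))
}))
colMeans(res); apply(res, 2, sd)              # average correlation and PMSE, with SD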
RESULTS

Data from the replicated experiments in 2010 were used to calculate the broad-sense heritability of each trait in each environment (Table 1). Broad-sense heritability across locations for the 2010 data was 0.67 for GY and 0.92 for DTH. These high estimates can be explained, at least in part, by the strict environmental control of trials conducted at CIMMYT's experiment station at Ciudad Obregon. The heritability of the two traits for 2009 was not estimated because the only available phenotypic data were adjusted means for each environment.

Predictive assessment of the models
The predictive ability of the different models for GY and DTH varied among the 12 environments. The model deemed best using correlations (Table 2) tended to be the one with the smallest average PMSE (Table 3). The three non-parametric models had higher predictive correlations and smaller PMSE than the linear models for both GY and DTH. Within the linear models, the results were mixed, and all models gave similar predictions. Within the non-parametric models, RBFNN and RKHS always gave higher correlations between predicted values and realized phenotypes, and a smaller average PMSE, than the BRNN.

The means of the correlations and the associated standard errors can be used to test for statistically significant improvements in the predictive ability of the non-linear models vs. the linear models. The t-test (with α = 0.05) showed that RKHS gave significant improvements in prediction in 13/19 cases (Table 3) compared with the BL, whereas RBFNN was significantly better than the BL in 10/19 cases. Similar results were obtained when comparing RKHS and RBFNN with Bayes A and Bayes B. Correlations between observed and predicted values for DTH were lowest overall in environments 4 and 8, in Cd. Obregon, 2009, and in Toluca, 2009. Average PMSE was in agreement with the findings based on correlations. Although accuracies in environment 4 were much lower than in other environments, the higher accuracy of the non-parametric models (RKHS, RBFNN, and BRNN) over that of the linear models (BL, BRR, Bayes A, and Bayes B) was consistent with what was observed in the other environments.

Figures 3 and 4 give scatter plots of the correlations obtained with the three non-parametric models vs. the BL for DTH and GY, respectively; each circle represents the estimated correlations for the two models included in the plot. In Figure 3, A-C, DTH had a total of 500 points (10 environments and 50 random training-testing partitions). In Figure 4, A-C, GY had a total of 350 points (7 environments and 50 random partitions in each environment). A point above the 45-degree line represents an analysis where the method whose predictive correlation is given on the vertical axis (RKHS, RBFNN, BRNN) outperformed the one whose correlation is given on the horizontal axis (BL). Both figures show that, although there is a great deal of variability due to partition, for both DTH and GY the overall superiority of RKHS and RBFNN over the linear model BL is clear. For both traits, BL had slightly better prediction accuracy than the BRNN in terms of the number of individual correlation points. It is interesting to note that some cross-validation partitions picked subsets of training data that had negative, zero, or very low correlations with the observed values in

the validation set. These results indicate that lines in the training set are not necessarily related to those in the validation set.

Table 2 Average correlation (SE in parentheses) between observed and predicted values for grain yield (GY) and days to heading (DTH) in 12 environments for seven models. Fitted models were the Bayesian LASSO (BL), RR-BLUP (BRR), Bayes A, Bayes B, reproducing kernel Hilbert spaces regression (RKHS), radial basis function neural networks (RBFNN), and Bayesian regularized neural networks (BRNN), across 50 random partitions of the data with 90% in the training set and 10% in the validation set. The models with the highest correlations are underlined.

DISCUSSION AND CONCLUSIONS

Understanding the impact of epistasis on quantitative traits remains a major challenge. In wheat, several studies have reported significant epistasis for grain yield and heading or flowering time (Goldringer et al. 1997). Detailed analyses have shown that vernalization, day-length sensitivity, and earliness per se genes are mainly responsible for regulating heading time. The vernalization requirement relates to the sensitivity of the plant to cold temperatures, which causes it to accelerate spike primordial formation. Transgenic and mutant analyses, for example, have suggested a pathway involving epistatic interactions that combines environment-induced suppression and upregulation of several genes, leading to the final floral transition (Shimada et al. 2009).
There is evidence that the aggregation of multiple gene × gene interactions (epistasis) with small effects into small epistatic networks is important for explaining the heritability of complex traits in genome-wide association studies (McKinney and Pajewski 2012).

Table 3 Predictive mean-squared error (PMSE) between observed and predicted values for grain yield (GY) and days to heading (DTH) in 12 environments for seven models. Fitted models were the Bayesian LASSO (BL), RR-BLUP (BRR), Bayes A, Bayes B, reproducing kernel Hilbert space regression (RKHS), radial basis function neural networks (RBFNN), and Bayesian regularized neural networks (BRNN), across 50 random partitions of the data with 90% in the training set and 10% in the validation set. The models with the lowest PMSE are underlined.

Figure 3 Plots of the predictive correlations for each of 50 cross-validation partitions and 10 environments for days to heading (DTH) for different combinations of models. (A) When the best non-parametric model is RKHS, this is represented by an open circle; when the best linear model is BL, this is represented by a filled circle. (B) When the best non-parametric model is RBFNN, this is represented by an open circle; when the best linear model is BL, this is represented by a filled circle. (C) When the best non-parametric model is BRNN, this is represented by an open circle; when the best linear model is BL, this is represented by a filled circle. The histograms depict the distribution of the correlations in the testing set obtained from the 50 partitions for the different models. The horizontal (vertical) dashed line represents the average of the correlations for the testing set in the 50 partitions for the model shown on the Y (X) axis. The solid line represents Y = X, i.e., both models have the same prediction ability.

Epistatic networks and gene × gene interactions can also be exploited for GS via suitable statistical-genetic models that incorporate network complexities. Evidence from this study, as well as from other research involving other plant and animal species, suggests that models that are non-linear in input variables (e.g., SNPs) predict outcomes in testing sets better than standard linear regression models for genome-enabled prediction. However, it should be pointed out that better predictive ability can have several causes, one of them being the ability of some non-linear models to capture epistatic effects. Furthermore, the random cross-validation scheme used in this study was not designed to

specifically assess epistasis but rather to compare the models' predictive ability.

Figure 4 Plots of the correlations for each of 50 cross-validation partitions and seven environments for grain yield (GY) for different combinations of models. (A) When the best model is RKHS, this is represented by an open circle; when the best model is BL, this is represented by a filled circle. (B) When the best model is RBFNN, this is represented by an open circle; when the best model is BL, this is represented by a filled circle. (C) When the best model is BRNN, this is represented by an open circle; when the best model is BL, this is represented by a filled circle. The histograms depict the distribution of the correlations in the testing set obtained from the 50 partitions for the different models. The horizontal (vertical) dashed line represents the average of the correlations for the testing set in the 50 partitions for the model shown on the Y (X) axis. The solid line represents Y = X, i.e., both models have the same prediction ability.

It is interesting to compare results from different predictive machineries when applied to either maize or wheat. Differences in the prediction accuracy of non-parametric and linear models (at least for the data sets included in this and other studies) seem to be more pronounced in wheat than in maize. Although differences depend, among other factors, on the trait-environment combination and the number of markers, it is clear from González-Camacho et al. (2012) that, for flowering traits (highly additive) and traits such as grain yield (additive and epistatic) in maize, the BL model performed very similarly to the RKHS and RBFNN. On the other hand, in the present study, which involves wheat, the RKHS, RBFNN, and BRNN models clearly had markedly better predictive accuracy than BL, BRR, Bayes A, or Bayes B. This may be due to the fact that, in wheat, additive × additive epistasis plays an important role in grain yield, as found by

Crossa et al. (2006) and Burgueño et al. (2007, 2011) when assessing additive, additive × additive, additive × environment, and additive × additive × environment interactions using a pedigree-based model with the relationship matrix A. As pointed out first by Gianola et al. (2006) and subsequently by Long et al. (2010), non-parametric models do not impose strong assumptions on the phenotype-genotype relationship, and they have the potential of capturing interactions among loci. Our results with real wheat data sets agreed with previous findings in animal and plant breeding and with simulated experiments, in that a non-parametric treatment of markers may account for epistatic effects that are not captured by linear additive regression models. Using extensive maize data sets, González-Camacho et al. (2012) found that RBFNN and RKHS had some similarities and seemed to be useful for predicting quantitative traits with different complex underlying gene action under varying types of interaction in different environmental conditions. These authors suggested that it is possible to make further improvements in the accuracy of the RKHS and RBFNN models by introducing differential weights on SNPs, as shown by Long et al. (2010) for RBFs.

The training population used here was not developed specifically for this study; it was made up of a set of elite lines from the CIMMYT rain-fed spring wheat breeding program. Our results show that it is possible to achieve good predictions of line performance by combining phenotypic and genotypic data generated on elite lines. As genotyping costs decrease, breeding programs could use genome-enabled prediction models to predict the values of new breeding lines generated from crosses between elite lines in the training set before they reach the yield-testing stage. Lines with the highest estimated breeding values could be intercrossed before being phenotyped. Such a rapid cycling scheme would accelerate the fixation rate of favorable alleles in elite materials and should increase the genetic gain per unit of time, as described by Heffner et al. (2009).

It is important to point out that proof-of-concept experiments are required before genome-enabled selection can be implemented successfully in plant breeding programs. It is necessary to test genomic predictions on breeding materials derived from crosses between lines of the training population. If predictions are reliable enough, an experiment using the same set of parental materials could be carried out to compare the field performance of lines coming from a genomic-assisted recurrent selection scheme vs. lines coming from a conventional breeding scheme.

The accuracies reported in this study represent prediction of wheat lines using a training set comprising lines with some degree of relatedness to lines in the validation set. When the validation and training sets are not genetically related (unrelated families) or represent populations with different genetic structures and different linkage disequilibrium patterns, negligible accuracies are to be expected. It seems that successful application of genomic selection in plant breeding requires some genetic relatedness between individuals in the training and validation sets, and that linkage disequilibrium information per se does not suffice (e.g., Makowsky et al. 2011).

ACKNOWLEDGMENTS
Financial support by the Wisconsin Agriculture Experiment Station and AVIAGEN, Ltd. (Newbridge, Scotland) to Paulino Pérez and Daniel Gianola is acknowledged. We thank the Centro Internacional de Mejoramiento de Maíz y Trigo (CIMMYT) researchers who carried out the wheat trials and provided the phenotypic data analyzed in this article.

LITERATURE CITED
Bernardo, R., and J. M.
Yu, 2007 Prospects for genomewide selection for quantitative traits in maize. Crop Sci. 47(3).
Broomhead, D. S., and D. Lowe, 1988 Multivariable functional interpolation and adaptive networks. Complex Systems 2.
Burgueño, J., J. Crossa, P. L. Cornelius, R. Trethowan, G. McLaren et al., 2007 Modeling additive × environment and additive × additive × environment using genetic covariances of relatives of wheat genotypes. Crop Sci. 47(1).
Burgueño, J., J. Crossa, J. M. Cotes, F. San Vicente, and B. Das, 2011 Prediction assessment of linear mixed models for multienvironment trials. Crop Sci. 51(3).
Chen, S., C. F. N. Cowan, and P. M. Grant, 1991 Orthogonal least squares learning algorithm for radial basis function networks. IEEE Transactions on Neural Networks 2(2).
Cockram, J., H. Jones, F. J. Leigh, D. O'Sullivan, W. Powell et al., 2007 Control of flowering time in temperate cereals: genes, domestication, and sustainable productivity. J. Exp. Bot. 58(6).
Conti, V., P. F. Roncallo, V. Beaufort, G. L. Cervigni, R. Miranda et al., 2011 Mapping of main and epistatic effect QTLs associated to grain protein and gluten strength using a RIL population of durum wheat. J. Appl. Genet. 52(3).
Crossa, J., J. Burgueño, P. L. Cornelius, G. McLaren, R. Trethowan et al., 2006 Modeling genotype × environment interaction using additive genetic covariances of relatives for predicting breeding values of wheat genotypes. Crop Sci. 46(4).
Crossa, J., G. de los Campos, P. Perez, D. Gianola, J. Burgueño et al., 2010 Prediction of genetic values of quantitative traits in plant breeding using pedigree and molecular markers. Genetics 186(2).
Crossa, J., P. Perez, G. de los Campos, G. Mahuku, S. Dreisigacker et al., 2011 Genomic selection and prediction in plant breeding. J. Crop Improv. 25(3).
de los Campos, G., and P. Perez, 2010 BLR: Bayesian Linear Regression R package, version 1.2.
de los Campos, G., H. Naya, D. Gianola, J. Crossa, A. Legarra et al., 2009 Predicting quantitative traits with regression models for dense molecular markers and pedigree. Genetics 182(1).
de los Campos, G., D. Gianola, G. J. M. Rosa, K. A. Weigel, and J. Crossa, 2010 Semi-parametric genomic-enabled prediction of genetic values using reproducing kernel Hilbert spaces methods. Genet. Res. 92(4).
de los Campos, G., J. M. Hickey, R. Pong-Wong, H. D. Daetwyler, and M. P. L. Calus, 2012 Whole-genome regression and prediction methods applied to plant and animal breeding. Genetics.
Foresee, D., and M. T. Hagan, 1997 Gauss-Newton approximation to Bayesian learning. International Conference on Neural Networks, June 9-12, Houston, TX.
Gianola, D., and J. B. C. H. M. van Kaam, 2008 Reproducing kernel Hilbert spaces regression methods for genomic-assisted prediction of quantitative traits. Genetics 178(4).
Gianola, D., R. L. Fernando, and A. Stella, 2006 Genomic-assisted prediction of genetic value with semiparametric procedures. Genetics 173(3).
Gianola, D., H. Okut, K. A. Weigel, and G. J. M. Rosa, 2011 Predicting complex quantitative traits with Bayesian neural networks: a case study with Jersey cows and wheat. BMC Genet. 12: 87.
Goldringer, I., P. Brabant, and A. Gallais, 1997 Estimation of additive and epistatic genetic variances for agronomic traits in a population of doubled-haploid lines of wheat. Heredity 79.
González-Camacho, J. M., G. de los Campos, P. Perez, D. Gianola, J. Cairns et al., 2012 Genome-enabled prediction of genetic values using radial basis function neural networks. Theor. Appl. Genet. 125.
Habier, D., R. L. Fernando, K. Kizilkaya, and D. J. Garrick, 2011 Extension of the Bayesian alphabet for genomic selection. BMC Bioinformatics 12: 186.
Hastie, T., R. Tibshirani, and J.
Friedman, 2009 The Elements of Statistical Learning: Data Mining, Inference and Prediction, Ed. 2. Springer, New York.

Heffner, E. L., M. E. Sorrells, and J. L. Jannink, 2009 Genomic selection for crop improvement. Crop Sci. 49(1).
Heslot, N., H. P. Yang, M. E. Sorrells, and J. L. Jannink, 2012 Genomic selection in plant breeding: a comparison of models. Crop Sci. 52(1).
Hickey, J. M., and B. Tier, 2009 AlphaBayes (Beta): Software for Polygenic and Whole Genome Analysis. User Manual. University of New England, Armidale, Australia.
Hoerl, A. E., and R. W. Kennard, 1970 Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12(1).
Holland, J. B., 2001 Epistasis and plant breeding. Plant Breeding Reviews 21.
Holland, J. B., 2008 Theoretical and biological foundations of plant breeding, in Plant Breeding: The Arnel R. Hallauer International Symposium, edited by K. R. Lamkey and M. Lee. Blackwell Publishing, Ames, IA.
Lampinen, J., and A. Vehtari, 2001 Bayesian approach for neural networks: review and case studies. Neural Netw. 14(3).
Laurie, D. A., N. Pratchett, J. W. Snape, and J. H. Bezant, 1995 RFLP mapping of five major genes and eight quantitative trait loci controlling flowering time in a winter × spring barley (Hordeum vulgare L.) cross. Genome 38(3).
Long, N. Y., D. Gianola, G. J. M. Rosa, K. A. Weigel, A. Kranis et al., 2010 Radial basis function regression methods for predicting quantitative traits using SNP markers. Genet. Res. 92(3).
MacKay, D. J. C., 1992 A practical Bayesian framework for backpropagation networks. Neural Comput. 4(3).
MacKay, D. J. C., 1994 Bayesian non-linear modelling for the prediction competition. ASHRAE Transactions 100(Pt. 2).
Makowsky, R., N. M. Pajewski, Y. C. Klimentidis, A. I. Vazquez, C. W. Duarte et al., 2011 Beyond missing heritability: prediction of complex traits. PLoS Genet. 7(4).
McKinney, B. A., and N. M. Pajewski, 2012 Six degrees of epistasis: statistical network models for GWAS. Front. Genet. 2: 109.
Meuwissen, T. H. E., B. J. Hayes, and M. E. Goddard, 2001 Prediction of total genetic value using genome-wide dense marker maps. Genetics 157(4).
Neal, R. M., 1996 Bayesian Learning for Neural Networks (Lecture Notes in Statistics). Springer-Verlag, New York.
Ober, U., J. F. Ayroles, E. A. Stone, S. Richards, D. Zhu et al., 2012 Using whole-genome sequence data to predict quantitative trait phenotypes in Drosophila melanogaster. PLoS Genet. 8(5).
Okut, H., D. Gianola, G. J. Rosa, and K. A. Weigel, 2011 Prediction of body mass index in mice using dense molecular markers and a regularized neural network. Genet. Res. (Camb.) 93.
Park, T., and G. Casella, 2008 The Bayesian LASSO. J. Am. Stat. Assoc. 103.
Perez, P., G. de los Campos, J. Crossa, and D. Gianola, 2010 Genomic-enabled prediction based on molecular markers and pedigree using the Bayesian linear regression package in R. Plant Genome 3(2).
Poggio, T., and F. Girosi, 1990 Networks for approximation and learning. Proc. IEEE 78(9).
Resende, M. F. R., P. Muñoz, M. D. V. Resende, D. J. Garrick, R. L. Fernando et al., 2012 Accuracy of genomic selection methods in a standard data set of loblolly pine (Pinus taeda L.). Genetics.
Shimada, S., T. Ogawa, and S. Kitagawa, 2009 A genetic network of flowering-time genes in wheat leaves, in which an APETALA1/FRUITFULL-like gene, VRN-1, is upstream of FLOWERING LOCUS T. Plant J. 58.
Wang, C. S., J. J. Rutledge, and D. Gianola, 1994 Bayesian analysis of mixed linear models via Gibbs sampling with an application to litter size in Iberian pigs. Genet. Sel. Evol. 26.
Zhang, K., J. Tian, L. Zhao, and S. Wang, 2008 Mapping QTLs with epistatic effects and QTL × environment interactions for plant height using a doubled haploid population in cultivated wheat. J. Genet. Genomics 35(2).

Communicating editor: J. B.
Holland


More information

A Robust Method for Estimating the Fundamental Matrix

A Robust Method for Estimating the Fundamental Matrix Proc. VIIth Dgtal Image Computng: Technques and Applcatons, Sun C., Talbot H., Ourseln S. and Adraansen T. (Eds.), 0- Dec. 003, Sydney A Robust Method for Estmatng the Fundamental Matrx C.L. Feng and Y.S.

More information

Lecture #15 Lecture Notes

Lecture #15 Lecture Notes Lecture #15 Lecture Notes The ocean water column s very much a 3-D spatal entt and we need to represent that structure n an economcal way to deal wth t n calculatons. We wll dscuss one way to do so, emprcal

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Machine Learning 9. week

Machine Learning 9. week Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below

More information

Modeling Local Uncertainty accounting for Uncertainty in the Data

Modeling Local Uncertainty accounting for Uncertainty in the Data Modelng Local Uncertanty accontng for Uncertanty n the Data Olena Babak and Clayton V Detsch Consder the problem of estmaton at an nsampled locaton sng srrondng samples The standard approach to ths problem

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007 Syntheszer 1.0 A Varyng Coeffcent Meta Meta-Analytc nalytc Tool Employng Mcrosoft Excel 007.38.17.5 User s Gude Z. Krzan 009 Table of Contents 1. Introducton and Acknowledgments 3. Operatonal Functons

More information

Analysis of Malaysian Wind Direction Data Using ORIANA

Analysis of Malaysian Wind Direction Data Using ORIANA Modern Appled Scence March, 29 Analyss of Malaysan Wnd Drecton Data Usng ORIANA St Fatmah Hassan (Correspondng author) Centre for Foundaton Studes n Scence Unversty of Malaya, 63 Kuala Lumpur, Malaysa

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

A Statistical Model Selection Strategy Applied to Neural Networks

A Statistical Model Selection Strategy Applied to Neural Networks A Statstcal Model Selecton Strategy Appled to Neural Networks Joaquín Pzarro Elsa Guerrero Pedro L. Galndo joaqun.pzarro@uca.es elsa.guerrero@uca.es pedro.galndo@uca.es Dpto Lenguajes y Sstemas Informátcos

More information

C2 Training: June 8 9, Combining effect sizes across studies. Create a set of independent effect sizes. Introduction to meta-analysis

C2 Training: June 8 9, Combining effect sizes across studies. Create a set of independent effect sizes. Introduction to meta-analysis C2 Tranng: June 8 9, 2010 Introducton to meta-analyss The Campbell Collaboraton www.campbellcollaboraton.org Combnng effect szes across studes Compute effect szes wthn each study Create a set of ndependent

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Exercises (Part 4) Introduction to R UCLA/CCPR. John Fox, February 2005

Exercises (Part 4) Introduction to R UCLA/CCPR. John Fox, February 2005 Exercses (Part 4) Introducton to R UCLA/CCPR John Fox, February 2005 1. A challengng problem: Iterated weghted least squares (IWLS) s a standard method of fttng generalzed lnear models to data. As descrbed

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

Data Mining: Model Evaluation

Data Mining: Model Evaluation Data Mnng: Model Evaluaton Aprl 16, 2013 1 Issues: Evaluatng Classfcaton Methods Accurac classfer accurac: predctng class label predctor accurac: guessng value of predcted attrbutes Speed tme to construct

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

An Ensemble Learning algorithm for Blind Signal Separation Problem

An Ensemble Learning algorithm for Blind Signal Separation Problem An Ensemble Learnng algorthm for Blnd Sgnal Separaton Problem Yan L 1 and Peng Wen 1 Department of Mathematcs and Computng, Faculty of Engneerng and Surveyng The Unversty of Southern Queensland, Queensland,

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Why consder unlabeled samples?. Collectng and labelng large set of samples s costly Gettng recorded speech s free, labelng s tme consumng 2. Classfer could be desgned

More information

A Multivariate Analysis of Static Code Attributes for Defect Prediction

A Multivariate Analysis of Static Code Attributes for Defect Prediction Research Paper) A Multvarate Analyss of Statc Code Attrbutes for Defect Predcton Burak Turhan, Ayşe Bener Department of Computer Engneerng, Bogazc Unversty 3434, Bebek, Istanbul, Turkey {turhanb, bener}@boun.edu.tr

More information

Fast Sparse Gaussian Processes Learning for Man-Made Structure Classification

Fast Sparse Gaussian Processes Learning for Man-Made Structure Classification Fast Sparse Gaussan Processes Learnng for Man-Made Structure Classfcaton Hang Zhou Insttute for Vson Systems Engneerng, Dept Elec. & Comp. Syst. Eng. PO Box 35, Monash Unversty, Clayton, VIC 3800, Australa

More information

Biostatistics 615/815

Biostatistics 615/815 The E-M Algorthm Bostatstcs 615/815 Lecture 17 Last Lecture: The Smplex Method General method for optmzaton Makes few assumptons about functon Crawls towards mnmum Some recommendatons Multple startng ponts

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

Intelligent Information Acquisition for Improved Clustering

Intelligent Information Acquisition for Improved Clustering Intellgent Informaton Acquston for Improved Clusterng Duy Vu Unversty of Texas at Austn duyvu@cs.utexas.edu Mkhal Blenko Mcrosoft Research mblenko@mcrosoft.com Prem Melvlle IBM T.J. Watson Research Center

More information

Nondestructive and intuitive determination of circadian chlorophyll rhythms in soybean leaves using multispectral imaging

Nondestructive and intuitive determination of circadian chlorophyll rhythms in soybean leaves using multispectral imaging Supportng nformaton for Nondestructve and ntutve determnaton of crcadan chlorophyll rhythms n soybean leaves usng multspectral magng Wen-Juan Pan 1, Xa Wang 2, Yong-Ren Deng 3, Ja-Hang L 3, We Chen 1,

More information

USING LINEAR REGRESSION FOR THE AUTOMATION OF SUPERVISED CLASSIFICATION IN MULTITEMPORAL IMAGES

USING LINEAR REGRESSION FOR THE AUTOMATION OF SUPERVISED CLASSIFICATION IN MULTITEMPORAL IMAGES USING LINEAR REGRESSION FOR THE AUTOMATION OF SUPERVISED CLASSIFICATION IN MULTITEMPORAL IMAGES 1 Fetosa, R.Q., 2 Merelles, M.S.P., 3 Blos, P. A. 1,3 Dept. of Electrcal Engneerng ; Catholc Unversty of

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches Proceedngs of the Internatonal Conference on Cognton and Recognton Fuzzy Flterng Algorthms for Image Processng: Performance Evaluaton of Varous Approaches Rajoo Pandey and Umesh Ghanekar Department of

More information

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1 4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:

More information

APPLICATION OF A COMPUTATIONALLY EFFICIENT GEOSTATISTICAL APPROACH TO CHARACTERIZING VARIABLY SPACED WATER-TABLE DATA

APPLICATION OF A COMPUTATIONALLY EFFICIENT GEOSTATISTICAL APPROACH TO CHARACTERIZING VARIABLY SPACED WATER-TABLE DATA RFr"W/FZD JAN 2 4 1995 OST control # 1385 John J Q U ~ M Argonne Natonal Laboratory Argonne, L 60439 Tel: 708-252-5357, Fax: 708-252-3 611 APPLCATON OF A COMPUTATONALLY EFFCENT GEOSTATSTCAL APPROACH TO

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments Comparson of Heurstcs for Schedulng Independent Tasks on Heterogeneous Dstrbuted Envronments Hesam Izakan¹, Ath Abraham², Senor Member, IEEE, Václav Snášel³ ¹ Islamc Azad Unversty, Ramsar Branch, Ramsar,

More information

Machine Learning. K-means Algorithm

Machine Learning. K-means Algorithm Macne Learnng CS 6375 --- Sprng 2015 Gaussan Mture Model GMM pectaton Mamzaton M Acknowledgement: some sldes adopted from Crstoper Bsop Vncent Ng. 1 K-means Algortm Specal case of M Goal: represent a data

More information

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification Introducton to Artfcal Intellgence V22.0472-001 Fall 2009 Lecture 24: Nearest-Neghbors & Support Vector Machnes Rob Fergus Dept of Computer Scence, Courant Insttute, NYU Sldes from Danel Yeung, John DeNero

More information

APPLICATION OF PREDICTION-BASED PARTICLE FILTERS FOR TELEOPERATIONS OVER THE INTERNET

APPLICATION OF PREDICTION-BASED PARTICLE FILTERS FOR TELEOPERATIONS OVER THE INTERNET APPLICATION OF PREDICTION-BASED PARTICLE FILTERS FOR TELEOPERATIONS OVER THE INTERNET Jae-young Lee, Shahram Payandeh, and Ljljana Trajovć School of Engneerng Scence Smon Fraser Unversty 8888 Unversty

More information

Mixed Linear System Estimation and Identification

Mixed Linear System Estimation and Identification 48th IEEE Conference on Decson and Control, Shangha, Chna, December 2009 Mxed Lnear System Estmaton and Identfcaton A. Zymns S. Boyd D. Gornevsky Abstract We consder a mxed lnear system model, wth both

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Evolutionary Wavelet Neural Network for Large Scale Function Estimation in Optimization

Evolutionary Wavelet Neural Network for Large Scale Function Estimation in Optimization AIAA Paper AIAA-006-6955, th AIAA/ISSMO Multdscplnary Analyss and Optmzaton Conference, Portsmouth, VA, September 6-8, 006. Evolutonary Wavelet Neural Network for Large Scale Functon Estmaton n Optmzaton

More information

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT 3. - 5. 5., Brno, Czech Republc, EU APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT Abstract Josef TOŠENOVSKÝ ) Lenka MONSPORTOVÁ ) Flp TOŠENOVSKÝ

More information

Anonymisation of Public Use Data Sets

Anonymisation of Public Use Data Sets Anonymsaton of Publc Use Data Sets Methods for Reducng Dsclosure Rsk and the Analyss of Perturbed Data Harvey Goldsten Unversty of Brstol and Unversty College London and Natale Shlomo Unversty of Manchester

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

APPLICATION OF PREDICTION-BASED PARTICLE FILTERS FOR TELEOPERATIONS OVER THE INTERNET

APPLICATION OF PREDICTION-BASED PARTICLE FILTERS FOR TELEOPERATIONS OVER THE INTERNET APPLICATION OF PREDICTION-BASED PARTICLE FILTERS FOR TELEOPERATIONS OVER THE INTERNET Jae-young Lee, Shahram Payandeh, and Ljljana Trajovć School of Engneerng Scence Smon Fraser Unversty 8888 Unversty

More information

EXTENDED BIC CRITERION FOR MODEL SELECTION

EXTENDED BIC CRITERION FOR MODEL SELECTION IDIAP RESEARCH REPORT EXTEDED BIC CRITERIO FOR ODEL SELECTIO Itshak Lapdot Andrew orrs IDIAP-RR-0-4 Dalle olle Insttute for Perceptual Artfcal Intellgence P.O.Box 59 artgny Valas Swtzerland phone +4 7

More information

Adaptive Transfer Learning

Adaptive Transfer Learning Adaptve Transfer Learnng Bn Cao, Snno Jaln Pan, Yu Zhang, Dt-Yan Yeung, Qang Yang Hong Kong Unversty of Scence and Technology Clear Water Bay, Kowloon, Hong Kong {caobn,snnopan,zhangyu,dyyeung,qyang}@cse.ust.hk

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introducton to Bonformatcs Sequence Algnment Luke Huan Electrcal Engneerng and Computer Scence http://people.eecs.ku.edu/~huan/ HMM Π s a set of states Transton Probabltes a kl Pr( l 1 k Probablty

More information

Research on Categorization of Animation Effect Based on Data Mining

Research on Categorization of Animation Effect Based on Data Mining MATEC Web of Conferences 22, 0102 0 ( 2015) DOI: 10.1051/ matecconf/ 2015220102 0 C Owned by the authors, publshed by EDP Scences, 2015 Research on Categorzaton of Anmaton Effect Based on Data Mnng Na

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated.

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated. Some Advanced SP Tools 1. umulatve Sum ontrol (usum) hart For the data shown n Table 9-1, the x chart can be generated. However, the shft taken place at sample #21 s not apparent. 92 For ths set samples,

More information

Context-Specific Bayesian Clustering for Gene Expression Data

Context-Specific Bayesian Clustering for Gene Expression Data Context-Specfc Bayesan Clusterng for Gene Expresson Data Yoseph Barash School of Computer Scence & Engneerng Hebrew Unversty, Jerusalem, 91904, Israel hoan@cs.huj.ac.l Nr Fredman School of Computer Scence

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

Model Selection with Cross-Validations and Bootstraps Application to Time Series Prediction with RBFN Models

Model Selection with Cross-Validations and Bootstraps Application to Time Series Prediction with RBFN Models Model Selecton wth Cross-Valdatons and Bootstraps Applcaton to Tme Seres Predcton wth RBF Models Amaury Lendasse Vncent Wertz and Mchel Verleysen Unversté catholque de Louvan CESAME av. G. Lemaître 3 B-348

More information

Fusion Performance Model for Distributed Tracking and Classification

Fusion Performance Model for Distributed Tracking and Classification Fuson Performance Model for Dstrbuted rackng and Classfcaton K.C. Chang and Yng Song Dept. of SEOR, School of I&E George Mason Unversty FAIRFAX, VA kchang@gmu.edu Martn Lggns Verdan Systems Dvson, Inc.

More information

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(6):2860-2866 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A selectve ensemble classfcaton method on mcroarray

More information

REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES

REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES Laser dffracton s one of the most wdely used methods for partcle sze analyss of mcron and submcron sze powders and dspersons. It s quck and easy and provdes

More information

Some variations on the standard theoretical models for the h-index: A comparative analysis. C. Malesios 1

Some variations on the standard theoretical models for the h-index: A comparative analysis. C. Malesios 1 *********Ths s the fnal draft for the paper submtted to the Journal of the Assocaton for Informaton Scence and Technology. For the offcal verson please refer to DOI: 10.1002/as.23410********* Some varatons

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Overview. Basic Setup [9] Motivation and Tasks. Modularization 2008/2/20 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION

Overview. Basic Setup [9] Motivation and Tasks. Modularization 2008/2/20 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION Overvew 2 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION Introducton Mult- Smulator MASIM Theoretcal Work and Smulaton Results Concluson Jay Wagenpfel, Adran Trachte Motvaton and Tasks Basc Setup

More information