Supporting Figures Figure S1. FIGURE S2.

Similar documents
Bayesian Multiple QTL Mapping

Overview. Background. Locating quantitative trait loci (QTL)

The Imprinting Model

Practical OmicsFusion

DeltaGen: Quick start manual

Breeding View A visual tool for running analytical pipelines User Guide Darren Murray, Roger Payne & Zhengzheng Zhang VSN International Ltd

Package DSPRqtl. R topics documented: June 7, Maintainer Elizabeth King License GPL-2. Title Analysis of DSPR phenotypes

Supplementary text S6 Comparison studies on simulated data

To finish the current project and start a new project. File Open a text data

CTL mapping in R. Danny Arends, Pjotr Prins, and Ritsert C. Jansen. University of Groningen Groningen Bioinformatics Centre & GCC Revision # 1

Package dlmap. February 19, 2015

FVGWAS- 3.0 Manual. 1. Schematic overview of FVGWAS

QTL DETECTION WITH GS3. A Legarra, 19/06/2013

Notes on QTL Cartographer

MIPE: Model Informing Probability of Eradication of non-indigenous aquatic species. User Manual. Version 2.4

Genetic Analysis. Page 1

A practical example of tomato QTL mapping using a RIL population. R R/QTL

ChIP-seq hands-on practical using Galaxy

Spotter Documentation Version 0.5, Released 4/12/2010

Improved Ancestry Estimation for both Genotyping and Sequencing Data using Projection Procrustes Analysis and Genotype Imputation

PROMO 2017a - Tutorial

Spatial Distributions of Precipitation Events from Regional Climate Models

Estimating. Local Ancestry in admixed Populations (LAMP)

Tutorial 3. Chiun-How Kao 高君豪

SNPViewer Documentation

The fgwas Package. Version 1.0. Pennsylvannia State University

Minitab 17 commands Prepared by Jeffrey S. Simonoff

MACAU User Manual. Xiang Zhou. March 15, 2017

Estimating Variance Components in MMAP

2 T. x + 2 T. , T( x, y = 0) = T 1

KC-SMART Vignette. Jorma de Ronde, Christiaan Klijn April 30, 2018

Manual code: MSU_pigs.R

Install RStudio from - use the standard installation.

Frequencies, Unequal Variance Weights, and Sampling Weights: Similarities and Differences in SAS

Louis Fourrier Fabien Gaie Thomas Rolf

Lab # 2 - ACS I Part I - DATA COMPRESSION in IMAGE PROCESSING using SVD

Step-by-Step Guide to Advanced Genetic Analysis

Breeding Guide. Customer Services PHENOME-NETWORKS 4Ben Gurion Street, 74032, Nes-Ziona, Israel

Step-by-Step Guide to Relatedness and Association Mapping Contents

Screening Design Selection

EECS730: Introduction to Bioinformatics

xmapbridge: Graphically Displaying Numeric Data in the X:Map Genome Browser

Analyzing Genomic Data with NOJAH

Version 2.0 Zhong Wang Department of Public Health Sciences Pennsylvannia State University

Comparison of different solvers for two-dimensional steady heat conduction equation ME 412 Project 2

Package inversion. R topics documented: July 18, Type Package. Title Inversions in genotype data. Version

Convex Optimization / Homework 2, due Oct 3

Scanning Real World Objects without Worries 3D Reconstruction

RoHS. Specifications. Part Number Configurator

Introduction to Machine Learning Prof. Anirban Santara Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Breeze - Segmentation guide

CHAOS Chaos Chaos Iterate

Logistic Regression. Abstract

How to Reveal Magnitude of Gene Signals: Hierarchical Hypergeometric Complementary Cumulative Distribution Function

GMS 9.0 Tutorial MODFLOW Stochastic Modeling, PEST Null Space Monte Carlo I Use PEST to create multiple calibrated MODFLOW simulations

Package pheno2geno. February 15, 2013

Programming Exercise 7: K-means Clustering and Principal Component Analysis

Statistics 202: Data Mining. c Jonathan Taylor. Week 8 Based in part on slides from textbook, slides of Susan Holmes. December 2, / 1

Package GGEBiplots. July 24, 2017

ChIP-seq hands-on practical using Galaxy

Neural Network Weight Selection Using Genetic Algorithms

Dimension reduction : PCA and Clustering

Model Selection and Assessment

An Unsupervised Approach for Combining Scores of Outlier Detection Techniques, Based on Similarity Measures

Assignment 2. Unsupervised & Probabilistic Learning. Maneesh Sahani Due: Monday Nov 5, 2018

Interlude: Solving systems of Equations

Clustering. Robert M. Haralick. Computer Science, Graduate Center City University of New York

Package r.jive. R topics documented: April 12, Type Package

Linkage analysis with paramlink Appendix: Running MERLIN from paramlink

Gene Clustering & Classification

An efficient algorithm for sparse PCA

INTRODUCTION TO THE MATLAB APPLICATION DESIGNER EXERCISES

Package multihiccompare

A guide to QTL mapping with R/qtl

Crew Scheduling Problem: A Column Generation Approach Improved by a Genetic Algorithm. Santos and Mateus (2007)

Package LEA. December 24, 2017

Superpixel Tracking. The detail of our motion model: The motion (or dynamical) model of our tracker is assumed to be Gaussian distributed:

CHAPTER 6 IDENTIFICATION OF CLUSTERS USING VISUAL VALIDATION VAT ALGORITHM

Bioconductor s tspair package

CHAPTER 4. OPTIMIZATION OF PROCESS PARAMETER OF TURNING Al-SiC p (10P) MMC USING TAGUCHI METHOD (SINGLE OBJECTIVE)

Applied Regression Modeling: A Business Approach

Development of linkage map using Mapmaker/Exp3.0

K-means clustering Based in part on slides from textbook, slides of Susan Holmes. December 2, Statistics 202: Data Mining.

CS145: INTRODUCTION TO DATA MINING

Prelims Data Analysis TT 2018 Sheet 7

FastPlace 2.0: An Efficient Analytical Placer for Mixed- Mode Designs

Nature Publishing Group

CHAPTER 3 MAINTENANCE STRATEGY SELECTION USING AHP AND FAHP

Image Enhancement: To improve the quality of images

Package SeleMix. R topics documented: November 22, 2016

Iterative Signature Algorithm for the Analysis of Large-Scale Gene Expression Data. By S. Bergmann, J. Ihmels, N. Barkai

Package MultiMeta. February 19, 2015

Sealed snap-action pushbutton switches bushing Ø 12 mm momentary

A short manual for LFMM (command-line version)

University of Wisconsin-Madison Spring 2018 BMI/CS 776: Advanced Bioinformatics Homework #2

Statistical Analysis of Metabolomics Data. Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte

3. Raster Algorithms. 3.1 Raster Displays. (see Chapter 1 of the Notes) CRT LCD. CS Dept, Univ of Kentucky

Step-by-step user instructions to the hamlet-package

SPSS. (Statistical Packages for the Social Sciences)

Case Study: Attempts at Parametric Reduction

Transcription:

Supporting Figures Figure S1. QTL scans for the 12 environments of the yield data for pepper simulated from the physiological genotype-to-phenotype model with seven physiological parameters (Rodrigues et al. 2012). Each column represents a different level of temperature. The first two rows correspond to the highest error variance in this realization of the simulation and the last two rows to the lowest. The dashed grey line represents the scans for the actual data, the grey line the scans for the AMMI2 predicted values, and the black line the scans for the W-AMMI2 predicted values. All scans are based on composite interval mapping. The horizontal lines correspond to the thresholds based on a permutation test with 1000 permutations, at the 0.05 significance level. These scans are based on one randomly chosen realization out of the 100 simulations. The codes for the captions of the individual scans are described in Table 1. FIGURE S2. Summary of the number of detected QTLs for the actual data (yellow), AMMI2 predicted values (orange), W-AMMI2 predicted values (red) and linear mixed model (dark red). The graph on the left hand side shows the box plots for the number of detected QTLs per model parameter (Table 2), and the graph on the right hand side shows the number of detected QTLs per environmental cluster (Table 1). These values are for QTLs detected when considering an interval of 20 cm centred on the exact simulated QTL position. These plots are for a heritability of 0.3 in all environments.

FIGURE S3. Number of QTLs detected per environment for an expected maximum of 7 (environments with temperature of 15ºC or 20ºC, Table 2) or 8 (environments with temperature of 25ºC, Table 8). The boxplots are presented for the actual data (yellow), AMMI2 predicted values (orange), W-AMMI2 predicted values (red) and linear mixed model (dark red). These values are for QTLs detected when considering an interval of 20 cm centred on the exact simulated QTL position. These plots are for a heritability of 0.3 in all environments. FIGURE S4. Summary of the number of detected QTLs for the actual data (yellow), AMMI2 predicted values (orange), W-AMMI2 predicted values (red) and linear mixed model (dark red). The graph on the left hand side shows the box plots for the number of detected QTLs per model parameter (Table 2), and the graph on the right hand side shows the number of detected QTLs per environmental cluster (Table 1). These values are for QTLs detected when considering an interval of 20 cm centred on the exact simulated QTL position. These plots are for a heritability of 0.8 in all environments.

FIGURE S5. Number of QTLs detected per environment for an expected maximum of 7 (environments with temperature of 15ºC or 20ºC, Table 2) or 8 (environments with temperature of 25ºC, Table 8). The boxplots are presented for the actual data (yellow), AMMI2 predicted values (orange), W-AMMI2 predicted values (red) and linear mixed model (dark red). These values are for QTLs detected when considering an interval of 20 cm centred on the exact simulated QTL position. These plots are for a heritability of 0.8 in all environments. FIGURE S6. Genome scan for the means of the SxM yield data. The log10(p)-values for the QTL main effects plus QEI are shown. The red horizontal line is the 5% genomewide significance threshold. The green horizontal line in the bottom section summarizes the top panel. The environment specific QTL effects are shown. Blue (red) indicates that parent Steptoe (Morex) has significantly higher yield contribution. The considered variance-covariance (VCOV) structure was the factor analytic with two multiplicative terms.

TABLE S1. Chromosome (Chr) and respective positions (Pos) where a QTL was detected for each of the four approaches: QTL scans of the actual data (R); QTL scans of the AMMI3 predicted values (A); QTL scans of the W-AMMI3 predicted values (W); and linear mixed model framework (L), for the SxM yield data in all 13 environments which have at least one partial replication. ALL means that the QTL was detected with all the four approaches. The last column gives the reference where the same detection was observed. None indicates that no reference was found where a similar QTL was detected. Chr Pos ID91 ID92 MAN92 MIN92 MTd91 MTd92 MTi91 MTi92 NY92 ONT92 OR91 WA91 WA92 Reference 1 [0; 33.5] R+A R+W R W R+A HAYES et al. 1993 1 [116.7; 170.1] L W W W+L R+W+L None HAYES et al. 1993 LACAZE et al. 2009 MALOSETTI et al. 2004 2 [41.2; 52.6] W ALL L ALL R+A A+W+L W A W+L L ZHU et al. 1999 2 [68.8; 88.2] W W+L W L ALL A+W+L W+L R+L L ALL MALOSETTI et al. 2004 HAYES et al. 1993 LACAZE et al. 2009 LARSON et al. 1996 3 [73; 83.6] ALL ALL A+L ALL ALL L ALL ALL ALL ALL W ALL A 4 1.4 R+A None 4 63.2 A+W+L L R+W LACAZE et al. 2009 4 96.5 R+A+W R+L W None 4 118.3 A A None 5 49.6 R None 5 [146.3; 150.8] A R+W R R+W A+W A A None 6 8.1 W None 6 [53.1; 72.5] ALL W W A L W A+W+L W+L W 7 [45.6; 78.2] W R+W+L ALL W ALL R R+W A+W+L R+A+W ALL

Supporting Files FILE S1. The R code for the weighted low-rank SVD (SREBRO and JAAKKOLA 2003) (adapted from the Marlin s MatLab code in http://www.cs.toronto.edu/~marlin/code/wsvd.m). # Inputs # Y (IxJ) data matrix # W (IxJ) weight matrix with 0 <= W ij <= 1; I = 1,,I; j = 1,,J # N rank of approximation # # Outputs # U,D,V such that Y~UDV' Y<- read.csv( data.csv, header=t) X<- matrix(0,ngen,nenv) aux<- matrix(1,ngen,nenv) Xold=Inf*aux Err=Inf eps<- 1e-10 # Read the original data set (IxJ) # Matrix (IxJ) with zero in all positions to initialize the algorithm # Initial distance between consecutive iterations X(i) and X(i+1) # Maximum admissible distance between consecutive iterations while(err>eps){ # Repeats the code until the distance between X(i) and X(i+1) is below eps=1e-10 Xold=X # Update Xold to X(i) wsvd<- svd(w*y + (1-W)*X) # Weighted SVD U<- wsvd$u # Left singular vectors D<- diag(wsvd$d) # Singular values V<- wsvd$v # Right singular vectors D[(N+1):length(wsvd$d),(N+1):length(wsvd$d)]<- 0 # Discard singular values above N X<- U %*% D %*% t(v) # Update X (i.e. compute X(i+1)) Err=sum(sum((X-Xold)^2)) # Update the distance between consecutive iterations (Err) } FILE S2. The R code to compute the error variances. The data file includes columns with the phenotypic data ( yield ), and codes for genotype ( gen ), environment ( env ) and replication ( rep ). library(lme4) data<- read.csv("sxm.csv", header=t) ErrVar<- NULL p<- nlevels(data$env) for(i in 1:p){ n1<- sum(table(data$env)[0:(i-1)]) + 1 n2<- sum(table(data$env)[0:i]) data<- data[n1:n2,] x<- lmer(yield ~ rep + (1 gen), data=data) ErrVar[i]<- attr(varcorr(x), "sc")^2 } ErrVar<- data.frame(errvar, Env=levels(env)) # p is the number of environments # computes the error variance for each environment # ErrVar includes the error variances for all environments