Package EMLRT. August 7, 2014

Type Package
Title Association Studies with Imputed SNPs Using Expectation-Maximization-Likelihood-Ratio Test
LazyData yes
Version 1.0
Date 2014-08-01
Author
Maintainer <kchuang@live.unc.edu>
Description This package contains the EM-LRT-PROB and EM-LRT-DOSE methods for testing whether an outcome variable is associated with imputed SNPs. Specifically, EM-LRT-PROB is used when posterior probabilities of all potential genotypes are estimated, while EM-LRT-DOSE is used when only a one-dimensional summary statistic, the imputed dosage, is available. The package also allows users to simulate imputed genotype data, such as dosages and posterior probabilities, with a preferred Rsq and MAF. This package is still under development and subject to change.
License GPL-2
Depends MASS, gtools

R topics documented:

    Assoc_EM_LRT_DOSE
    data.gen
    EM_LRT_DOSE
    EM_LRT_PROB
    Mixture
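The distinction drawn in the package description between posterior genotype probabilities and imputed dosages can be made concrete with a small base-R sketch. It does not use any EMLRT function; the probability values are hypothetical, and it assumes the usual convention that the dosage is the expected minor-allele count.

```r
# Posterior genotype probabilities for three individuals (hypothetical values);
# columns are P(G = 0), P(G = 1), P(G = 2) and each row sums to 1.
p <- matrix(c(0.90, 0.09, 0.01,
              0.10, 0.60, 0.30,
              0.00, 0.20, 0.80),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("p0", "p1", "p2")))

# Assumed convention: dosage = expected allele count = 0*p0 + 1*p1 + 2*p2.
dosage <- as.vector(p %*% c(0, 1, 2))
dosage

# Two different probability vectors can share the same dosage, which is why
# the dosage is only a one-dimensional summary of the three probabilities.
a <- c(0.0, 1.0, 0.0)
b <- c(0.5, 0.0, 0.5)
c(dosage_a = sum(a * 0:2), dosage_b = sum(b * 0:2))
```

Because the mapping from probabilities to dosage is many-to-one, a method that only sees dosages has to recover or sample the missing probability information.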

Assoc_EM_LRT_DOSE        Association Test with EM-LRT-DOSE method

Description

    Conduct association testing on imputed data (output from MaCH) using the EM-LRT-DOSE method.

Usage

    Assoc_EM_LRT_DOSE(datfile, pedfile, dosefile, infofile, match = TRUE, outfil

Arguments

    datfile     Phenotypic Data - datfile
    pedfile     Phenotypic Data - pedfile
    dosefile    Imputed Allele Counts - dosefile
    infofile    Imputed Allele Counts - infofile
    match       Match the individual ID among pedfile and dosefile. Default is TRUE.
    outfile     Output a file.
    filename    Specific filename for output file.
    hideinfo    Hide program information or not.
    nsnps       The number of SNPs for association testing, counted from the first SNP.
    ncore       The number of cores to use, i.e. at most how many child processes will be run simultaneously.

Examples

    ##---- Should be DIRECTLY executable!! ----
    ##-- ==> Define data, use random,
    ##--or do help(data=index) for the standard data sets.

data.gen        Generate Data

Description

    Generate data of phenotype (Y), genotype (G), posterior probabilities (P), and imputed dosage (D) with user-specified Rsq and MAF.

Usage

    data.gen(n, r2, q, beta, sde, gamma = NULL)

Arguments

    n        Sample size of the data generated.
    r2       Squared Pearson's correlation between true genotype (G) and dosage (D).
    q        Minor allele frequency.
    beta     Coefficients of intercept and genotype.
    gamma    Coefficients of covariates.
    sde      Standard deviation of error term.

Examples

    ## 1 Generate data ##
    n<-2000
    r2<-0.3
    q<-0.4
    beta<-c(1,1.5) ##under the alternative
    gamma<-c(1,2,3) ##three covariates
    sde<-1
    data<-data.gen(n,r2,q,beta,sde,gamma)

    ##---------------------------##
    ## 2 Confirm data generated ##
    ##---------------------------##
    lr<-lm(y~g+z,data)
    lr$coef ##beta; gamma
    summary(lr)$sigma ##sde
    cor(data$g,data$d)^2 ##r2
    mean(data$g)/2 ##q

EM_LRT_DOSE        EM-LRT-Dose

Description

    An expectation-maximization (EM) likelihood-ratio test (LRT) for association based on imputed dosages.

Usage

    EM_LRT_DOSE(param0, r2, q, RejSampling, seed, MI, Intermediate, hideinfo, data)

Arguments

    param0          Initial values for parameters (beta0, beta1, gamma1,..., sde).
    r2              Squared Pearson's correlation between true genotype (G) and dosage (D).
    q               Minor allele frequency.
    RejSampling     By default, the rejection sampling approach is used to sample the probability of having one copy of the minor allele. If FALSE, dosage sampling is used instead.
    seed            Seed for rejection sampling. If not specified, the seed is randomly selected from 1 to 2000.
    MI              Number of multiple imputations. Default is 1, i.e., single imputation.
    Intermediate    Intermediate chi-square test statistic(s) and p-value(s) of interest to be shown in the output. For example, if Intermediate == c(1,3), the chi-square test statistics and p-values calculated based on the first data set and on the first three data sets are shown in the result.
    hideinfo        Rejection sampling information. If FALSE, the number of rejection samplings and the corresponding sampling time are shown.
    data            Data containing phenotype (Y), dosage (D), and covariates (Z). See data.gen() for more details.

Value

    chisq                       Chi-square test statistic for the association test based on multiple data sets.
    chisq_intermediate_steps    Chi-square test statistic(s) for the association test based on the first intermediate data sets of interest.
    pval                        P-value for the association test w.r.t. chisq.
    pval_intermediate_steps     P-value(s) for the association test w.r.t. chisq_intermediate_steps.

See Also

    data.gen

Examples

    ## 1 Generate data ##
    n<-2000
    r2<-0.5
    q<-0.4
    beta<-c(1,1.5)
    gamma<-c(1,2,3)
    sde<-1
    data<-data.gen(n,r2,q,beta,sde,gamma)

    ## 2 Mask true genotype ##
    data_scenario2<-list(y=data$y,d=data$d,z=data$z)

    ## 3 Generate initial values ##
    lr2<-lm(y~d+z,data_scenario2)
    beta_gamma0<-lr2$coef
    sde0<-summary(lr2)$sigma
    param0<-c(beta_gamma0,sde0)

    ## 4 Apply method ##
    ##Scenario II

    ######## Dosage Sampling ########
    ##Example 1
    dose1<-EM_LRT_DOSE(param0,r2,q,RejSampling=FALSE,seed=NULL,MI=1,Intermediate=NULL,
                       hideinfo=FALSE,data=data_scenario2)
    ##Dosage sampling approach is used to sample 1 set of probability
    ##of having one copy of minor allele.

    ######## Rejection Sampling ########
    ##Example 1
    dose1<-EM_LRT_DOSE(param0,r2,q,RejSampling=TRUE,seed=123,MI=1,Intermediate=NULL,
                       hideinfo=FALSE,data=data_scenario2)
    ##Rejection sampling approach is used to sample 1 set of probability
    ##of having one copy of minor allele.

    ##Example 2
    dose2<-EM_LRT_DOSE(param0,r2,q,RejSampling=TRUE,seed=123,MI=5,Intermediate=NULL,
                       hideinfo=FALSE,data=data_scenario2)
    ##Rejection sampling approach is used to sample 5 sets of probability
    ##of having one copy of minor allele.

    ##Example 3
    dose3<-EM_LRT_DOSE(param0,r2,q,RejSampling=TRUE,seed=123,MI=5,Intermediate=c(1,3),
                       hideinfo=FALSE,data=data_scenario2)
    ##Rejection sampling approach is used to sample 5 sets of probability of
    ##having one copy of minor allele. In addition to the chi-square test
    ##statistic and p-value calculated based on the 5 data sets, chi-square test
    ##statistics and p-values calculated based on the first data set and the first
    ##3 data sets will be shown in the result.
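The examples above sample the probability of having one copy of the minor allele when only a dosage is available. The following base-R sketch shows why such sampling is constrained: for a dosage d = p1 + 2 * p2 with p0 + p1 + p2 = 1, the value p1 = P(G = 1) can only lie in [0, min(d, 2 - d)]. The helper names `p1_range` and `probs_from_dosage` are hypothetical, and the uniform draw is purely illustrative; it is not the package's rejection sampling or dosage sampling scheme.

```r
# Bounds on p1 = P(G = 1) implied by a dosage d = p1 + 2 * p2,
# given p0 + p1 + p2 = 1 and all three probabilities in [0, 1].
p1_range <- function(d) {
  stopifnot(all(d >= 0 & d <= 2))
  cbind(lower = 0, upper = pmin(d, 2 - d))
}

# Complete the probability vector once a value of p1 has been chosen.
probs_from_dosage <- function(d, p1) {
  p2 <- (d - p1) / 2
  p0 <- 1 - p1 - p2
  cbind(p0 = p0, p1 = p1, p2 = p2)
}

d <- c(0.11, 1.2, 1.8)   # hypothetical dosages
p1_range(d)

# Illustration only: draw p1 uniformly within its feasible range.
set.seed(123)
p1 <- runif(length(d), min = 0, max = pmin(d, 2 - d))
p <- probs_from_dosage(d, p1)

# Every completed row is a valid probability vector that reproduces d.
rowSums(p)
as.vector(p %*% c(0, 1, 2))
```

A dosage near 0, 1, or 2 leaves little freedom (for d = 0.11, p1 cannot exceed 0.11), whereas intermediate dosages such as 1.2 are compatible with a wider range of probability vectors.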

EM_LRT_PROB        EM-LRT-Prob

Description

    An expectation-maximization (EM) likelihood-ratio test (LRT) for association based on posterior probabilities.

Usage

    EM_LRT_PROB(param0, r2, q, data)

Arguments

    param0    Initial values for parameters (beta0, beta1, gamma1,..., sde).
    r2        Squared Pearson's correlation between true genotype (G) and dosage (D).
    q         Minor allele frequency.
    data      Data contains phenotype (Y), dosage (D), and covariates (Z). See data.gen() for more details.

Value

    chisq    Chi-squared test statistic for association test
    pval     P-value for association test

See Also

    Mixture, data.gen

Examples

    ## 1 Generate data ##
    n<-2000
    r2<-0.3
    q<-0.4
    beta<-c(1,1.5)
    gamma<-c(1,2,3)
    sde<-1
    data<-data.gen(n,r2,q,beta,sde,gamma)

    ## 2 Mask true genotype ##
    data_scenario1<-list(y=data$y,d=data$d,p=data$p,z=data$z)

    ## 3 Generate initial values ##
    lr2<-lm(y~d+z,data_scenario1)
    beta_gamma0<-lr2$coef
    sde0<-summary(lr2)$sigma
    param0<-c(beta_gamma0,sde0)

    ## 4 Apply method ##
    ##Scenario I
    prob<-EM_LRT_PROB(param0,r2,q,data_scenario1) ##EM-LRT-PROB

Mixture        Mixture (Zheng et al.)

Description

    Mixture of regression model described in Zheng et al. (2011).

Usage

    Mixture(param0, r2, q, data)

Arguments

    param0    Initial values for parameters (beta0, beta1, gamma1,..., sde).
    r2        Squared Pearson's correlation between true genotype (G) and dosage (D).
    q         Minor allele frequency.
    data      Data contains phenotype (Y), probabilities (P), and covariates (Z).

Value

    chisq    Chi-squared test statistic for association.
    pval     P-value for association.

References

    Zheng, J., et al., A comparison of approaches to account for uncertainty in analysis of imputed genotypes. Genetic Epidemiology, 2011. 35(2): p. 102-10.

See Also

    EM_LRT_PROB, data.gen

Examples

    ## 1 Generate data ##
    n<-2000
    r2<-0.5
    q<-0.4
    beta<-c(1,1.5)
    gamma<-c(1,2,3)
    sde<-1
    data<-data.gen(n,r2,q,beta,sde,gamma)

    ## 2 Mask true genotype ##
    data_scenario1<-list(y=data$y,d=data$d,p=data$p,z=data$z)

    ## 3 Generate initial values ##
    lr2<-lm(y~d+z,data_scenario1)
    beta_gamma0<-lr2$coef
    sde0<-summary(lr2)$sigma
    param0<-c(beta_gamma0,sde0)

    ## 4 Apply method ##
    ##Scenario I
    mixture<-Mixture(param0,r2,q,data_scenario1) ##Mixture (Zheng et al.)
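For readers who want to see the general idea behind a mixture-of-regressions likelihood-ratio test, here is a minimal base-R sketch. It is not the package's implementation: it maximizes the likelihood directly with `optim` rather than with an EM algorithm, omits covariates, uses hypothetical posterior probabilities, and assumes the test statistic is referred to a chi-square distribution with 1 degree of freedom. The function name `negloglik` is also an assumption made for illustration.

```r
set.seed(1)
n <- 500
q <- 0.4                                # minor allele frequency
g <- rbinom(n, size = 2, prob = q)      # true genotype (unobserved in practice)
y <- 1 + 0.5 * g + rnorm(n)             # phenotype; no covariates, for brevity

# Hypothetical posterior probabilities: 0.8 on the true genotype, 0.1 elsewhere.
p <- matrix(0.1, nrow = n, ncol = 3)
p[cbind(seq_len(n), g + 1)] <- 0.8

# Negative log-likelihood of the mixture: each y_i has density
#   sum over k = 0, 1, 2 of p[i, k + 1] * N(y_i; b0 + b1 * k, sde^2).
# Under the null hypothesis the genotype effect b1 is fixed at 0.
negloglik <- function(par, y, p, null = FALSE) {
  b0 <- par[1]
  b1 <- if (null) 0 else par[2]
  sde <- exp(par[length(par)])          # log scale keeps sde positive
  dens <- sapply(0:2, function(k) dnorm(y, mean = b0 + b1 * k, sd = sde))
  -sum(log(rowSums(p * dens)))
}

# Fit the alternative (b0, b1, log sde) and the null (b0, log sde) models.
fit1 <- optim(c(mean(y), 0, log(sd(y))), negloglik, y = y, p = p)
fit0 <- optim(c(mean(y), log(sd(y))), negloglik, y = y, p = p, null = TRUE)

# Likelihood-ratio statistic and p-value (1 degree of freedom assumed).
chisq <- 2 * (fit0$value - fit1$value)
pval <- pchisq(chisq, df = 1, lower.tail = FALSE)
c(chisq = chisq, pval = pval)
```

The sketch returns the same two quantities, chisq and pval, that the Value sections list, which may help when interpreting the output of the package functions.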
