Package SimHap. February 15, 2013
Type Package
Title A comprehensive modeling framework for epidemiological outcomes and a simulation-based approach to haplotypic analysis of population-based data
Version
Date
Depends stats, survival, nlme
Author Pamela A. McCaskie
Maintainer Matthew Kowgier <matthew.kowgier@oicr.on.ca>
Description SimHap is a package for genetic association testing. It can perform single SNP and multi-locus (haplotype) association analyses for continuous Normal, binary, longitudinal and right-censored outcomes measured in population-based samples. SimHap uses expectation-maximisation (EM) techniques to infer haplotypic phase in individuals, and incorporates a novel simulation-based approach to deal with the uncertainty of imputed haplotypes in association testing.
License GPL-2
Repository CRAN
Date/Publication :39:28
NeedsCompilation no

R topics documented:

epi.bin
epi.cc.match
epi.long
epi.quant
epi.surv
haplo.bin
haplo.cc.match
haplo.long
haplo.quant
haplo.surv
infer.haplos
infer.haplos.cc
longpheno.dat
make.haplo.rare
pheno.dat
prepare.cc
snp.bin
snp.cc.match
SNP.dat
snp.long
snp.quant
snp.surv
SNP2Geno
SNP2Haplo
SNPlong.dat
SNPsurv.dat
summary.epibin
summary.epiclogit
summary.epilong
summary.epiquant
summary.episurv
summary.hapbin
summary.hapclogit
summary.haplong
summary.hapquant
summary.hapsurv
summary.snpbin
summary.snpclogit
summary.snplong
summary.snpquant
summary.snpsurv
survpheno.dat
Index


epi.bin    Epidemiological analysis for binary outcomes

Description

epi.bin is used to fit generalized linear regression models to epidemiological phenotype data for a binary outcome, assuming a binomial error distribution and logit link function.

Usage

    epi.bin(formula, pheno, sub = NULL)

Arguments

formula    a symbolic description of the model to be fit. The details of model specification are given below.
pheno      a dataframe containing phenotype data.
sub        an expression representing a subset of the data on which to fit the model.

Details

formula should be in the form outcome ~ predictor(s). A formula has an implied intercept term. See the documentation for the formula function for more details of allowed formulae.

Value

epi.bin returns an object of class epibin containing the following items:

formula    formula passed to epi.bin.
results    a table containing the odds ratios, confidence intervals and p-values of the parameter estimates.
fit.glm    a glm object fit using formula.
ANOD       analysis of deviance table for the model fit using formula.
loglik     the log-likelihood for the model fit using formula.
AIC        Akaike Information Criterion for the model fit using formula.

Author(s)

Pamela A. McCaskie

References

Dobson, A.J. (1990) An Introduction to Generalized Linear Models. London: Chapman and Hall.
Hastie, T.J., Pregibon, D. (1992) Generalized linear models. Chapter 6 of Statistical Models in S, eds Chambers, J.M., Hastie, T.J., Wadsworth & Brooks/Cole.
McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple-imputation approach to haplotypic analysis of population-based data, [online]
McCullagh, P., Nelder, J.A. (1989) Generalized Linear Models. London: Chapman and Hall.
Venables, W.N., Ripley, B.D. (2002) Modern Applied Statistics with S. New York: Springer.

See Also

snp.bin, haplo.bin, epi.quant
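
The kind of model epi.bin is described as fitting (a binomial GLM with a logit link, reported as odds ratios with confidence intervals and p-values) can be sketched with base R alone. This is only an illustration on simulated data: the data frame toy, its columns, and the use of Wald intervals are assumptions, not SimHap's actual implementation.

```r
# Simulated stand-in data; column names only mirror the manual's example.
set.seed(1)
n <- 200
toy <- data.frame(age = rnorm(n, 55, 10), sbp = rnorm(n, 130, 15))
toy$plaque <- rbinom(n, 1, plogis(-6 + 0.05 * toy$age + 0.02 * toy$sbp))

# Binomial error distribution with logit link, as in the Description.
fit <- glm(plaque ~ age + sbp, family = binomial(link = "logit"), data = toy)

est <- coef(summary(fit))               # estimates, SEs, z values, p-values
z   <- qnorm(0.975)
or_table <- cbind(
  OR      = exp(est[, "Estimate"]),
  lower95 = exp(est[, "Estimate"] - z * est[, "Std. Error"]),  # Wald interval (assumed)
  upper95 = exp(est[, "Estimate"] + z * est[, "Std. Error"]),
  p       = est[, "Pr(>|z|)"]
)
print(or_table)

# Counterparts of the loglik, AIC and ANOD items listed under Value.
print(logLik(fit))
print(AIC(fit))
print(anova(fit, test = "Chisq"))
```

Exponentiating the coefficient and its interval endpoints is what turns the log-odds scale of the GLM into the odds-ratio scale of the results table.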

Examples

    data(pheno.dat)
    mymodel <- epi.bin(formula=plaque~age+sbp, pheno=pheno.dat)

    # example with a subsetting variable, looking at males only
    mymodel <- epi.bin(formula=plaque~age+sbp, pheno=pheno.dat,
        sub=expression(sex==1))


epi.cc.match    Epidemiological analysis for matched case-control data

Description

epi.cc.match is used to fit conditional logistic regression models to matched case-control data.

Usage

    epi.cc.match(formula, pheno, sub = NULL)

Arguments

formula    a symbolic description of the model to be fit. The response must be a binary indicator of case-control status, and the formula must contain a variable indicating strata, or the matching sequence.
pheno      a dataframe containing phenotype data.
sub        an expression representing a subset of the data on which to fit the model.

Details

formula should be in the form: response ~ predictor(s) + strata(strata_variable).

Value

epi.cc.match returns an object of class epiclogit containing the following items:

results      a table containing the odds ratios, confidence intervals and p-values of the parameter estimates.
formula      formula passed to epi.cc.match.
Wald         the Wald test for overall significance of the fitted model.
loglik       the log-likelihood for the model fit using formula.
fit.coxph    an object of class clogit fit using formula. See clogit for details.
rsquared     r-squared values for the fitted model.
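
A conditional logistic regression of the form response ~ predictor(s) + strata(strata_variable) can be sketched directly with clogit from the survival package, which the manual points to. The 1:1 matched data below are simulated, and the column names and Wald intervals are assumptions made for illustration only.

```r
library(survival)

# Simulated 1:1 matched sets: one case and one control per stratum.
set.seed(2)
n_sets <- 100
toy <- data.frame(
  strat   = rep(seq_len(n_sets), each = 2),
  disease = rep(c(1, 0), times = n_sets)
)
toy$sbp <- rnorm(nrow(toy), mean = 130 + 5 * toy$disease, sd = 15)

# strata() tells clogit which observations form a matched set.
fit <- clogit(disease ~ sbp + strata(strat), data = toy)

se <- sqrt(diag(fit$var))
or <- exp(coef(fit))
lower95 <- exp(coef(fit) - qnorm(0.975) * se)   # Wald interval (assumed)
upper95 <- exp(coef(fit) + qnorm(0.975) * se)
print(cbind(OR = or, lower95, upper95))
```

Because the likelihood is conditioned on each matched set, no intercept or stratum effect appears among the coefficients; only the within-set contrasts in sbp inform the odds ratio.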

Author(s)

Pamela A. McCaskie

References

McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple-imputation approach to haplotypic analysis of population-based data, [online]

See Also

epi.bin, clogit

Examples

    data(pheno.dat)
    mymodel <- epi.cc.match(formula=disease~sbp+dbp+strata(strat),
        pheno=pheno.dat)

    # example with subsetting variable
    mymodel <- epi.cc.match(formula=disease~sbp+dbp+strata(strat),
        pheno=pheno.dat, sub=expression(sex==1))


epi.long    Epidemiological analysis for longitudinal data

Description

epi.long is used to fit linear mixed effects models to epidemiological data with longitudinal outcomes.

Usage

    epi.long(fixed, random, pheno, cor="corCAR1", value = 0.2,
        form=~1, sub = NULL)

Arguments

fixed      as per lme. A two-sided linear formula object describing the fixed-effects part of the model, with the response on the left of a ~ operator and the terms, separated by + operators, on the right.
random     as per lme. A one-sided formula of the form ~x1+...+xn | g1/.../gm, with x1+...+xn specifying the model for the random effects and g1/.../gm the grouping structure (m may be equal to 1, in which case no / is required). The random effects formula will be repeated for all levels of grouping, in the case of multiple levels of grouping.
pheno      a dataframe containing phenotype data.
cor        a corStruct object describing the within-group correlation structure. Available correlation structures are corAR1, corCAR1, and corCompSymm. See the documentation of corClasses for a description of these. Defaults to corCAR1.
value      for corAR1, the value of the lag 1 autocorrelation, which must be between -1 and 1. For corCAR1, the correlation between two observations one unit of time apart, which must be between 0 and 1. For corCompSymm, the correlation between any two correlated observations. Defaults to 0.2.
form       a one-sided formula of the form ~ t, or ~ t | g, specifying a time covariate t and, optionally, a grouping factor g. A covariate for this correlation structure must be integer valued. When a grouping factor is present in form, the correlation structure is assumed to apply only to observations within the same grouping level; observations with different grouping levels are assumed to be uncorrelated. Defaults to ~ 1, which corresponds to using the order of the observations in the data as a covariate, and no groups.
sub        an expression representing a subset of the data on which to fit the model.

Details

cor always defaults to corCAR1 and value always defaults to 0.2. If a different correlation structure is wanted, change both parameters accordingly. See corClasses for more details.

Value

epi.long returns an object of class epilong. The summary function can be used to obtain and print a summary of the results.

An object of class epilong is a list containing the following components:

results           a table containing the coefficients, standard errors and p-values of the parameter estimates.
fixed_formula     fixed effects formula.
random_formula    random effects formula.
fit.lme           an lme object fit using formula.
ANOD              analysis of deviance table for the fitted model.
loglik            the log-likelihood for the fitted model.
AIC               Akaike Information Criterion for the model fit using formula.
corstruct         correlation structure used in the fitted model.

Author(s)

Pamela A. McCaskie

References

Bates, D.M., Pinheiro, J.C. (1998) Computational methods for multilevel models. Available in PostScript or PDF formats at
Box, G.E.P., Jenkins, G.M., Reinsel, G.C. (1994) Time Series Analysis: Forecasting and Control, 3rd Edition, Holden-Day.
Davidian, M., Giltinan, D.M. (1995) Nonlinear Mixed Effects Models for Repeated Measurement Data, Chapman and Hall.
Laird, N.M., Ware, J.H. (1982) Random-Effects Models for Longitudinal Data, Biometrics, 38,
Lindstrom, M.J., Bates, D.M. (1988) Newton-Raphson and EM Algorithms for Linear Mixed-Effects Models for Repeated-Measures Data, Journal of the American Statistical Association, 83,
Littell, R.C., Milliken, G.A., Stroup, W.W., Wolfinger, R.D. (1996) SAS Systems for Mixed Models, SAS Institute.
McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple imputation approach to haplotypic analysis of population-based data, [online]
Pinheiro, J.C., Bates, D.M. (1996) Unconstrained Parametrizations for Variance-Covariance Matrices, Statistics and Computing, 6,
Pinheiro, J.C., Bates, D.M. (2000) Mixed-Effects Models in S and S-PLUS, Springer.

See Also

snp.long, haplo.long, corClasses

Examples

    data(longpheno.dat)
    mymodel <- epi.long(fixed=fev1f~height+weight, random=~1|ID,
        pheno=longpheno.dat, form=~year|ID)

    # example with a subsetting variable, looking at males only
    mymodel <- epi.long(fixed=fev1f~height+weight, random=~1|ID,
        pheno=longpheno.dat, form=~year|ID, sub=expression(sex==1))
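
The meaning of value under the three correlation structures named in the cor argument can be made concrete by writing out the implied within-subject correlation matrices by hand. The sketch below uses base R only and a hypothetical set of observation times; it reproduces the textbook definitions of the AR(1), continuous-time AR(1) and compound-symmetry structures rather than calling nlme or SimHap.

```r
value <- 0.2                  # the default given for the value argument
times <- c(0, 1, 2, 4)        # hypothetical observation times for one subject
k <- length(times)

# Continuous-time AR(1): correlation decays with the actual time gap.
car1 <- value ^ abs(outer(times, times, "-"))

# AR(1): correlation decays with the lag in observation order.
ar1 <- value ^ abs(outer(seq_len(k), seq_len(k), "-"))

# Compound symmetry: every pair of observations shares the same correlation.
compsymm <- matrix(value, nrow = k, ncol = k)
diag(compsymm) <- 1

print(round(car1, 4))
print(round(ar1, 4))
print(compsymm)
```

With unequally spaced visits the first and last observations are four time units apart, so the continuous-time structure gives them correlation 0.2^4, whereas the order-based AR(1) treats them as three lags apart and gives 0.2^3.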


epi.quant    Epidemiological analysis for quantitative outcomes

Description

epi.quant is used to fit linear regression models to epidemiological phenotype data for a continuous Normal outcome.

Usage

    epi.quant(formula, pheno, sub = NULL)

Arguments

formula    a symbolic description of the full model to be fit. The details of model specification are given below.
pheno      a dataframe containing phenotype data.
sub        an expression representing a subset of the data on which to fit the model.

Details

formula should be in the form response ~ predictor(s). A formula has an implied intercept term. See the documentation for the formula function for more details of allowed formulae.

Value

epi.quant returns an object of class epiquant containing the following items:

formula    formula passed to epi.quant.
results    a table containing the coefficients, standard errors and p-values of the parameter estimates.
fit.lm     an lm object fit using formula.
ANOD       analysis of deviance table for the model fit using formula.
loglik     the log-likelihood for the linear model fit using formula.
AIC        Akaike Information Criterion for the linear model fit using formula.

Author(s)

Pamela A. McCaskie

References

Chambers, J.M. (1992) Linear models. Chapter 4 of Statistical Models in S, eds Chambers, J.M., Hastie, T.J., Wadsworth & Brooks/Cole.
McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple imputation approach to haplotypic analysis of population-based data, [online]
Wilkinson, G.N., Rogers, C.E. (1973) Symbolic descriptions of factorial models for analysis of variance. Applied Statistics, 22,

See Also

snp.quant, haplo.quant, epi.bin

Examples

    data(pheno.dat)
    mymodel <- epi.quant(formula=ldl~age+sbp, pheno=pheno.dat)

    # example with a subsetting variable, looking at males only
    mymodel <- epi.quant(formula=ldl~age+sbp, pheno=pheno.dat,
        sub=expression(sex==1))


epi.surv    Epidemiological analysis for survival data

Description

epi.surv is used to fit Cox proportional hazards models to epidemiological survival data.

Usage

    epi.surv(formula, pheno, sub = NULL)

Arguments

formula    a symbolic description of the model to be fit. The response must be a survival object as returned by the Surv function.
pheno      a dataframe containing phenotype data.
sub        an expression representing a subset of the data on which to fit the model.

Details

formula should be in the form response ~ predictor(s). A formula has an implied intercept term. See the documentation for the formula function for more details of allowed formulae.

Value

epi.surv returns an object of class episurv containing the following items:

results      a table containing the hazard ratios, confidence intervals and p-values of the parameter estimates.
formula      formula passed to epi.surv.
Wald         the Wald test for overall significance of the fitted model.
loglik       the log-likelihood for the model fit using formula.
fit.coxph    an object of class coxph fit using formula. See coxph.object for details.
rsquared     r-squared values for the fitted model.

Author(s)

Pamela A. McCaskie

References

Andersen, P., Gill, R. (1982) Cox's regression model for counting processes, a large sample study, Annals of Statistics, 10:
McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple imputation approach to haplotypic analysis of population-based data, [online]
Therneau, T., Grambsch, P., Fleming, T. Martingale based residuals for survival models, Biometrika, 77(1):

See Also

snp.surv, haplo.surv

Examples

    data(survpheno.dat)
    mymodel <- epi.surv(formula=Surv(time, status)~age, pheno=survpheno.dat)

    # example with subsetting variable
    mymodel <- epi.surv(formula=Surv(time, status)~age, pheno=survpheno.dat,
        sub=expression(sex==1))
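
The requirement that the response be a survival object built with Surv (capital S, from the survival package) is easiest to see in a direct Cox fit. The following sketch uses simulated right-censored data; the data frame toy, the simulation settings and the Wald intervals are assumptions for illustration and are not taken from SimHap.

```r
library(survival)

# Simulated right-censored data; names mirror the manual's example only.
set.seed(5)
n <- 150
toy <- data.frame(age = rnorm(n, 60, 8))
t_event <- rexp(n, rate = 0.05 * exp(0.04 * (toy$age - 60)))
t_cens  <- rexp(n, rate = 0.03)
toy$time   <- pmin(t_event, t_cens)            # observed follow-up time
toy$status <- as.integer(t_event <= t_cens)    # 1 = event, 0 = censored

# The response is a survival object returned by Surv().
fit <- coxph(Surv(time, status) ~ age, data = toy)

se <- sqrt(diag(fit$var))
hr_table <- cbind(
  HR      = exp(coef(fit)),
  lower95 = exp(coef(fit) - qnorm(0.975) * se),  # Wald interval (assumed)
  upper95 = exp(coef(fit) + qnorm(0.975) * se),
  p       = 2 * pnorm(-abs(coef(fit) / se))
)
print(hr_table)
```

The loglik component of a coxph fit holds two values, the log partial likelihood at the initial and at the final coefficient estimates, which is worth knowing when comparing nested models.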


haplo.bin    Haplotype analysis for a binary trait

Description

haplo.bin fits a series of generalized linear models using a simulation-based approach to account for uncertainty in haplotype assignment when phase is unknown.

Usage

    haplo.bin(formula1, formula2, pheno, haplo, sim, effect = "add",
        sub = NULL, adjust=FALSE)

Arguments

formula1    a symbolic description of the full model to be fit, including haplotype parameters. The details of model specification are given below.
formula2    a symbolic description of the nested model excluding haplotype parameters, to be compared to formula1 in a likelihood ratio test.
pheno       a phenotype data set.
haplo       a haplo object made by make.haplo.rare. The subjects must be in the same order as they are in the phenotype data.
sim         the number of simulations from which to evaluate the results.
effect      the genetic effect type: "add" for additive, "dom" for dominant and "rec" for recessive. Defaults to additive. See note.
sub         an expression representing a subset of the data on which to fit the models.
adjust      a logical flag. If adjust=TRUE, the adjusted degrees of freedom is used. This is recommended when the computed degrees of freedom is larger than the complete data degrees of freedom. By default, adjust=FALSE.

Details

formula1 should be in the form outcome ~ predictor(s) + haplotype(s) and formula2 should be in the form outcome ~ predictor(s). A formula has an implied intercept term. See the documentation for the formula function for more details of allowed formulae.

Value

haplo.bin returns an object of class hapbin. The summary function can be used to obtain and print a summary of the results.

An object of class hapbin is a list containing the following components:

formula1            formula1 passed to haplo.bin.
formula2            formula2 passed to haplo.bin.
results             a table containing the coefficients, averaged over the sim models performed; standard errors, computed from the sum of the between-imputation and within-imputation variances; and p-values, based on a t-distribution with appropriately computed degrees of freedom, of the parameter estimates.
empiricalresults    a list containing the odds ratios, confidence intervals and p-values calculated at each simulation.
ANOD                analysis of deviance table for the model fit using formula1, averaged over all simulations.
loglik              the average log-likelihood for the generalized linear model fit using formula1.
WALD                a Wald test, testing for significant improvement of the model when haplotypic parameters are included.
aic                 Akaike Information Criterion for the generalized linear model fit using formula1, averaged over all simulations.
aicpredicted        Akaike Information Criteria calculated at each simulation.
effect              the haplotypic effect modelled, ADDITIVE, DOMINANT or RECESSIVE.

Note

To model a codominant haplotypic effect, define the desired haplotype as a factor in the formula1 argument, e.g. factor(h.aaa), and use the default option for effect.

Author(s)

Pamela A. McCaskie

References

Barnard, J., Rubin, D.B. (1999) Small-sample degrees of freedom with multiple imputation. Biometrika, 86,
Dobson, A.J. (1990) An Introduction to Generalized Linear Models. London: Chapman and Hall.
Hastie, T.J., Pregibon, D. (1992) Generalized linear models. Chapter 6 of Statistical Models in S, eds Chambers, J.M., Hastie, T.J., Wadsworth & Brooks/Cole.
Little, R.J.A., Rubin, D.B. (2002) Statistical Analysis with Missing Data. John Wiley and Sons, New Jersey.
McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple-imputation approach to haplotypic analysis of population-based data, [online]
McCullagh, P., Nelder, J.A. (1989) Generalized Linear Models. London: Chapman and Hall.
Rubin, D.B. (1996) Multiple imputation after 18+ years (with discussion). Journal of the American Statistical Association, 91:
Venables, W.N., Ripley, B.D. (2002) Modern Applied Statistics with S. New York: Springer.

See Also

snp.bin, haplo.quant, haplo.long
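
The results component is described as averaging coefficients over the sim models and combining between-imputation and within-imputation variance, with p-values from a t-distribution. One standard way to do this is Rubin's rules for multiple imputation, sketched below in base R. The function pool_rubin is hypothetical, and the (1 + 1/m) inflation of the between-imputation variance is the textbook form; the manual does not state the exact weighting SimHap uses.

```r
# Pool one coefficient across m simulated (imputed) data sets.
# est: per-simulation estimates; se: per-simulation standard errors.
pool_rubin <- function(est, se) {
  m <- length(est)
  stopifnot(m >= 2, length(se) == m)
  qbar    <- mean(est)                    # pooled estimate
  within  <- mean(se^2)                   # within-imputation variance
  between <- var(est)                     # between-imputation variance
  total   <- within + (1 + 1 / m) * between
  # Rubin's degrees of freedom for the reference t-distribution.
  df <- (m - 1) * (1 + within / ((1 + 1 / m) * between))^2
  t_stat <- qbar / sqrt(total)
  list(estimate = qbar, se = sqrt(total), total_var = total, df = df,
       p.value = 2 * pt(-abs(t_stat), df))
}

pooled <- pool_rubin(est = c(1, 2, 3), se = sqrt(c(0.5, 0.5, 0.5)))
print(pooled)
```

When the estimates barely change from one simulation to the next, the between-imputation variance is small, the degrees of freedom grow large, and the pooled test approaches an ordinary normal-theory test.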

Examples

    data(SNP.dat)

    # convert SNP.dat to format required by infer.haplos
    haplo.dat <- SNP2Haplo(SNP.dat)

    data(pheno.dat)

    # generate haplotype frequencies and haplotype design matrix
    myinfer<-infer.haplos(haplo.dat)

    # print haplotype frequencies generated by infer.haplos
    myinfer$hap.freq

    # generate haplo object where haplotypes with a frequency
    # below min.freq are grouped as a category called "rare"
    myhaplo<-make.haplo.rare(myinfer,min.freq=0.05)

    mymodel <- haplo.bin(formula1=plaque~age+sbp+h.n1aa,
        formula2=plaque~age+sbp, pheno=pheno.dat, haplo=myhaplo, sim=10)

    # example with a subsetting variable, looking at males only
    # and modelling a dominant haplotypic effect
    mymodel <- haplo.bin(formula1=plaque~age+sbp+h.n1aa,
        formula2=plaque~age+sbp, pheno=pheno.dat, haplo=myhaplo, sim=10,
        effect="dom", sub=expression(sex==1))


haplo.cc.match    Haplotype analysis for matched case-control data

Description

haplo.cc.match fits a series of conditional logistic regression models to matched case-control data with haplotypes, using a simulation-based approach to account for uncertainty in haplotype assignment when phase is unknown.

Usage

    haplo.cc.match(formula1, formula2, pheno, haplo, sim, effect = "add",
        sub = NULL)

Arguments

formula1    a symbolic description of the full model to be fit, including haplotype parameters. The response must be a binary indicator of case-control status, and the formula must contain a variable indicating strata, or the matching sequence.
formula2    a symbolic description of the nested model excluding haplotype parameters, to be compared to formula1 in a likelihood ratio test. The response must be a binary indicator of case-control status, and the formula must contain a variable indicating strata, or the matching sequence.
pheno       a dataframe containing phenotype data.
haplo       a haplotype object made by make.haplo.rare.
sim         the number of simulations from which to evaluate the results.
effect      the genetic effect type: "add" for additive, "dom" for dominant and "rec" for recessive. Defaults to additive. See note.
sub         optional. An expression using a binary operator, representing a subset of individuals on which to perform analysis, e.g. sub=expression(sex==1).

Details

formula1 should be in the form: response ~ predictor(s) + strata(strata_variable) + haplotype(s) and formula2 should be in the form: response ~ predictor(s) + strata(strata_variable). If case-control data is not matched, the haplo.bin function should be used.

Value

haplo.cc.match returns an object of class hapclogit. The summary function can be used to obtain and print a summary of the results.

An object of class hapclogit is a list containing the following components:

formula1            formula1 passed to haplo.cc.match.
formula2            formula2 passed to haplo.cc.match.
results             a table containing the odds ratios, confidence intervals and p-values of the parameter estimates, averaged over the n=sim models performed.
empiricalresults    a list containing the odds ratios, confidence intervals and p-values calculated at each simulation.
loglik              the average log-likelihood for the n=sim models fit using formula1.
LRT                 a likelihood ratio test, testing for significant improvement of the model when haplotypic parameters are included.
ANOVA               analysis of variance, comparing the two models fit with and without haplotypic parameters.
Wald                the Wald test for overall significance of the fitted model including haplotypes.
rsquared            r-squared values for models fit using formula1 and formula2.
effect              the haplotypic effect modelled, ADDITIVE, DOMINANT or RECESSIVE.

Note

To model a codominant haplotypic effect, define the desired haplotype as a factor in the formula1 argument, e.g. factor(h.aaa), and use the default option for effect.

Author(s)

Pamela A. McCaskie

References

Little, R.J.A., Rubin, D.B. (2002) Statistical Analysis with Missing Data. John Wiley and Sons, New Jersey.
McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple-imputation approach to haplotypic analysis of population-based data, [online]
Rubin, D.B. (1996) Multiple imputation after 18+ years (with discussion). Journal of the American Statistical Association, 91:

See Also

snp.cc.match, haplo.bin

Examples

    data(SNP.dat)

    # convert SNP.dat to format required by infer.haplos
    haplo.dat <- SNP2Haplo(SNP.dat)

    data(pheno.dat)

    newdata <- prepare.cc(geno=haplo.dat, pheno=pheno.dat, cc.var="disease")
    newhaplo.dat <- newdata$geno
    newpheno.dat <- newdata$pheno

    # generates haplotype frequencies and haplotype design matrix
    myinfer<-infer.haplos.cc(geno=newhaplo.dat, pheno=newpheno.dat,
        cc.var="disease")

    # prints haplotype frequencies among cases
    myinfer$hap.freq.cases

    # prints haplotype frequencies among controls
    myinfer$hap.freq.controls

    # generate haplo object where haplotypes with a frequency
    # below min.freq are grouped as a category called "rare"
    myhaplo<-make.haplo.rare(myinfer,min.freq=0.05)

    mymodel <- haplo.cc.match(formula1=disease~sbp+dbp+h.n1aa+strata(strat),
        formula2=disease~sbp+dbp+strata(strat), haplo=myhaplo,
        pheno=pheno.dat, sim=10)

    # example using a subsetting variable - looking at males only
    mymodel <- haplo.cc.match(formula1=disease~sbp+dbp+h.n1aa+strata(strat),
        formula2=disease~sbp+dbp+strata(strat), haplo=myhaplo,
        pheno=pheno.dat, sim=10, sub=expression(sex==1))


haplo.long    Haplotype analysis for longitudinal data

Description

haplo.long fits a series of linear mixed effects models using a simulation-based approach to account for uncertainty in haplotype assignment when phase is unknown.

Usage

    haplo.long(fixed, random, pheno, haplo, cor=NULL, value = 0.2,
        form=~1, sim, effect = "add", sub = NULL, adjust=FALSE)

Arguments

fixed     as per lme. A two-sided linear formula object describing the fixed-effects part of the model including haplotype parameters, with the response on the left of a ~ operator and the terms, separated by + operators, on the right.
random    as per lme. A one-sided formula of the form ~x1+...+xn | g1/.../gm, with x1+...+xn specifying the model for the random effects and g1/.../gm the grouping structure (m may be equal to 1, in which case no / is required). The random effects formula will be repeated for all levels of grouping, in the case of multiple levels of grouping.
pheno     a dataframe containing phenotype data.
haplo     a haplotype object made by make.haplo.rare. The subjects must be in the same order as they are in the phenotype data.
cor       a corStruct object describing the within-group correlation structure. Available correlation structures are corAR1, corCAR1, and corCompSymm. See the documentation of corClasses for a description of these. Defaults to NULL, corresponding to no within-subject correlation.
value     for corAR1, the value of the lag 1 autocorrelation, which must be between -1 and 1. For corCAR1, the correlation between two observations one unit of time apart, which must be between 0 and 1. For corCompSymm, the correlation between any two correlated observations. Defaults to 0.2.
form      a one-sided formula of the form ~ t, or ~ t | g, specifying a time covariate t and, optionally, a grouping factor g. A covariate for this correlation structure must be integer valued. When a grouping factor is present in form, the correlation structure is assumed to apply only to observations within the same grouping level; observations with different grouping levels are assumed to be uncorrelated. Defaults to ~ 1, which corresponds to using the order of the observations in the data as a covariate, and no groups.
sim       the number of simulations from which to evaluate the results.
effect    the haplotypic effect type: "add" for additive, "dom" for dominant and "rec" for recessive. Defaults to additive. See note.
sub       optional. An expression representing a subset of individuals on which to perform analysis, e.g. sub=expression(sex==1).
adjust    a logical flag. If adjust=TRUE, the adjusted degrees of freedom is used. This is recommended when the computed degrees of freedom is larger than the complete data degrees of freedom. By default, adjust=FALSE.

Value

haplo.long returns an object of class haplong. The summary function can be used to obtain and print a summary of the results.

An object of class haplong is a list containing the following components:

fixed_formula       fixed effects formula.
random_formula      random effects formula.
results             a table containing the coefficients, averaged over the sim models performed; standard errors, computed from the sum of the between-imputation and within-imputation variances; and p-values, based on a t-distribution with appropriately computed degrees of freedom, of the parameter estimates.
empiricalresults    a list containing the coefficients, standard errors and p-values calculated at each simulation.
ANOD                analysis of deviance table for the fitted model.
loglik              the log-likelihood for the fitted model.
AIC                 Akaike Information Criterion for the linear model fit using formula.
aicempirical        Akaike Information Criteria calculated at each simulation.
corstruct           correlation structure used in the fitted model.
effect              the haplotypic effect modelled, ADDITIVE, DOMINANT or RECESSIVE.

Note

To model a codominant haplotypic effect, define the desired haplotype as a factor in the fixed argument, e.g. factor(h.aaa), and use the default option for effect.

Author(s)

Pamela A. McCaskie
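
The adjust argument refers to an adjusted degrees of freedom that is preferred when the computed value exceeds the complete-data degrees of freedom. Since the reference list cites Barnard and Rubin (1999), a plausible reading is their small-sample adjustment, sketched below in base R. The function adjusted_df and its argument names are hypothetical, and whether SimHap computes exactly this quantity is an assumption.

```r
# Barnard-Rubin style small-sample degrees of freedom (assumed adjustment).
# m: number of simulations; between/within: variance components;
# df_complete: degrees of freedom the model would have with known haplotypes.
adjusted_df <- function(m, between, within, df_complete) {
  total  <- within + (1 + 1 / m) * between
  lambda <- (1 + 1 / m) * between / total        # share of variance due to imputation
  df_old <- (m - 1) / lambda^2                   # unadjusted (Rubin) df
  df_obs <- (df_complete + 1) / (df_complete + 3) * df_complete * (1 - lambda)
  1 / (1 / df_old + 1 / df_obs)                  # harmonic-type combination
}

adj <- adjusted_df(m = 3, between = 1, within = 0.5, df_complete = 10)
print(adj)
```

The combined value can never exceed df_complete, which is the property that motivates using the adjustment when the unadjusted degrees of freedom come out implausibly large.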

References

Barnard, J., Rubin, D.B. (1999) Small-sample degrees of freedom with multiple imputation. Biometrika, 86,
Bates, D.M., Pinheiro, J.C. (1998) Computational methods for multilevel models. Available in PostScript or PDF formats at
Box, G.E.P., Jenkins, G.M., Reinsel, G.C. (1994) Time Series Analysis: Forecasting and Control, 3rd Edition, Holden-Day.
Davidian, M., Giltinan, D.M. (1995) Nonlinear Mixed Effects Models for Repeated Measurement Data, Chapman and Hall.
Laird, N.M., Ware, J.H. (1982) Random-Effects Models for Longitudinal Data, Biometrics, 38,
Lindstrom, M.J., Bates, D.M. (1988) Newton-Raphson and EM Algorithms for Linear Mixed-Effects Models for Repeated-Measures Data, Journal of the American Statistical Association, 83,
Littell, R.C., Milliken, G.A., Stroup, W.W., Wolfinger, R.D. (1996) SAS Systems for Mixed Models, SAS Institute.
Little, R.J.A., Rubin, D.B. (2002) Statistical Analysis with Missing Data. John Wiley and Sons, New Jersey.
McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple imputation approach to haplotypic analysis of population-based data, [online]
Pinheiro, J.C., Bates, D.M. (1996) Unconstrained Parametrizations for Variance-Covariance Matrices, Statistics and Computing, 6,
Pinheiro, J.C., Bates, D.M. (2000) Mixed-Effects Models in S and S-PLUS, Springer.
Rubin, D.B. (1996) Multiple imputation after 18+ years (with discussion). Journal of the American Statistical Association, 91:

See Also

snp.long, haplo.quant, haplo.long

Examples

    data(SNPlong.dat)

    # convert SNPlong.dat to format required by infer.haplos
    longhaplo.dat <- SNP2Haplo(SNPlong.dat)

    data(longpheno.dat)

    # generate haplotype frequencies and haplotype design matrix
    myinfer<-infer.haplos(longhaplo.dat)

    # print haplotype frequencies generated by infer.haplos
    myinfer$hap.freq

    # generate haplo object where haplotypes with a frequency
    # below min.freq are grouped as a category called "rare"
    myhaplo<-make.haplo.rare(myinfer,min.freq=0.05)

    mymodel <- haplo.long(fixed=fev1f~h.acv2, random=~1|ID,
        pheno=longpheno.dat, haplo=myhaplo, cor="corCAR1",
        form=~year|ID, sim=10)

    # example with a subsetting variable - looking at males only
    mymodel <- haplo.long(fixed=fev1f~height+h.acv2, random=~1|ID,
        pheno=longpheno.dat, haplo=myhaplo, cor="corCAR1",
        form=~year|ID, sim=10, sub=expression(sex==1))


haplo.quant    Haplotype analysis for a Normally distributed quantitative trait

Description

haplo.quant fits a series of linear models using a simulation-based approach to account for uncertainty in haplotype assignment when phase is unknown.

Usage

    haplo.quant(formula1, formula2, pheno, haplo, sim, effect = "add",
        sub = NULL, predict_variable = NULL, adjust = FALSE)

Arguments

formula1            a symbolic description of the full model to be fit, including haplotype parameters. The details of model specification are given below.
formula2            a symbolic description of the nested model excluding haplotype parameters, to be compared to formula1 in a likelihood ratio test.
pheno               a dataframe containing phenotype data.
haplo               a haplotype object made by make.haplo.rare. The subjects must be in the same order as they are in the phenotype data.
sim                 the number of simulations from which to evaluate the results.
effect              the haplotypic effect type: "add" for additive, "dom" for dominant and "rec" for recessive. Defaults to additive. See note.
sub                 optional. An expression representing a subset of individuals on which to perform analysis, e.g. sub=expression(sex==1).
predict_variable    optional. The name of the variable, given as a character string (e.g. predict_variable="h.n1aa"), for which to predict marginal means.
adjust    a logical flag. If adjust=TRUE, the adjusted degrees of freedom are used. This is recommended when the computed degrees of freedom are larger than the complete-data degrees of freedom. By default, adjust=FALSE.

Details

formula1 should be in the form of response ~ predictor(s) + haplotype(s) and formula2 should be in the form response ~ predictor(s). A formula has an implied intercept term. See formula for more details of allowed formulae.

Value

haplo.quant returns an object of class hapquant. The summary function can be used to obtain and print a summary of the results. An object of class hapquant is a list containing the following components:

formula1    formula1 passed to haplo.quant.
formula2    formula2 passed to haplo.quant.
results    a table containing the coefficients, averaged over the sim models performed; standard errors, computed from the sum of the between-imputation and within-imputation variance; and p-values of the parameter estimates, based on a t-distribution with appropriately computed degrees of freedom.
empiricalresults    a list containing the coefficients, standard errors and p-values calculated at each simulation.
rsquared    r-squared values for models fit using formula1 and formula2.
ANOD    analysis of deviance table for the model fit using formula1, averaged over all simulations.
loglik    the average log-likelihood for the linear model fit using formula1.
WALD    a Wald test, testing for significant improvement of the model when haplotypic parameters are included.
predicted    estimated marginal means of the outcome variable broken down by haplotype levels, evaluated at mean values of the model predictors, averaged over all simulations.
empiricalpredicted    estimated marginal means calculated at each simulation.
aic    Akaike Information Criterion for the linear model fit using formula1, averaged over all simulations.
aicpredicted    Akaike Information Criteria calculated at each simulation.
effect    the haplotypic effect modelled: ADDITIVE, DOMINANT or RECESSIVE.

Note

To model a codominant haplotypic effect, define the desired haplotype as a factor in the formula1 argument, e.g. factor(h.aaa), and use the default option for effect.
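The pooling described for the results component can be sketched in base R. The block below applies the textbook form of Rubin's rules (Little and Rubin, 2002) to a single coefficient, with the Barnard and Rubin (1999) small-sample adjustment switched on by adjust. The function name pool_rubin, its arguments and the toy numbers are hypothetical, and this is the standard procedure rather than a confirmed copy of the computation inside haplo.quant.

```r
# Pool one coefficient over m simulated data sets (hypothetical helper).
# est: coefficient from each fit; se: its standard error in each fit.
# df_com: complete-data residual degrees of freedom, needed when adjust = TRUE.
pool_rubin <- function(est, se, df_com = NULL, adjust = FALSE) {
  m <- length(est)
  qbar <- mean(est)                     # pooled coefficient
  w <- mean(se^2)                       # within-imputation variance
  b <- var(est)                         # between-imputation variance
  total <- w + (1 + 1 / m) * b          # total variance
  lambda <- (1 + 1 / m) * b / total     # share of variance due to phase uncertainty
  df <- (m - 1) / lambda^2              # Rubin's degrees of freedom
  if (adjust) {
    # Barnard-Rubin adjustment: the result never exceeds df_com
    df_obs <- (df_com + 1) / (df_com + 3) * df_com * (1 - lambda)
    df <- 1 / (1 / df + 1 / df_obs)
  }
  t_stat <- qbar / sqrt(total)
  c(coef = qbar, se = sqrt(total), df = df,
    p = 2 * pt(abs(t_stat), df = df, lower.tail = FALSE))
}

# toy input: five simulated fits of the same coefficient
est <- c(0.10, 0.12, 0.08, 0.11, 0.09)
se <- rep(0.05, 5)
pooled <- pool_rubin(est, se)
pooled_adj <- pool_rubin(est, se, df_com = 175, adjust = TRUE)
```

With few simulations and little between-simulation variation, the unadjusted degrees of freedom can become very large, which is the situation in which the adjusted version is recommended above.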
Author(s)

Pamela A. McCaskie

References

Chambers, J.M. (1992) Linear models. Chapter 4 of Statistical Models in S, eds Chambers, J.M., Hastie, T.J., Wadsworth & Brooks/Cole.
Little, R.J.A., Rubin, D.B. (2002) Statistical Analysis with Missing Data. John Wiley and Sons, New Jersey.
McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple imputation approach to haplotypic analysis of population-based data, [online]
Rubin, D.B. (1996) Multiple imputation after 18+ years (with discussion). Journal of the American Statistical Association, 91.
Wilkinson, G.N., Rogers, C.E. (1973) Symbolic descriptions of factorial models for analysis of variance. Applied Statistics, 22.
Barnard, J., Rubin, D.B. (1999) Small-sample degrees of freedom with multiple imputation. Biometrika, 86.

See Also

haplo.bin

Examples

data(SNP.dat)

# convert SNP.dat to format required by infer.haplos
haplo.dat <- SNP2Haplo(SNP.dat)

data(pheno.dat)

# generate haplotype frequencies and haplotype design matrix
myinfer <- infer.haplos(haplo.dat)

# print haplotype frequencies generated by infer.haplos
myinfer$hap.freq

# generate haplo object where haplotypes with a frequency
# below min.freq are grouped as a category called "rare"
myhaplo <- make.haplo.rare(myinfer, min.freq=0.05)

mymodel <- haplo.quant(formula1=HDL~AGE+SBP+h.n1aa,
    formula2=HDL~AGE+SBP, pheno=pheno.dat, haplo=myhaplo, sim=10)

# example using a variable for which to predict marginal means
mymodel <- haplo.quant(formula1=HDL~AGE+SBP+factor(h.n1aa),
    formula2=HDL~AGE+SBP, pheno=pheno.dat, haplo=myhaplo, sim=10,
    predict_variable="h.n1aa")

# example with a subsetting variable, looking at males only
# and modelling a dominant haplotypic effect
mymodel <- haplo.quant(formula1=HDL~AGE+SBP+h.n1aa,
    formula2=HDL~AGE+SBP, pheno=pheno.dat, haplo=myhaplo, sim=10,
    effect="dom", sub=expression(SEX==1))

haplo.surv    Haplotype analysis for survival data

Description

haplo.surv fits a series of Cox proportional hazards models to survival data with haplotypes, using a simulation-based approach to account for uncertainty in haplotype assignment when phase is unknown.

Usage

haplo.surv(formula1, formula2, pheno, haplo, sim, effect = "add", sub = NULL)

Arguments

formula1    a symbolic description of the full model to be fit, including haplotype parameters. The response must be a survival object as returned by the Surv function.
formula2    a symbolic description of the nested model excluding haplotype parameters, to be compared to formula1 in a likelihood ratio test. The response must be a survival object as returned by the Surv function.
pheno    a dataframe containing phenotype data.
haplo    a haplotype object made by make.haplo.rare.
sim    the number of simulations from which to evaluate the results.
effect    the genetic effect type: "add" for additive, "dom" for dominant and "rec" for recessive. Defaults to additive. See note.
sub    optional. An expression using a binary operator, representing a subset of individuals on which to perform analysis, e.g. sub=expression(sex==1).

Details

formula1 should be in the form of response ~ predictor(s) + haplotype(s) and formula2 should be in the form response ~ predictor(s). A formula has an implied intercept term. See documentation for the formula function for more details of allowed formulae.
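The effect argument can be read as a recoding of the number of copies (0, 1 or 2) of a haplotype that an individual carries. The sketch below uses the conventional genetic definitions of additive, dominant and recessive coding; the helper name code_effect is hypothetical and is not a SimHap function, and the package's internal coding may differ in detail.

```r
# Recode haplotype copy number under a chosen genetic model (hypothetical helper).
code_effect <- function(copies, effect = c("add", "dom", "rec")) {
  effect <- match.arg(effect)
  stopifnot(all(copies %in% 0:2))
  switch(effect,
         add = copies,                    # one unit of effect per copy
         dom = as.integer(copies >= 1),   # any carrier versus non-carrier
         rec = as.integer(copies == 2))   # two copies versus fewer
}

copies <- c(0, 1, 2, 1, 0)
additive <- code_effect(copies, "add")
dominant <- code_effect(copies, "dom")
recessive <- code_effect(copies, "rec")

# a codominant effect keeps all three levels, which is what wrapping
# the haplotype in factor() within formula1 achieves
codominant <- factor(copies)
```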
Value

haplo.surv returns an object of class hapsurv. The summary function can be used to obtain and print a summary of the results. An object of class hapsurv is a list containing the following components:

formula1    formula1 passed to haplo.surv.
formula2    formula2 passed to haplo.surv.
results    a table containing the hazard ratios, confidence intervals and p-values of the parameter estimates, averaged over the n=sim models performed.
empiricalresults    a list containing the hazard ratios, confidence intervals and p-values calculated at each simulation.
loglik    the average log-likelihood for the n=sim linear models fit using formula1.
LRT    a likelihood ratio test, testing for significant improvement of the model when haplotypic parameters are included.
predicted    estimated marginal means of the outcome variable broken down by haplotype levels, evaluated at mean values of the model predictors, averaged over all simulations.
ANOVA    analysis of variance, comparing the two models fit with and without haplotypic parameters.
Wald    the Wald test for overall significance of the fitted model including haplotypes.
rsquared    r-squared values for models fit using formula1 and formula2.
effect    the haplotypic effect modelled: ADDITIVE, DOMINANT or RECESSIVE.

Note

To model a codominant haplotypic effect, define the desired haplotype as a factor in the formula1 argument, e.g. factor(h.aaa), and use the default option for effect.

Author(s)

Pamela A. McCaskie

References

Andersen, P., Gill, R. (1982) Cox's regression model for counting processes, a large sample study, Annals of Statistics, 10.
Little, R.J.A., Rubin, D.B. (2002) Statistical Analysis with Missing Data. John Wiley and Sons, New Jersey.
McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple imputation approach to haplotypic analysis of population-based data, [online]
Rubin, D.B. (1996) Multiple imputation after 18+ years (with discussion). Journal of the American Statistical Association, 91.
Therneau, T., Grambsch, P., Fleming, T. Martingale based residuals for survival models, Biometrika, 77(1).
See Also

snp.surv, haplo.bin, haplo.quant, haplo.long

Examples

data(SNPsurv.dat)

# convert SNPsurv.dat to format required by infer.haplos
survhaplo.dat <- SNP2Haplo(SNPsurv.dat)

data(survpheno.dat)

# generate haplotype frequencies and haplotype design matrix
myinfer <- infer.haplos(survhaplo.dat)

# print haplotype frequencies generated by infer.haplos
myinfer$hap.freq

# generate haplo object where haplotypes with a frequency
# below min.freq are grouped as a category called "rare"
myhaplo <- make.haplo.rare(myinfer, min.freq=0.05)

mymodel <- haplo.surv(formula1=Surv(time, status)~age+h.v1aa,
    formula2=Surv(time, status)~age, haplo=myhaplo,
    pheno=survpheno.dat, sim=10)

# example using a subsetting variable - looking at males only
mymodel <- haplo.surv(formula1=Surv(time, status)~age+h.v1aa,
    formula2=Surv(time, status)~age, haplo=myhaplo,
    pheno=survpheno.dat, sim=10, sub=expression(sex==1))

infer.haplos    Infer haplotype configurations when phase is unknown

Description

infer.haplos generates a haplotype object to be used in association analysis.

Usage

infer.haplos(geno)

Arguments

geno    a genotype data frame where each SNP is represented by two columns, one for each allele, in the form of haplo.dat.
Value

infer.haplos returns a list containing the following items:

hapmat    a dataframe containing all possible haplotype configurations with their respective likelihoods, for each individual.
hap.freq    haplotype frequencies estimated using the EM algorithm, and the standard errors of these frequencies.
initfreq    initial haplotype frequencies to be used by other SimHap functions.

Author(s)

Pamela A. McCaskie

References

Excoffier, L., Slatkin, M. (1995) Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Molecular Biology and Evolution, 12(5).
McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple imputation approach to haplotypic analysis of population-based data, [online]

See Also

make.haplo.rare

Examples

data(SNP.dat)

# convert SNP.dat to format required by infer.haplos
haplo.dat <- SNP2Haplo(SNP.dat)

data(pheno.dat)

# generate haplotype frequencies and haplotype design matrix
myinfer <- infer.haplos(haplo.dat)

# print haplotype frequencies generated by infer.haplos
myinfer$hap.freq

# generate haplo object where haplotypes with a frequency
# below min.freq are grouped as a category called "rare"
myhaplo <- make.haplo.rare(myinfer, min.freq=0.05)

mymodel <- haplo.quant(formula1=HDL~AGE+SBP+h.n1aa,
    formula2=HDL~AGE+SBP, pheno=pheno.dat, haplo=myhaplo, sim=10)
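To make the EM step behind hap.freq concrete, the following base R sketch estimates haplotype frequencies for the simplest ambiguous case: two biallelic SNPs, where only double heterozygotes have unknown phase. It follows the general approach of Excoffier and Slatkin (1995). The function name em_two_snp, the genotype coding (counts of the A and B alleles) and the toy data are all assumptions made for illustration; infer.haplos itself works on multi-locus data and is not reproduced here.

```r
# EM estimate of two-SNP haplotype frequencies (hypothetical helper).
# g1, g2: number of copies (0, 1, 2) of allele A at SNP 1 and allele B at SNP 2.
em_two_snp <- function(g1, g2, tol = 1e-10, max_iter = 500) {
  stopifnot(length(g1) == length(g2), all(g1 %in% 0:2), all(g2 %in% 0:2))
  n <- length(g1)
  dh <- g1 == 1 & g2 == 1          # double heterozygotes: phase is ambiguous
  n_dh <- sum(dh)
  a <- g1[!dh]
  b <- g2[!dh]
  # haplotype counts from individuals whose phase is fully determined
  fixed <- c(
    AB = sum(ifelse(a == 2, b,     ifelse(a == 1, b / 2,     0))),
    Ab = sum(ifelse(a == 2, 2 - b, ifelse(a == 1, 1 - b / 2, 0))),
    aB = sum(ifelse(a == 0, b,     ifelse(a == 1, b / 2,     0))),
    ab = sum(ifelse(a == 0, 2 - b, ifelse(a == 1, 1 - b / 2, 0)))
  )
  p <- c(AB = 0.25, Ab = 0.25, aB = 0.25, ab = 0.25)
  for (iter in seq_len(max_iter)) {
    # E-step: probability that a double heterozygote is AB/ab rather than Ab/aB
    cis <- p[["AB"]] * p[["ab"]]
    trans <- p[["Ab"]] * p[["aB"]]
    w <- if (n_dh > 0) cis / (cis + trans) else 0
    # M-step: expected haplotype counts divided by the number of chromosomes
    p_new <- (fixed + n_dh * c(w, 1 - w, 1 - w, w)) / (2 * n)
    converged <- max(abs(p_new - p)) < tol
    p <- p_new
    if (converged) break
  }
  p
}

# toy data: two AB/AB individuals, two ab/ab individuals, one double heterozygote
g1 <- c(2, 2, 0, 0, 1)
g2 <- c(2, 2, 0, 0, 1)
p_hat <- em_two_snp(g1, g2)
```

In this toy data set the unambiguous individuals carry only AB and ab haplotypes, so the algorithm resolves the double heterozygote as AB/ab and the Ab and aB frequencies shrink towards zero.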
infer.haplos.cc    Infer haplotype configuration independently in cases and controls

Description

infer.haplos.cc generates a haplotype object to be used in association analysis.

Usage

infer.haplos.cc(geno, pheno, cc.var)

Arguments

geno    a genotype data frame where each SNP is represented by two columns, one for each allele, in the form of haplo.dat.
pheno    a data frame containing phenotype data with at least two columns: a subject identifier and an indicator of disease status.
cc.var    the column name of the parameter indicating disease status. Must be entered with quotations, e.g. "DISEASE".

Details

cc.var must be binary, taking only values 0 or 1.

Value

infer.haplos.cc returns a list containing the following items:

hapmat    a dataframe containing all possible haplotype configurations with their respective likelihoods, for each individual.
hap.freq.cases    haplotype frequencies among cases estimated using the EM algorithm, and the standard errors of these frequencies.
hap.freq.controls    haplotype frequencies among controls estimated using the EM algorithm, and the standard errors of these frequencies.
init.freq.cases    initial haplotype frequencies among cases to be used by other SimHap functions.
init.freq.controls    initial haplotype frequencies among controls to be used by other SimHap functions.

Note

infer.haplos.cc is to be used in place of infer.haplos when haplotypes and haplotype frequencies are to be inferred independently in cases and controls. geno and pheno should have individuals in the same order, with the subject identifier column in ascending order.
Author(s)

Pamela A. McCaskie

References

McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple imputation approach to haplotypic analysis of population-based data, [online]
Stram, D.O., Leigh Pearce, C., Bretsky, P., Freedman, M., Hirschhorn, J.N., Altshuler, D., Kolonel, L.N., Henderson, B.E., Thomas, D.C. (2003) Modeling and EM Estimation of Haplotype-Specific Relative Risks from Genotype Data for a Case-Control Study of Unrelated Individuals, Human Heredity, 55.

See Also

infer.haplos, prepare.cc

Examples

data(SNP.dat)

# convert SNP.dat to format required by infer.haplos
haplo.dat <- SNP2Haplo(SNP.dat)

data(pheno.dat)

newdata <- prepare.cc(geno=haplo.dat, pheno=pheno.dat, cc.var="DISEASE")
newhaplo.dat <- newdata$geno
newpheno.dat <- newdata$pheno

# generate haplotype frequencies and haplotype design matrix
myinfer <- infer.haplos.cc(geno=newhaplo.dat, pheno=newpheno.dat,
    cc.var="DISEASE")

# print haplotype frequencies among cases
myinfer$hap.freq.cases

# print haplotype frequencies among controls
myinfer$hap.freq.controls

# generate haplo object where haplotypes with a frequency
# below min.freq are grouped as a category called "rare"
myhaplo <- make.haplo.rare(myinfer, min.freq=0.05)

mymodel <- haplo.quant(formula1=HDL~AGE+SBP+h.n1aa,
    formula2=HDL~AGE+SBP, pheno=newpheno.dat, haplo=myhaplo, sim=10)
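The min.freq step used in the examples above can be pictured as collapsing a frequency table. The sketch below is a minimal illustration on a named vector of haplotype frequencies; the helper name group_rare and the haplotype labels h1 to h5 are hypothetical, and make.haplo.rare additionally reshapes the per-individual haplotype data, which is not shown here.

```r
# Collapse haplotypes below a frequency threshold into one "rare" category
# (hypothetical helper operating on a named frequency vector).
group_rare <- function(freq, min.freq = 0.05) {
  is_rare <- freq < min.freq
  out <- freq[!is_rare]
  if (any(is_rare)) {
    out <- c(out, rare = sum(freq[is_rare]))   # pooled frequency of rare haplotypes
  }
  out
}

# toy haplotype frequencies summing to 1
freq <- c(h1 = 0.42, h2 = 0.31, h3 = 0.22, h4 = 0.03, h5 = 0.02)
grouped <- group_rare(freq, min.freq = 0.05)
```

Pooling in this way keeps the total frequency at 1 while avoiding separate model terms for haplotypes that are too rare to estimate reliably.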
longpheno.dat    Example phenotypic, longitudinal data

Description

longpheno.dat is an example phenotypic data set containing biological measures to be used by snp.long and haplo.long.

Usage

data(longpheno.dat)

Format

A data frame with 601 observations on the following 12 variables.

ID    patient identifier.
year    year of survey.
time    time point in years, where 1966=1.
sex    1=male, 0=female.
age    age in years.
height    height in metres.
weight    weight in kilograms.
bmi    body-mass index.
fev1f    forced expired volume in the first second, a measure of lung function.

Details

longpheno.dat was simulated to take the format of the Busselton Health Survey from Western Australia.

Examples

data(SNPlong.dat)

# transform SNPlong.dat to an object containing 3 columns
# per SNP - additive, dominant and recessive, where genotypes
# defined in baseline serve as the baseline genotypes
longgeno.dat <- SNP2Geno(SNPlong.dat, baseline=c("aa", "GG", "V2V2"))

data(longpheno.dat)

mymodel <- snp.long(fixed=fev1f~snp_1_add, random=~1|ID,
    geno=longgeno.dat, pheno=longpheno.dat, form=~year|ID)
make.haplo.rare    Group rare haplotypes together

Description

make.haplo.rare groups together haplotypes with frequencies below a specified threshold and processes the data into a format compatible with the haplotype analysis functions.

Usage

make.haplo.rare(infer.object, min.freq)

Arguments

infer.object    result of a call to infer.haplos.
min.freq    minimum frequency of haplotypes to include in analysis. Haplotypes with a frequency below this value will be grouped together in a group called rare.

Value

hapdata    a data frame containing all haplotype configurations and their posterior probabilities for each individual, grouping rare haplotypes into a category called rare.
hapobject    a list containing the original haplotype information for each individual as well as haplotype frequency tables.

Author(s)

Pamela A. McCaskie

References

McCaskie, P.A., Carter, K.W., Hazelton, M., Palmer, L.J. (2007) SimHap: A comprehensive modeling framework for epidemiological outcomes and a multiple imputation approach to haplotypic analysis of population-based data, [online]

See Also

infer.haplos

Examples

data(SNP.dat)

# convert SNP.dat to format required by infer.haplos
haplo.dat <- SNP2Haplo(SNP.dat)

data(pheno.dat)

# generate haplotype frequencies and haplotype design matrix
myinfer <- infer.haplos(haplo.dat)

# print haplotype frequencies generated by infer.haplos
myinfer$hap.freq

# generate haplo object where haplotypes with a frequency
# below min.freq are grouped as a category called "rare"
myhaplo <- make.haplo.rare(myinfer, min.freq=0.05)

mymodel <- haplo.quant(formula1=HDL~AGE+SBP+h.n1aa,
    formula2=HDL~AGE+SBP, pheno=pheno.dat, haplo=myhaplo, sim=10)

pheno.dat    Example phenotypic data

Description

pheno.dat is an example phenotypic data set containing biological measures to be used by snp.quant, haplo.quant, snp.bin and haplo.bin.

Usage

data(pheno.dat)

Format

A data frame with 180 observations on the following 16 variables.

ID    patient identifiers.
SEX    1=male, 0=female.
AGE    age in years.
SBP    systolic blood pressure (mmHg).
DBP    diastolic blood pressure (mmHg).
BMI    body-mass index.
WHR    waist-hip ratio.
HDL    plasma high density lipoprotein (mmol/L).
LDL    plasma low density lipoprotein (mmol/L).
DIABETES    a binary indicator of history of type 2 diabetes.
FH_IHD    a binary indicator of family history of ischaemic heart disease.
PLAQUE    a binary indicator of the presence of 1 or more carotid plaques.
SMOKE    a binary indicator of smoking history (0=never smoked, 1=ever smoked).
PY    pack-years of smoking.
DISEASE    a binary indicator of ischaemic heart disease.
STRAT    a matching variable indicating the pairs of matched cases and controls.

Details

pheno.dat is a simulated data set of coronary heart disease related phenotypes.

Examples

data(SNP.dat)

# convert SNP.dat to format required by snp.quant
geno.dat <- SNP2Geno(SNP.dat, baseline=c("mm", "11", "GG", "CC"))

data(pheno.dat)

mymodel <- snp.quant(formula1=LDL~AGE+SBP+factor(snp_1_add),
    formula2=LDL~AGE+SBP, geno=geno.dat, pheno=pheno.dat)

# example with a subsetting variable, looking at males only
mymodel <- snp.quant(formula1=LDL~AGE+SBP+factor(snp_1_add),
    formula2=LDL~AGE+SBP, geno=geno.dat, pheno=pheno.dat,
    sub=expression(SEX==1))

prepare.cc    Prepare case-control data for inferring haplotypes

Description

prepare.cc prepares case-control data when there may be missing values in the case status variable. This eliminates problems when using infer.haplos.cc.

Usage

prepare.cc(geno, pheno, cc.var)

Arguments

geno    a genotype data frame where each SNP is represented by two columns, one for each allele, in the form of haplo.dat.
pheno    a data frame containing phenotype data with at least two columns: a subject identifier and an indicator of disease status.
cc.var    the column name of the parameter indicating disease status. Must be entered with quotations, e.g. "DISEASE".
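The kind of preparation described here can be sketched in base R as follows. This is only a guess at the general idea: drop individuals whose case status is missing from both data frames, then sort both by subject identifier so that they stay aligned, as infer.haplos.cc requires. The helper name align_cc, the id argument, the allele column names and the toy data are hypothetical, and the actual behaviour of prepare.cc may differ.

```r
# Drop subjects with missing case status and keep geno and pheno aligned
# (hypothetical helper; assumes geno and pheno rows start in the same order).
align_cc <- function(geno, pheno, cc.var, id = "ID") {
  stopifnot(nrow(geno) == nrow(pheno))
  keep <- !is.na(pheno[[cc.var]])          # subjects with known case status
  geno <- geno[keep, , drop = FALSE]
  pheno <- pheno[keep, , drop = FALSE]
  ord <- order(pheno[[id]])                # ascending subject identifier
  list(geno = geno[ord, , drop = FALSE],
       pheno = pheno[ord, , drop = FALSE])
}

# toy data: subject 1 has no recorded disease status
pheno <- data.frame(ID = c(3, 1, 2, 4), DISEASE = c(1, NA, 0, 1))
geno <- data.frame(ID = c(3, 1, 2, 4),
                   SNP_1.a = c("M", "N", "M", "N"),
                   SNP_1.b = c("M", "M", "N", "N"))
newdata <- align_cc(geno, pheno, cc.var = "DISEASE")
```

The returned list uses components named geno and pheno so that it can be unpacked in the same way as newdata$geno and newdata$pheno in the infer.haplos.cc example.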
More informationSplines and penalized regression
Splines and penalized regression November 23 Introduction We are discussing ways to estimate the regression function f, where E(y x) = f(x) One approach is of course to assume that f has a certain shape,
More informationOverview. Background. Locating quantitative trait loci (QTL)
Overview Implementation of robust methods for locating quantitative trait loci in R Introduction to QTL mapping Andreas Baierl and Andreas Futschik Institute of Statistics and Decision Support Systems
More informationQUICKTEST user guide
QUICKTEST user guide Toby Johnson Zoltán Kutalik December 11, 2008 for quicktest version 0.94 Copyright c 2008 Toby Johnson and Zoltán Kutalik Permission is granted to copy, distribute and/or modify this
More informationSUGEN 8.6 Overview. Misa Graff, July 2017
SUGEN 8.6 Overview Misa Graff, July 2017 General Information By Ran Tao, https://sites.google.com/site/dragontaoran/home Website: http://dlin.web.unc.edu/software/sugen/ Standalone command-line software
More informationInference for Generalized Linear Mixed Models
Inference for Generalized Linear Mixed Models Christina Knudson, Ph.D. University of St. Thomas October 18, 2018 Reviewing the Linear Model The usual linear model assumptions: responses normally distributed
More informationPackage acebayes. R topics documented: November 21, Type Package
Type Package Package acebayes November 21, 2018 Title Optimal Bayesian Experimental Design using the ACE Algorithm Version 1.5.2 Date 2018-11-21 Author Antony M. Overstall, David C. Woods & Maria Adamou
More informationThe Lander-Green Algorithm in Practice. Biostatistics 666
The Lander-Green Algorithm in Practice Biostatistics 666 Last Lecture: Lander-Green Algorithm More general definition for I, the "IBD vector" Probability of genotypes given IBD vector Transition probabilities
More informationThe lmekin function. Terry Therneau Mayo Clinic. May 11, 2018
The lmekin function Terry Therneau Mayo Clinic May 11, 2018 1 Background The original kinship library had an implementation of linear mixed effects models using the matrix code found in coxme. Since the
More informationHaplotype analysis in population-based association studies
The Stata Journal (2001) 1, Number 1, pp. 58 75 Haplotype analysis in population-based association studies A. P. Mander MRC Biostatistics Unit, Cambridge, UK adrian.mander@mrc-bsu.cam.ac.uk Abstract. This
More informationMPLUS Analysis Examples Replication Chapter 10
MPLUS Analysis Examples Replication Chapter 10 Mplus includes all input code and output in the *.out file. This document contains selected output from each analysis for Chapter 10. All data preparation
More informationPackage PedCNV. February 19, 2015
Type Package Package PedCNV February 19, 2015 Title An implementation for association analysis with CNV data. Version 0.1 Date 2013-08-03 Author, Sungho Won and Weicheng Zhu Maintainer
More informationPackage sprinter. February 20, 2015
Type Package Package sprinter February 20, 2015 Title Framework for Screening Prognostic Interactions Version 1.1.0 Date 2014-04-11 Author Isabell Hoffmann Maintainer Isabell Hoffmann
More informationExpectation Maximization (EM) and Gaussian Mixture Models
Expectation Maximization (EM) and Gaussian Mixture Models Reference: The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, J. Friedman, Springer 1 2 3 4 5 6 7 8 Unsupervised Learning Motivation
More informationPackage rqt. November 21, 2017
Type Package Title rqt: utilities for gene-level meta-analysis Version 1.4.0 Package rqt November 21, 2017 Author I. Y. Zhbannikov, K. G. Arbeev, A. I. Yashin. Maintainer Ilya Y. Zhbannikov
More informationPackage SIPI. R topics documented: September 23, 2016
Package SIPI September 23, 2016 Type Package Title SIPI Version 0.2 Date 2016-9-22 Depends logregperm, SNPassoc, sm, car, lmtest, parallel Author Maintainer Po-Yu Huang SNP Interaction
More informationSplines. Patrick Breheny. November 20. Introduction Regression splines (parametric) Smoothing splines (nonparametric)
Splines Patrick Breheny November 20 Patrick Breheny STA 621: Nonparametric Statistics 1/46 Introduction Introduction Problems with polynomial bases We are discussing ways to estimate the regression function
More informationRegression. Dr. G. Bharadwaja Kumar VIT Chennai
Regression Dr. G. Bharadwaja Kumar VIT Chennai Introduction Statistical models normally specify how one set of variables, called dependent variables, functionally depend on another set of variables, called
More informationPackage coloc. February 24, 2018
Type Package Package coloc February 24, 2018 Imports ggplot2, snpstats, BMA, reshape, methods, flashclust, speedglm Suggests knitr, testthat Title Colocalisation Tests of Two Genetic Traits Version 3.1
More informationPackage samplesizelogisticcasecontrol
Package samplesizelogisticcasecontrol February 4, 2017 Title Sample Size Calculations for Case-Control Studies Version 0.0.6 Date 2017-01-31 Author Mitchell H. Gail To determine sample size for case-control
More informationPackage MSwM. R topics documented: February 19, Type Package Title Fitting Markov Switching Models Version 1.
Type Package Title Fitting Markov Switching Models Version 1.2 Date 2014-02-05 Package MSwM February 19, 2015 Author Josep A. Sanchez-Espigares, Alberto Lopez-Moreno Maintainer Josep A. Sanchez-Espigares
More informationPackage coxme. R topics documented: May 13, Title Mixed Effects Cox Models Priority optional Version
Title Mixed Effects Cox Models Priority optional Version 2.2-10 Package coxme May 13, 2018 Depends survival (>= 2.36.14), methods, bdsmatrix(>= 1.3) Imports nlme, Matrix (>= 1.0) Suggests mvtnorm, kinship2
More informationANNOUNCING THE RELEASE OF LISREL VERSION BACKGROUND 2 COMBINING LISREL AND PRELIS FUNCTIONALITY 2 FIML FOR ORDINAL AND CONTINUOUS VARIABLES 3
ANNOUNCING THE RELEASE OF LISREL VERSION 9.1 2 BACKGROUND 2 COMBINING LISREL AND PRELIS FUNCTIONALITY 2 FIML FOR ORDINAL AND CONTINUOUS VARIABLES 3 THREE-LEVEL MULTILEVEL GENERALIZED LINEAR MODELS 3 FOUR
More informationCH9.Generalized Additive Model
CH9.Generalized Additive Model Regression Model For a response variable and predictor variables can be modeled using a mean function as follows: would be a parametric / nonparametric regression or a smoothing
More informationPackage GLMMRR. August 9, 2016
Type Package Package GLMMRR August 9, 2016 Title Generalized Linear Mixed Model (GLMM) for Binary Randomized Response Data Version 0.2.0 Date 2016-08-08 Author Jean-Paul Fox [aut], Konrad Klotzke [aut],
More informationDiscussion Notes 3 Stepwise Regression and Model Selection
Discussion Notes 3 Stepwise Regression and Model Selection Stepwise Regression There are many different commands for doing stepwise regression. Here we introduce the command step. There are many arguments
More informationPackage simmsm. March 3, 2015
Type Package Package simmsm March 3, 2015 Title Simulation of Event Histories for Multi-State Models Version 1.1.41 Date 2014-02-09 Author Maintainer Simulation of event histories
More informationPackage rereg. May 30, 2018
Title Recurrent Event Regression Version 1.1.4 Package rereg May 30, 2018 A collection of regression models for recurrent event process and failure time. Available methods include these from Xu et al.
More informationSOLOMON: Parentage Analysis 1. Corresponding author: Mark Christie
SOLOMON: Parentage Analysis 1 Corresponding author: Mark Christie christim@science.oregonstate.edu SOLOMON: Parentage Analysis 2 Table of Contents: Installing SOLOMON on Windows/Linux Pg. 3 Installing
More informationPackage SmoothHazard
Package SmoothHazard September 19, 2014 Title Fitting illness-death model for interval-censored data Version 1.2.3 Author Celia Touraine, Pierre Joly, Thomas A. Gerds SmoothHazard is a package for fitting
More informationSmoking and Missingness: Computer Syntax 1
Smoking and Missingness: Computer Syntax 1 Computer Syntax SAS code is provided for the logistic regression imputation described in this article. This code is listed in parts, with description provided
More informationStatistical Methods for the Analysis of Repeated Measurements
Charles S. Davis Statistical Methods for the Analysis of Repeated Measurements With 20 Illustrations #j Springer Contents Preface List of Tables List of Figures v xv xxiii 1 Introduction 1 1.1 Repeated
More informationThe glmc Package. December 6, 2006
The glmc Package December 6, 2006 Version 0.2-1 Date December 6, 2006 Title Fitting Generalized Linear Models Subject to Constraints Author Sanjay Chaudhuri , Mark S. Handcock ,
More informationPackage CALIBERrfimpute
Type Package Package CALIBERrfimpute June 11, 2018 Title Multiple Imputation Using MICE and Random Forest Version 1.0-1 Date 2018-06-05 Functions to impute using Random Forest under Full Conditional Specifications
More informationMACAU User Manual. Xiang Zhou. March 15, 2017
MACAU User Manual Xiang Zhou March 15, 2017 Contents 1 Introduction 2 1.1 What is MACAU...................................... 2 1.2 How to Cite MACAU................................... 2 1.3 The Model.........................................
More information