Unit 5 Logistic Regression Practice Problems

Size: px
Start display at page:

Download "Unit 5 Logistic Regression Practice Problems"

Transcription

1 Unit 5 Logistic Regression Practice Problems SOLUTIONS R Users Source: Afifi A., Clark VA and May S. Computer Aided Multivariate Analysis, Fourth Edition. Boca Raton: Chapman and Hall, Exercises #1-#3 utilize a data set provided by Afifi, Clark and May (2004). The data are a study of depression and was a longitudinal study. The purpose of the study was to obtain estimates of the prevalence and incidence of depression and to explore its risk factors. The study variables were of several types demographics, life events, stressors, physical health, health services utilization, medication use, lifestyle, and social support. Before You Begin Download R data sets depress.rdata and illeetvilaine.rdata from the course website page. R Preliminary (ONE TIME) Install Packages as Needed from the console ONLY # install.packages( gmodels ) # install.packages( DescTools ) # install.packages( lmtest ) # install.packages( ResourceSelection ) # install.packages( generalhoslem ) # install.packages( epir ) # install.packages( proc ) Exercises #1-3 utilize the data set depress.rdata Consider the following three variables. Variable Codings Label drink 1 = yes 2 = no Regular Drinker sex 1 = male 2 = female cases 0 = Normal 1 = Case of Depression Depressed is cesd > 16.sol_logistic R Users.docx Page 1 of 15

2 Preliminaries - depress.rdata setwd("/users/cbigelow/desktop") load(file="depress.rdata") options(scipen=1000) options(signif.stars=false) depress$sex <- factor(depress$sex, levels=c(1,2), labels=c("1=male", "2=Female")) depress$drink <- factor(depress$drink, levels=c(1,2), labels=c("1=yes", "2=No")) 1. Source: Afifi A., Clark VA and May S. Computer Aided Multivariate Analysis, Fourth Edition. Boca Raton: Chapman and Hall, 2004, Problem 12.9, page 330. Complete the following table by supplying the cell frequencies. Sex Regular Drinker Male Female Total Yes No Total R Solution library(summarytools) library(desctools) summarytools::ctable(depress$sex,depress$drink,prop = 'n', totals = TRUE) ## Cross- Tabulation ## Variables: sex * drink ## Data Frame: depress ## ## ## drink 1=Yes 2=No Total ## sex ## 1=Male ## 2=Female ## Total ## sol_logistic R Users.docx Page 2 of 15

3 paste("relative Odds (OR) Drinker for Males relative to Females") ## [1] "Relative Odds (OR) Drinker for Males relative to Females" DescTools::OddsRatio(depress$drink,depress$sex,method="wald",conf.level=0.95) ## odds ratio lwr.ci upr.ci ## What are the odds that a man is a regular drinker? 95 / 16 = What are the odds that a woman is a regular drinker? 139 / 44 = What is the odds ratio? That is, what is the odds that a man, relative to a woman, is a regular drinker? OR = [odds for man] / [odds for woman] = /3.159 = Repeat the tabulation that you produced for problem #1 two times, one for persons who are depressed and the other for persons who are not depressed. R - Solution library(summarytools) library(desctools) # PERSONS WHO ARE DEPRESSED subset_depressed <- subset(depress,cases==1) paste("subset - DEPRESSED") ## [1] "Subset - DEPRESSED" summarytools::ctable(subset_depressed$sex,subset_depressed$drink,prop = 'n', totals = TRUE) ## Cross- Tabulation ## Variables: sex * drink ## Data Frame: subset_depressed ## ## ## drink 1=Yes 2=No Total ## sex ## 1=Male ## 2=Female ## Total ## sol_logistic R Users.docx Page 3 of 15

4 paste("depressed: Relative Odds (OR) Drinker for Males relative to Females") ## [1] "DEPRESSED: Relative Odds (OR) Drinker for Males relative to Females" DescTools::OddsRatio(subset_depressed$drink,subset_depressed$sex,method="wald",conf.level=0.95) ## odds ratio lwr.ci upr.ci ## OR = [odds for man] / [odds for woman] = (8/2)/(33/7) = # PERSONS WHO ARE NOT DEPRESSED subset_normal <- subset(depress, cases==0) paste("subset - NORMAL") ## [1] "Subset - NORMAL" summarytools::ctable(subset_normal$sex,subset_normal$drink,prop = 'n', totals = TRUE) ## Cross- Tabulation ## Variables: sex * drink ## Data Frame: subset_normal ## ## ## drink 1=Yes 2=No Total ## sex ## 1=Male ## 2=Female ## Total ## paste("normal: Relative Odds (OR) Drinker for Males relative to Females") ## [1] "NORMAL: Relative Odds (OR) Drinker for Males relative to Females" DescTools::OddsRatio(subset_normal$drink,subset_normal$sex,method="wald",conf.level=0.95) ## odds ratio lwr.ci upr.ci ## OR = [odds for man] / [odds for woman] = (87/14)/(106/37) = sol_logistic R Users.docx Page 4 of 15

5 3. Fit a logistic regression model using these variables. Use DRINK as the dependent variable and CASES and FEMALE SEX as independent variables. Also include as an independent variable the appropriate interaction term. Is the interaction term in your model significant? R Solution library(summarytools) # Create 0/1 indicators of regular drinker (drink01) and female gender (female01) # Create interaction of drinker and female gender # KEY: when drink=1 then drink01=1, otherwise drink01=0 depress$drink01 <- ifelse(as.numeric(depress$drink)==1,1,0) summarytools::ctable(depress$drink,depress$drink01,prop = 'n', totals = TRUE) ## Cross- Tabulation ## Variables: drink * drink01 ## Data Frame: depress ## ## ## drink Total ## drink ## 1=Yes ## 2=No ## Total ## # KEY: when sex=2 then female01=1, otherwise female01=0 depress$female01 <- ifelse(as.numeric(depress$sex)==2,1,0) summarytools::ctable(depress$sex,depress$female01,prop = 'n', totals = TRUE) ## Cross- Tabulation ## Variables: sex * female01 ## Data Frame: depress ## ## ## female Total ## sex ## 1=Male ## 2=Female ## Total ## # Interaction fem_case = (female01)*(cases) depress$fem_case <- depress$female01*depress$cases.sol_logistic R Users.docx Page 5 of 15

6 # Fit logistic regression model: Y=drink01 X1=, X2=female01, X3=interaction # glm(yvariable ~ X1 + X Xp, data=dataframename, family=binomial) model_q3 <- glm(drink01 ~ cases + female01 + fem_case, data=depress, family=binomial) summary(model_q3) Call: glm(formula = drink01 ~ cases + female01 + fem_case, family = binomial, data = depress) Deviance Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error z value Pr(> z ) (Intercept) e- 10 *** cases female * fem_case Wald Statistic p- value Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: on 293 degrees of freedom Residual deviance: on 290 degrees of freedom AIC: Number of Fisher Scoring iterations: 4 # Odds Ratios % 95% CI exp(cbind(or=coef(model_q3), confint(model_q3))) OR 2.5 % 97.5 % (Intercept) cases female fem_case # LR Test of Interaction (Null: beta for interaction=0) # Illustration in some detail # Fit reduced model, fit full model, perform LR test reduced <- glm(drink01 ~ cases + female01, data=depress, family=binomial) full <- glm(drink01 ~ cases + female01 + fem_case, data=depress, family=binomial).sol_logistic R Users.docx Page 6 of 15

7 lrtest(reduced,full) Likelihood ratio test Model 1: drink01 ~ cases + female01 Model 2: drink01 ~ cases + female01 + fem_case #Df LogLik Df Chisq Pr(>Chisq) In this sample, there is no statistically significant evidence that the odds regular drinking is associated with an interaction of gender and case of depression (LR test, df=1, p-value =.3493). This is close to the Wald statistic p-value =.327 that can be seen in the coefficients table. Fitted Model: logit [ pr (drinker=yes) ] = *cases *female *fem_case where cases =1 if depressed; 0 otherwise female01 = 1 if female; 0 otherwise fem_case = (cases) * (female01) Is the interaction term in your model significant? No. ˆβ3 = SÊ( ˆβ 3 ) = 0.96 and p-value = Source: Kleinbaum, Kupper, Miller, and Nizam. Applied Regression Analysis and Other Multivariable Methods, Third Edition. Pacific Grove: Duxbury Press, p 683 (problem 2). A five year follow-up study on 600 disease free subjects was carried out to assess the effect of 0/1 exposure E on the development (or not) of a certain disease. The variables AGE (continuous) and obesity status (OBS), the latter a 0/1 variable were determined at the start of the follow-up and were to be considered as control variables in analyzing the data. (a) State the logit form of a logistic regression model that assesses the effect of the 0/1 exposure variable E controlling for the confounding effects of AGE and OBS and the interaction effects of AGE with E and OBS with E..sol_logistic R Users.docx Page 7 of 15

8 Solution: logit[π] = β 0 + β 1*E + β 2*AGE + β 3*OBS + β*agee 4 + β*obse 5 I used the following notation: π = Probability [ disease ] AGEE = AGE * E. This is a created variable that is the interaction of AGE with E OBSE = OBS * E Similarly, this is the interaction of OBS with E. logit[π] = β 0 + β 1*E + β 2*AGE + β 3*OBS + β*agee 4 + β*obse 5 (b) Given the model you have for #4a, give a formula for the odds ratio for the exposure-disease relationship that controls for the confounding and interactive effects of AGE and OBS. Solution: The solution here follows the ideas on pp 9-11 in Lecture Notes 5, Logistic Regression. Value of Predictor for Person who is Predictor Exposed Not Exposed E 1 0 AGE AGE 1 AGE 0 OBS OBS 1 OBS 0 AGEE AGE 1 0 OBSE OBS 1 0 Then OR = exp { logit[π for exposed person] - logit[π for NON exposed person] } = exp { [ β + β + β*age + β*obs + β*age + β*obs] [ β + β*age + β*obs] } = exp { β 1 + β 2*(AGE1-AGE o) + β 3*(OBS 1 - OBS 0) + β 4*AGE 1 + β 5*OBS 1 }.sol_logistic R Users.docx Page 8 of 15

9 (c) Now use the formula that you have for #4b to write an expression for the estimated odds ratio for the exposure-disease relationship that considers both confounding and interaction when AGE=40 and OBS=1. Solution: OR ˆ = exp { β + (40)β + β } Value of Predictor for Person who is Predictor Exposed Not Exposed E 1 0 AGE OBS 1 1 AGEE 40 0 OBSE 1 0 OR = exp { β 1 + β 2*(40-40) + β 3*(1-1) + β 4*40 + β 5*1 } = exp { β 1 + β 4*40 + β 5*1 } Exercises #5-9 utilize the data set illeetvilaine.rdata Consider the following three variables. Variable Codings Label in STATA case 1 = yes 0= no Case status agegp 1=25-34 Age, grouped 2= = = = =75+ tobgrp 1=0-9 g/day 2= = =30+ Tobacco use, grouped 5. Understanding the Global Test of the Fitted Model: In this exercise, you will see that the global chi square test of the current model is equivalent to a LR Ratio test that compares the reduced model containing the intercept only versus the full model that contains the current model. In developing your answer, include as predictors in your full model indicator variables for agegp and tobgp. Tip How to Fit the Intercept Only Model. R Users Fit the dependent variable to the constant 1..sol_logistic R Users.docx Page 9 of 15

10 R Solution # Input illeetvillaine.rdata from desktop setwd("/users/cbigelow/desktop/") load(file= illeetvilaine.rdata ) ille <- illeetvilaine # Create indicators for agegp (REFERENT is age 25-34) ille$age_3544 <- ifelse(as.numeric(ille$agegp)==2,1,0) ille$age_4554 <- ifelse(as.numeric(ille$agegp)==3,1,0) ille$age_5564 <- ifelse(as.numeric(ille$agegp)==4,1,0) ille$age_6574 <- ifelse(as.numeric(ille$agegp)==5,1,0) ille$age_75p <- ifelse(as.numeric(ille$agegp)==6,1,0) # Create indicators for tobgrp (REFERENT is tob 0-9) ille$tob_1019 <- ifelse(as.numeric(ille$tobgp)==2,1,0) ille$tob_2029 <- ifelse(as.numeric(ille$tobgp)==3,1,0) ille$tob_30p <- ifelse(as.numeric(ille$tobgp)==4,1,0) Dear class I followed these indicator variable creations with some cross- tabulations to check that my coding was correct (See page 5 for example using the command tally( ). Output of these checks is not shown. # Fit reduced model, fit full model, perform LR test reduced <- glm(case ~ 1, data=ille, family=binomial) full <- glm(case ~ age_ age_ age_ age_ age_75p + tob_ tob_ tob _30p, data=ille,family=binomial) lrtest(reduced,full) Likelihood ratio test Model 1: case ~ 1 Model 2: case ~ age_ age_ age_ age_ age_75p + tob_ tob_ tob_30p #Df LogLik Df Chisq Pr(>Chisq) < 2.2e- 16 *** Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1.sol_logistic R Users.docx Page 10 of 15

11 # By hand calculation using difference in values of deviance byhand_1 <- reduced$deviance - full$deviance byhand_1 [1] pchisq(byhand_1, 8, lower.tail=false) [1] e- 30 # By hand calculation using difference in values of - 2 log Likelihood byhand_2 <- - 2*logLik(reduced)[1] - (- 2*logLik(full)[1]) byhand_2 [1] pchisq(byhand_2,8,lower.tail=false) [1] e Illustration of Hosmer-Lemeshow Goodness of Fit Test (NULL: model is a good fit) See also unit 5 notes, pp For this illustration, fit the model containing design variables for age group and tobacco use. Then perform the Hosmer-Lemeshow goodness of fit test. R Solution library(resourceselection) library(generalhoslem) # Hosmer- Lemeshow GOF (Null:model is good fit) # Using command hoslem.test() in package ResourceSelection # hoslem.test(nameofmodel$y, fitted(name OF MODEL), g=numberofbins) ResourceSelection::hoslem.test(full$y,fitted(full), g=8) Hosmer and Lemeshow goodness of fit (GOF) test data: full$y, fitted(full) X- squared = , df = 6, p- value = sol_logistic R Users.docx Page 11 of 15

12 # Using command logitgof() in package generalhoslem generalhoslem::logitgof(ille$case,fitted(full),g=8,ord=false) Hosmer and Lemeshow test (binary model) data: ille$case, fitted(full) X- squared = , df = 6, p- value = Illustration of Link Test (NULL: current model is adequate; no additional predictors needed) See also unit 5 notes, pp For the current model, perform the link test for omitted variables. R Solution # R solution requires a bit of programming # Step 1: Obtain fitted values. Name what you like (eg - hat) hat <- predict(full) # Step 2: Obtain (fitted values)- squared. Name what you like (eg - hatsq) hatsq <- hat^2 # Step 3: Fit model to fitted and fitted- squared. Test Null: Beta for fitted squared=0 linktest <- summary(glm(case ~ hat + hatsq, data=ille, family=binomial)) linktest Call: glm(formula = case ~ hat + hatsq, family = binomial, data = ille) Deviance Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error z value Pr(> z ) (Intercept) hat e- 07 *** hatsq Nice. We have no statistically significant evidence of model inadequacy (Null: beta(hatsq)=0,p- value =.657).sol_logistic R Users.docx Page 12 of 15

13 8. Illustration of Classification Table See also unit 5 notes, pp For the current model, obtain the classification table R Solution library(gmodels) library(epir) # R solution requires a bit of programming here, too. # Fit model. full <- glm(case ~ age_ age_ age_ age_ age_75p + tob_ tob_ tob _30p, data=ille,family=binomial) # Save fitted probabilities (y_fitted) y_fitted=fitted(full) # Save predicted 0/1 events using cut- off of 0.50 (pred_50) pred_50 <- ifelse(y_fitted >= 0.50, 1, 0) # Detailed xtab using command CrossTable (package gmodels) classify <- as.data.frame(cbind(pred_50, case=ille$case)) gmodels::crosstable(classify$pred_50,classify$case,dnn=c("predicted (cut=.50)", "Observed"),format ="SAS",prop.r=FALSE,prop.c=FALSE,prop.t=FALSE,prop.chisq=FALSE,chisq=FALSE,fisher=FALSE) Cell Contents N Total Observations in Table: 975 Observed Predicted (cut=.50) 0 1 Row Total Column Total sol_logistic R Users.docx Page 13 of 15

14 # Detailed epi statistics using command epi.tests (package epir) # Must recode 1=event/0=NONevent to 1=event/2=NONevent # And must create table for use with epi.tests classify$test <- ifelse((classify$pred_50)==0,2,1) classify$disease <- ifelse((classify$case)==0,2,1) class_table <- table(classify$test, classify$disease) epir::epi.tests(class_table) Outcome + Outcome - Total Test Test Total Point estimates and 95 % CIs: Apparent prevalence 0.03 (0.02, 0.04) True prevalence 0.21 (0.18, 0.23) Sensitivity 0.10 (0.06, 0.15) Specificity 0.99 (0.98, 0.99) Positive predictive value 0.67 (0.47, 0.83) Negative predictive value 0.81 (0.78, 0.83) Positive likelihood ratio 7.75 (3.69, 16.29) Negative likelihood ratio 0.91 (0.87, 0.96) sol_logistic R Users.docx Page 14 of 15

15 9. Illustration of ROC Curve See also unit 5 notes, pp For the current model, produce the ROC curve. R Solution library(proc) # roc1 <- roc(dataframe$yvariable, fitted(modelname)) # plot.roc(roc1, print.auc=true, legacy.axes=true) roc1 <- roc(ille$case,fitted(full)) proc::plot.roc(roc1,main="ille- et- Vilaine", print.auc=true,legacy.axes=true).sol_logistic R Users.docx Page 15 of 15

Discriminant analysis in R QMMA

Discriminant analysis in R QMMA Discriminant analysis in R QMMA Emanuele Taufer file:///c:/users/emanuele.taufer/google%20drive/2%20corsi/5%20qmma%20-%20mim/0%20labs/l4-lda-eng.html#(1) 1/26 Default data Get the data set Default library(islr)

More information

Using R for Analyzing Delay Discounting Choice Data. analysis of discounting choice data requires the use of tools that allow for repeated measures

Using R for Analyzing Delay Discounting Choice Data. analysis of discounting choice data requires the use of tools that allow for repeated measures Using R for Analyzing Delay Discounting Choice Data Logistic regression is available in a wide range of statistical software packages, but the analysis of discounting choice data requires the use of tools

More information

Multinomial Logit Models with R

Multinomial Logit Models with R Multinomial Logit Models with R > rm(list=ls()); options(scipen=999) # To avoid scientific notation > # install.packages("mlogit", dependencies=true) # Only need to do this once > library(mlogit) # Load

More information

STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, Steno Diabetes Center June 11, 2015

STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, Steno Diabetes Center June 11, 2015 STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, tsvv@steno.dk, Steno Diabetes Center June 11, 2015 Contents 1 Introduction 1 2 Recap: Variables 2 3 Data Containers 2 3.1 Vectors................................................

More information

The linear mixed model: modeling hierarchical and longitudinal data

The linear mixed model: modeling hierarchical and longitudinal data The linear mixed model: modeling hierarchical and longitudinal data Analysis of Experimental Data AED The linear mixed model: modeling hierarchical and longitudinal data 1 of 44 Contents 1 Modeling Hierarchical

More information

STAT Statistical Learning. Predictive Modeling. Statistical Learning. Overview. Predictive Modeling. Classification Methods.

STAT Statistical Learning. Predictive Modeling. Statistical Learning. Overview. Predictive Modeling. Classification Methods. STAT 48 - STAT 48 - December 5, 27 STAT 48 - STAT 48 - Here are a few questions to consider: What does statistical learning mean to you? Is statistical learning different from statistics as a whole? What

More information

Modelling Proportions and Count Data

Modelling Proportions and Count Data Modelling Proportions and Count Data Rick White May 4, 2016 Outline Analysis of Count Data Binary Data Analysis Categorical Data Analysis Generalized Linear Models Questions Types of Data Continuous data:

More information

Modelling Proportions and Count Data

Modelling Proportions and Count Data Modelling Proportions and Count Data Rick White May 5, 2015 Outline Analysis of Count Data Binary Data Analysis Categorical Data Analysis Generalized Linear Models Questions Types of Data Continuous data:

More information

Poisson Regression and Model Checking

Poisson Regression and Model Checking Poisson Regression and Model Checking Readings GH Chapter 6-8 September 27, 2017 HIV & Risk Behaviour Study The variables couples and women_alone code the intervention: control - no counselling (both 0)

More information

Predictive Checking. Readings GH Chapter 6-8. February 8, 2017

Predictive Checking. Readings GH Chapter 6-8. February 8, 2017 Predictive Checking Readings GH Chapter 6-8 February 8, 2017 Model Choice and Model Checking 2 Questions: 1. Is my Model good enough? (no alternative models in mind) 2. Which Model is best? (comparison

More information

Stat 4510/7510 Homework 4

Stat 4510/7510 Homework 4 Stat 45/75 1/7. Stat 45/75 Homework 4 Instructions: Please list your name and student number clearly. In order to receive credit for a problem, your solution must show sufficient details so that the grader

More information

Purposeful Selection of Variables in Logistic Regression: Macro and Simulation Results

Purposeful Selection of Variables in Logistic Regression: Macro and Simulation Results ection on tatistical Computing urposeful election of Variables in Logistic Regression: Macro and imulation Results Zoran ursac 1, C. Heath Gauss 1, D. Keith Williams 1, David Hosmer 2 1 iostatistics, University

More information

CH9.Generalized Additive Model

CH9.Generalized Additive Model CH9.Generalized Additive Model Regression Model For a response variable and predictor variables can be modeled using a mean function as follows: would be a parametric / nonparametric regression or a smoothing

More information

run ld50 /* Plot the onserved proportions and the fitted curve */ DATA SETR1 SET SETR1 PROB=X1/(X1+X2) /* Use this to create graphs in Windows */ gopt

run ld50 /* Plot the onserved proportions and the fitted curve */ DATA SETR1 SET SETR1 PROB=X1/(X1+X2) /* Use this to create graphs in Windows */ gopt /* This program is stored as bliss.sas */ /* This program uses PROC LOGISTIC in SAS to fit models with logistic, probit, and complimentary log-log link functions to the beetle mortality data collected

More information

Binary Regression in S-Plus

Binary Regression in S-Plus Fall 200 STA 216 September 7, 2000 1 Getting Started in UNIX Binary Regression in S-Plus Create a class working directory and.data directory for S-Plus 5.0. If you have used Splus 3.x before, then it is

More information

Logistic Regression. (Dichotomous predicted variable) Tim Frasier

Logistic Regression. (Dichotomous predicted variable) Tim Frasier Logistic Regression (Dichotomous predicted variable) Tim Frasier Copyright Tim Frasier This work is licensed under the Creative Commons Attribution 4.0 International license. Click here for more information.

More information

Stat 5100 Handout #14.a SAS: Logistic Regression

Stat 5100 Handout #14.a SAS: Logistic Regression Stat 5100 Handout #14.a SAS: Logistic Regression Example: (Text Table 14.3) Individuals were randomly sampled within two sectors of a city, and checked for presence of disease (here, spread by mosquitoes).

More information

Statistics & Analysis. Fitting Generalized Additive Models with the GAM Procedure in SAS 9.2

Statistics & Analysis. Fitting Generalized Additive Models with the GAM Procedure in SAS 9.2 Fitting Generalized Additive Models with the GAM Procedure in SAS 9.2 Weijie Cai, SAS Institute Inc., Cary NC July 1, 2008 ABSTRACT Generalized additive models are useful in finding predictor-response

More information

GxE.scan. October 30, 2018

GxE.scan. October 30, 2018 GxE.scan October 30, 2018 Overview GxE.scan can process a GWAS scan using the snp.logistic, additive.test, snp.score or snp.matched functions, whereas snp.scan.logistic only calls snp.logistic. GxE.scan

More information

This is called a linear basis expansion, and h m is the mth basis function For example if X is one-dimensional: f (X) = β 0 + β 1 X + β 2 X 2, or

This is called a linear basis expansion, and h m is the mth basis function For example if X is one-dimensional: f (X) = β 0 + β 1 X + β 2 X 2, or STA 450/4000 S: February 2 2005 Flexible modelling using basis expansions (Chapter 5) Linear regression: y = Xβ + ɛ, ɛ (0, σ 2 ) Smooth regression: y = f (X) + ɛ: f (X) = E(Y X) to be specified Flexible

More information

Introduction to Mixed Models: Multivariate Regression

Introduction to Mixed Models: Multivariate Regression Introduction to Mixed Models: Multivariate Regression EPSY 905: Multivariate Analysis Spring 2016 Lecture #9 March 30, 2016 EPSY 905: Multivariate Regression via Path Analysis Today s Lecture Multivariate

More information

Generalized Additive Models

Generalized Additive Models Generalized Additive Models Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Generalized Additive Models GAMs are one approach to non-parametric regression in the multiple predictor setting.

More information

Extensions to the Cox Model: Stratification

Extensions to the Cox Model: Stratification Extensions to the Cox Model: Stratification David M. Rocke May 30, 2017 David M. Rocke Extensions to the Cox Model: Stratification May 30, 2017 1 / 13 Anderson Data Remission survival times on 42 leukemia

More information

Week 4: Simple Linear Regression III

Week 4: Simple Linear Regression III Week 4: Simple Linear Regression III Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ARR 1 Outline Goodness of

More information

An introduction to SPSS

An introduction to SPSS An introduction to SPSS To open the SPSS software using U of Iowa Virtual Desktop... Go to https://virtualdesktop.uiowa.edu and choose SPSS 24. Contents NOTE: Save data files in a drive that is accessible

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 327 Geometric Regression Introduction Geometric regression is a special case of negative binomial regression in which the dispersion parameter is set to one. It is similar to regular multiple regression

More information

Unit 1 Review of BIOSTATS 540 Practice Problems SOLUTIONS - Stata Users

Unit 1 Review of BIOSTATS 540 Practice Problems SOLUTIONS - Stata Users BIOSTATS 640 Spring 2018 Review of Introductory Biostatistics STATA solutions Page 1 of 13 Key Comments begin with an * Commands are in bold black I edited the output so that it appears here in blue Unit

More information

Organizing data in R. Fitting Mixed-Effects Models Using the lme4 Package in R. R packages. Accessing documentation. The Dyestuff data set

Organizing data in R. Fitting Mixed-Effects Models Using the lme4 Package in R. R packages. Accessing documentation. The Dyestuff data set Fitting Mixed-Effects Models Using the lme4 Package in R Deepayan Sarkar Fred Hutchinson Cancer Research Center 18 September 2008 Organizing data in R Standard rectangular data sets (columns are variables,

More information

Stata versions 12 & 13 Week 4 Practice Problems

Stata versions 12 & 13 Week 4 Practice Problems Stata versions 12 & 13 Week 4 Practice Problems SOLUTIONS 1 Practice Screen Capture a Create a word document Name it using the convention lastname_lab1docx (eg bigelow_lab1docx) b Using your browser, go

More information

A New Method of Using Polytomous Independent Variables with Many Levels for the Binary Outcome of Big Data Analysis

A New Method of Using Polytomous Independent Variables with Many Levels for the Binary Outcome of Big Data Analysis Paper 2641-2015 A New Method of Using Polytomous Independent Variables with Many Levels for the Binary Outcome of Big Data Analysis ABSTRACT John Gao, ConstantContact; Jesse Harriott, ConstantContact;

More information

A SAS Macro for Covariate Specification in Linear, Logistic, or Survival Regression

A SAS Macro for Covariate Specification in Linear, Logistic, or Survival Regression Paper 1223-2017 A SAS Macro for Covariate Specification in Linear, Logistic, or Survival Regression Sai Liu and Margaret R. Stedman, Stanford University; ABSTRACT Specifying the functional form of a covariate

More information

Correctly Compute Complex Samples Statistics

Correctly Compute Complex Samples Statistics PASW Complex Samples 17.0 Specifications Correctly Compute Complex Samples Statistics When you conduct sample surveys, use a statistics package dedicated to producing correct estimates for complex sample

More information

Biostat Methods STAT 5820/6910 Handout #9 Meta-Analysis Examples

Biostat Methods STAT 5820/6910 Handout #9 Meta-Analysis Examples Biostat Methods STAT 5820/6910 Handout #9 Meta-Analysis Examples Example 1 A RCT was conducted to consider whether steroid therapy for expectant mothers affects death rate of premature [less than 37 weeks]

More information

Dynamic Network Regression Using R Package dnr

Dynamic Network Regression Using R Package dnr Dynamic Network Regression Using R Package dnr Abhirup Mallik July 26, 2018 R package dnr enables the user to fit dynamic network regression models for time variate network data available mostly in social

More information

Product Catalog. AcaStat. Software

Product Catalog. AcaStat. Software Product Catalog AcaStat Software AcaStat AcaStat is an inexpensive and easy-to-use data analysis tool. Easily create data files or import data from spreadsheets or delimited text files. Run crosstabulations,

More information

Correctly Compute Complex Samples Statistics

Correctly Compute Complex Samples Statistics SPSS Complex Samples 15.0 Specifications Correctly Compute Complex Samples Statistics When you conduct sample surveys, use a statistics package dedicated to producing correct estimates for complex sample

More information

Unit 1 Review of BIOSTATS 540 Practice Problems SOLUTIONS - R Users

Unit 1 Review of BIOSTATS 540 Practice Problems SOLUTIONS - R Users BIOSTATS 640 Spring 2019 Review of Introductory Biostatistics R solutions Page 1 of 16 Preliminary Unit 1 Review of BIOSTATS 540 Practice Problems SOLUTIONS - R Users a) How are homeworks graded? This

More information

Preparing for Data Analysis

Preparing for Data Analysis Preparing for Data Analysis Prof. Andrew Stokes March 21, 2017 Managing your data Entering the data into a database Reading the data into a statistical computing package Checking the data for errors and

More information

Preparing for Data Analysis

Preparing for Data Analysis Preparing for Data Analysis Prof. Andrew Stokes March 27, 2018 Managing your data Entering the data into a database Reading the data into a statistical computing package Checking the data for errors and

More information

Evaluating generalization (validation) Harvard-MIT Division of Health Sciences and Technology HST.951J: Medical Decision Support

Evaluating generalization (validation) Harvard-MIT Division of Health Sciences and Technology HST.951J: Medical Decision Support Evaluating generalization (validation) Harvard-MIT Division of Health Sciences and Technology HST.951J: Medical Decision Support Topics Validation of biomedical models Data-splitting Resampling Cross-validation

More information

Hierarchical Generalized Linear Models

Hierarchical Generalized Linear Models Generalized Multilevel Linear Models Introduction to Multilevel Models Workshop University of Georgia: Institute for Interdisciplinary Research in Education and Human Development 07 Generalized Multilevel

More information

Brief Guide on Using SPSS 10.0

Brief Guide on Using SPSS 10.0 Brief Guide on Using SPSS 10.0 (Use student data, 22 cases, studentp.dat in Dr. Chang s Data Directory Page) (Page address: http://www.cis.ysu.edu/~chang/stat/) I. Processing File and Data To open a new

More information

Generalized least squares (GLS) estimates of the level-2 coefficients,

Generalized least squares (GLS) estimates of the level-2 coefficients, Contents 1 Conceptual and Statistical Background for Two-Level Models...7 1.1 The general two-level model... 7 1.1.1 Level-1 model... 8 1.1.2 Level-2 model... 8 1.2 Parameter estimation... 9 1.3 Empirical

More information

Stat 500 lab notes c Philip M. Dixon, Week 10: Autocorrelated errors

Stat 500 lab notes c Philip M. Dixon, Week 10: Autocorrelated errors Week 10: Autocorrelated errors This week, I have done one possible analysis and provided lots of output for you to consider. Case study: predicting body fat Body fat is an important health measure, but

More information

Performing Cluster Bootstrapped Regressions in R

Performing Cluster Bootstrapped Regressions in R Performing Cluster Bootstrapped Regressions in R Francis L. Huang / October 6, 2016 Supplementary material for: Using Cluster Bootstrapping to Analyze Nested Data with a Few Clusters in Educational and

More information

Exercise 3: Multivariable analysis in R part 1: Logistic regression

Exercise 3: Multivariable analysis in R part 1: Logistic regression Exercise 3: Multivariable analysis in R part 1: Logistic regression At the end of this exercise you should be able to: a. Know how to use logistic regression in R b. Know how to properly remove factors

More information

SPSS QM II. SPSS Manual Quantitative methods II (7.5hp) SHORT INSTRUCTIONS BE CAREFUL

SPSS QM II. SPSS Manual Quantitative methods II (7.5hp) SHORT INSTRUCTIONS BE CAREFUL SPSS QM II SHORT INSTRUCTIONS This presentation contains only relatively short instructions on how to perform some statistical analyses in SPSS. Details around a certain function/analysis method not covered

More information

Lecture 1: Statistical Reasoning 2. Lecture 1. Simple Regression, An Overview, and Simple Linear Regression

Lecture 1: Statistical Reasoning 2. Lecture 1. Simple Regression, An Overview, and Simple Linear Regression Lecture Simple Regression, An Overview, and Simple Linear Regression Learning Objectives In this set of lectures we will develop a framework for simple linear, logistic, and Cox Proportional Hazards Regression

More information

Zero-Inflated Poisson Regression

Zero-Inflated Poisson Regression Chapter 329 Zero-Inflated Poisson Regression Introduction The zero-inflated Poisson (ZIP) regression is used for count data that exhibit overdispersion and excess zeros. The data distribution combines

More information

Lecture 24: Generalized Additive Models Stat 704: Data Analysis I, Fall 2010

Lecture 24: Generalized Additive Models Stat 704: Data Analysis I, Fall 2010 Lecture 24: Generalized Additive Models Stat 704: Data Analysis I, Fall 2010 Tim Hanson, Ph.D. University of South Carolina T. Hanson (USC) Stat 704: Data Analysis I, Fall 2010 1 / 26 Additive predictors

More information

Research Methods for Business and Management. Session 8a- Analyzing Quantitative Data- using SPSS 16 Andre Samuel

Research Methods for Business and Management. Session 8a- Analyzing Quantitative Data- using SPSS 16 Andre Samuel Research Methods for Business and Management Session 8a- Analyzing Quantitative Data- using SPSS 16 Andre Samuel A Simple Example- Gym Purpose of Questionnaire- to determine the participants involvement

More information

Model selection. Peter Hoff. 560 Hierarchical modeling. Statistics, University of Washington 1/41

Model selection. Peter Hoff. 560 Hierarchical modeling. Statistics, University of Washington 1/41 1/41 Model selection 560 Hierarchical modeling Peter Hoff Statistics, University of Washington /41 Modeling choices Model: A statistical model is a set of probability distributions for your data. In HLM,

More information

Chapter 10: Extensions to the GLM

Chapter 10: Extensions to the GLM Chapter 10: Extensions to the GLM 10.1 Implement a GAM for the Swedish mortality data, for males, using smooth functions for age and year. Age and year are standardized as described in Section 4.11, for

More information

Introduction to StatsDirect, 15/03/2017 1

Introduction to StatsDirect, 15/03/2017 1 INTRODUCTION TO STATSDIRECT PART 1... 2 INTRODUCTION... 2 Why Use StatsDirect... 2 ACCESSING STATSDIRECT FOR WINDOWS XP... 4 DATA ENTRY... 5 Missing Data... 6 Opening an Excel Workbook... 6 Moving around

More information

Package samplesizelogisticcasecontrol

Package samplesizelogisticcasecontrol Package samplesizelogisticcasecontrol February 4, 2017 Title Sample Size Calculations for Case-Control Studies Version 0.0.6 Date 2017-01-31 Author Mitchell H. Gail To determine sample size for case-control

More information

CHAPTER 7 ASDA ANALYSIS EXAMPLES REPLICATION-SPSS/PASW V18 COMPLEX SAMPLES

CHAPTER 7 ASDA ANALYSIS EXAMPLES REPLICATION-SPSS/PASW V18 COMPLEX SAMPLES CHAPTER 7 ASDA ANALYSIS EXAMPLES REPLICATION-SPSS/PASW V18 COMPLEX SAMPLES GENERAL NOTES ABOUT ANALYSIS EXAMPLES REPLICATION These examples are intended to provide guidance on how to use the commands/procedures

More information

Stat 8053, Fall 2013: Additive Models

Stat 8053, Fall 2013: Additive Models Stat 853, Fall 213: Additive Models We will only use the package mgcv for fitting additive and later generalized additive models. The best reference is S. N. Wood (26), Generalized Additive Models, An

More information

Analysis of Complex Survey Data with SAS

Analysis of Complex Survey Data with SAS ABSTRACT Analysis of Complex Survey Data with SAS Christine R. Wells, Ph.D., UCLA, Los Angeles, CA The differences between data collected via a complex sampling design and data collected via other methods

More information

Lecture 7: Linear Regression (continued)

Lecture 7: Linear Regression (continued) Lecture 7: Linear Regression (continued) Reading: Chapter 3 STATS 2: Data mining and analysis Jonathan Taylor, 10/8 Slide credits: Sergio Bacallado 1 / 14 Potential issues in linear regression 1. Interactions

More information

22s:152 Applied Linear Regression

22s:152 Applied Linear Regression 22s:152 Applied Linear Regression Chapter 22: Model Selection In model selection, the idea is to find the smallest set of variables which provides an adequate description of the data. We will consider

More information

Mixed Effects Models. Biljana Jonoska Stojkova Applied Statistics and Data Science Group (ASDa) Department of Statistics, UBC.

Mixed Effects Models. Biljana Jonoska Stojkova Applied Statistics and Data Science Group (ASDa) Department of Statistics, UBC. Mixed Effects Models Biljana Jonoska Stojkova Applied Statistics and Data Science Group (ASDa) Department of Statistics, UBC March 6, 2018 Resources for statistical assistance Department of Statistics

More information

22s:152 Applied Linear Regression

22s:152 Applied Linear Regression 22s:152 Applied Linear Regression Chapter 22: Model Selection In model selection, the idea is to find the smallest set of variables which provides an adequate description of the data. We will consider

More information

Gelman-Hill Chapter 3

Gelman-Hill Chapter 3 Gelman-Hill Chapter 3 Linear Regression Basics In linear regression with a single independent variable, as we have seen, the fundamental equation is where ŷ bx 1 b0 b b b y 1 yx, 0 y 1 x x Bivariate Normal

More information

Didacticiel - Études de cas

Didacticiel - Études de cas Subject In some circumstances, the goal of the supervised learning is not to classify examples but rather to organize them in order to point up the most interesting individuals. For instance, in the direct

More information

1 The SAS System 23:01 Friday, November 9, 2012

1 The SAS System 23:01 Friday, November 9, 2012 2101f12HW9chickwts.log Saved: Wednesday, November 14, 2012 6:50:49 PM Page 1 of 3 1 The SAS System 23:01 Friday, November 9, 2012 NOTE: Copyright (c) 2002-2010 by SAS Institute Inc., Cary, NC, USA. NOTE:

More information

Statistical Analysis of List Experiments

Statistical Analysis of List Experiments Statistical Analysis of List Experiments Kosuke Imai Princeton University Joint work with Graeme Blair October 29, 2010 Blair and Imai (Princeton) List Experiments NJIT (Mathematics) 1 / 26 Motivation

More information

Panel Data 4: Fixed Effects vs Random Effects Models

Panel Data 4: Fixed Effects vs Random Effects Models Panel Data 4: Fixed Effects vs Random Effects Models Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised April 4, 2017 These notes borrow very heavily, sometimes verbatim,

More information

Beta-Regression with SPSS Michael Smithson School of Psychology, The Australian National University

Beta-Regression with SPSS Michael Smithson School of Psychology, The Australian National University 9/1/2005 Beta-Regression with SPSS 1 Beta-Regression with SPSS Michael Smithson School of Psychology, The Australian National University (email: Michael.Smithson@anu.edu.au) SPSS Nonlinear Regression syntax

More information

STATA Version 9 10/05/2012 1

STATA Version 9 10/05/2012 1 INTRODUCTION TO STATA PART I... 2 INTRODUCTION... 2 Background... 2 Starting STATA... 3 Window Orientation... 4 Command Structure... 4 The Help Menu... 4 Selecting a Subset of the Data... 5 Inputting Data...

More information

Package mcemglm. November 29, 2015

Package mcemglm. November 29, 2015 Type Package Package mcemglm November 29, 2015 Title Maximum Likelihood Estimation for Generalized Linear Mixed Models Version 1.1 Date 2015-11-28 Author Felipe Acosta Archila Maintainer Maximum likelihood

More information

BUSINESS ANALYTICS. 96 HOURS Practical Learning. DexLab Certified. Training Module. Gurgaon (Head Office)

BUSINESS ANALYTICS. 96 HOURS Practical Learning. DexLab Certified. Training Module. Gurgaon (Head Office) SAS (Base & Advanced) Analytics & Predictive Modeling Tableau BI 96 HOURS Practical Learning WEEKDAY & WEEKEND BATCHES CLASSROOM & LIVE ONLINE DexLab Certified BUSINESS ANALYTICS Training Module Gurgaon

More information

Week 5: Multiple Linear Regression II

Week 5: Multiple Linear Regression II Week 5: Multiple Linear Regression II Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ARR 1 Outline Adjusted R

More information

Bivariate (Simple) Regression Analysis

Bivariate (Simple) Regression Analysis Revised July 2018 Bivariate (Simple) Regression Analysis This set of notes shows how to use Stata to estimate a simple (two-variable) regression equation. It assumes that you have set Stata up on your

More information

IPUMS Training and Development: Requesting Data

IPUMS Training and Development: Requesting Data IPUMS Training and Development: Requesting Data IPUMS PMA Exercise 2 OBJECTIVE: Gain an understanding of how IPUMS PMA service delivery point datasets are structured and how it can be leveraged to explore

More information

Loglinear and Logit Models for Contingency Tables

Loglinear and Logit Models for Contingency Tables Chapter 8 Loglinear and Logit Models for Contingency Tables Loglinear models comprise another special case of generalized linear models designed for contingency tables of frequencies. They are most easily

More information

Week 4: Simple Linear Regression II

Week 4: Simple Linear Regression II Week 4: Simple Linear Regression II Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ARR 1 Outline Algebraic properties

More information

ISyE 6416 Basic Statistical Methods - Spring 2016 Bonus Project: Big Data Analytics Final Report. Team Member Names: Xi Yang, Yi Wen, Xue Zhang

ISyE 6416 Basic Statistical Methods - Spring 2016 Bonus Project: Big Data Analytics Final Report. Team Member Names: Xi Yang, Yi Wen, Xue Zhang ISyE 6416 Basic Statistical Methods - Spring 2016 Bonus Project: Big Data Analytics Final Report Team Member Names: Xi Yang, Yi Wen, Xue Zhang Project Title: Improve Room Utilization Introduction Problem

More information

Regression Analysis and Linear Regression Models

Regression Analysis and Linear Regression Models Regression Analysis and Linear Regression Models University of Trento - FBK 2 March, 2015 (UNITN-FBK) Regression Analysis and Linear Regression Models 2 March, 2015 1 / 33 Relationship between numerical

More information

Applied Statistics and Econometrics Lecture 6

Applied Statistics and Econometrics Lecture 6 Applied Statistics and Econometrics Lecture 6 Giuseppe Ragusa Luiss University gragusa@luiss.it http://gragusa.org/ March 6, 2017 Luiss University Empirical application. Data Italian Labour Force Survey,

More information

URLs identification task: Istat current status. Istat developed and applied a procedure consisting of the following steps:

URLs identification task: Istat current status. Istat developed and applied a procedure consisting of the following steps: ESSnet BIG DATA WorkPackage 2 URLs identification task: Istat current status Giulio Barcaroli, Monica Scannapieco, Donato Summa Istat developed and applied a procedure consisting of the following steps:

More information

Section 2.3: Simple Linear Regression: Predictions and Inference

Section 2.3: Simple Linear Regression: Predictions and Inference Section 2.3: Simple Linear Regression: Predictions and Inference Jared S. Murray The University of Texas at Austin McCombs School of Business Suggested reading: OpenIntro Statistics, Chapter 7.4 1 Simple

More information

Regression. Dr. G. Bharadwaja Kumar VIT Chennai

Regression. Dr. G. Bharadwaja Kumar VIT Chennai Regression Dr. G. Bharadwaja Kumar VIT Chennai Introduction Statistical models normally specify how one set of variables, called dependent variables, functionally depend on another set of variables, called

More information

Random coefficients models

Random coefficients models enote 9 1 enote 9 Random coefficients models enote 9 INDHOLD 2 Indhold 9 Random coefficients models 1 9.1 Introduction.................................... 2 9.2 Example: Constructed data...........................

More information

Strategies for Modeling Two Categorical Variables with Multiple Category Choices

Strategies for Modeling Two Categorical Variables with Multiple Category Choices 003 Joint Statistical Meetings - Section on Survey Research Methods Strategies for Modeling Two Categorical Variables with Multiple Category Choices Christopher R. Bilder Department of Statistics, University

More information

BICF Nano Course: GWAS GWAS Workflow Development using PLINK. Julia Kozlitina April 28, 2017

BICF Nano Course: GWAS GWAS Workflow Development using PLINK. Julia Kozlitina April 28, 2017 BICF Nano Course: GWAS GWAS Workflow Development using PLINK Julia Kozlitina Julia.Kozlitina@UTSouthwestern.edu April 28, 2017 Getting started Open the Terminal (Search -> Applications -> Terminal), and

More information

CSSS 510: Lab 2. Introduction to Maximum Likelihood Estimation

CSSS 510: Lab 2. Introduction to Maximum Likelihood Estimation CSSS 510: Lab 2 Introduction to Maximum Likelihood Estimation 2018-10-12 0. Agenda 1. Housekeeping: simcf, tile 2. Questions about Homework 1 or lecture 3. Simulating heteroskedastic normal data 4. Fitting

More information

Nina Zumel and John Mount Win-Vector LLC

Nina Zumel and John Mount Win-Vector LLC SUPERVISED LEARNING IN R: REGRESSION Logistic regression to predict probabilities Nina Zumel and John Mount Win-Vector LLC Predicting Probabilities Predicting whether an event occurs (yes/no): classification

More information

IQR = number. summary: largest. = 2. Upper half: Q3 =

IQR = number. summary: largest. = 2. Upper half: Q3 = Step by step box plot Height in centimeters of players on the 003 Women s Worldd Cup soccer team. 157 1611 163 163 164 165 165 165 168 168 168 170 170 170 171 173 173 175 180 180 Determine the 5 number

More information

International Graduate School of Genetic and Molecular Epidemiology (GAME) Computing Notes and Introduction to Stata

International Graduate School of Genetic and Molecular Epidemiology (GAME) Computing Notes and Introduction to Stata International Graduate School of Genetic and Molecular Epidemiology (GAME) Computing Notes and Introduction to Stata Paul Dickman September 2003 1 A brief introduction to Stata Starting the Stata program

More information

Statistical Consulting Topics Using cross-validation for model selection. Cross-validation is a technique that can be used for model evaluation.

Statistical Consulting Topics Using cross-validation for model selection. Cross-validation is a technique that can be used for model evaluation. Statistical Consulting Topics Using cross-validation for model selection Cross-validation is a technique that can be used for model evaluation. We often fit a model to a full data set and then perform

More information

Using Multivariate Adaptive Regression Splines (MARS ) to enhance Generalised Linear Models. Inna Kolyshkina PriceWaterhouseCoopers

Using Multivariate Adaptive Regression Splines (MARS ) to enhance Generalised Linear Models. Inna Kolyshkina PriceWaterhouseCoopers Using Multivariate Adaptive Regression Splines (MARS ) to enhance Generalised Linear Models. Inna Kolyshkina PriceWaterhouseCoopers Why enhance GLM? Shortcomings of the linear modelling approach. GLM being

More information

MACAU User Manual. Xiang Zhou. March 15, 2017

MACAU User Manual. Xiang Zhou. March 15, 2017 MACAU User Manual Xiang Zhou March 15, 2017 Contents 1 Introduction 2 1.1 What is MACAU...................................... 2 1.2 How to Cite MACAU................................... 2 1.3 The Model.........................................

More information

Using the SemiPar Package

Using the SemiPar Package Using the SemiPar Package NICHOLAS J. SALKOWSKI Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA salk0008@umn.edu May 15, 2008 1 Introduction The

More information

An Introduction to Model-Fitting with the R package glmm

An Introduction to Model-Fitting with the R package glmm An Introduction to Model-Fitting with the R package glmm Christina Knudson November 7, 2017 Contents 1 Introduction 2 2 Formatting the Data 2 3 Fitting the Model 4 4 Reading the Model Summary 6 5 Isolating

More information

Unit 8 SUPPLEMENT Normal, T, Chi Square, F, and Sums of Normals

Unit 8 SUPPLEMENT Normal, T, Chi Square, F, and Sums of Normals BIOSTATS 540 Fall 017 8. SUPPLEMENT Normal, T, Chi Square, F and Sums of Normals Page 1 of Unit 8 SUPPLEMENT Normal, T, Chi Square, F, and Sums of Normals Topic 1. Normal Distribution.. a. Definition..

More information

Instrumental variables, bootstrapping, and generalized linear models

Instrumental variables, bootstrapping, and generalized linear models The Stata Journal (2003) 3, Number 4, pp. 351 360 Instrumental variables, bootstrapping, and generalized linear models James W. Hardin Arnold School of Public Health University of South Carolina Columbia,

More information

Generalized additive models II

Generalized additive models II Generalized additive models II Patrick Breheny October 13 Patrick Breheny BST 764: Applied Statistical Modeling 1/23 Coronary heart disease study Today s lecture will feature several case studies involving

More information

STATA Note 5. One sample binomial data Confidence interval for proportion Unpaired binomial data: 2 x 2 tables Paired binomial data

STATA Note 5. One sample binomial data Confidence interval for proportion Unpaired binomial data: 2 x 2 tables Paired binomial data Postgraduate Course in Biostatistics, University of Aarhus STATA Note 5 One sample binomial data Confidence interval for proportion Unpaired binomial data: 2 x 2 tables Paired binomial data One sample

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Rebecca C. Steorts, Duke University STA 325, Chapter 3 ISL 1 / 49 Agenda How to extend beyond a SLR Multiple Linear Regression (MLR) Relationship Between the Response and Predictors

More information

The SAS RELRISK9 Macro

The SAS RELRISK9 Macro The SAS RELRISK9 Macro Sally Skinner, Ruifeng Li, Ellen Hertzmark, and Donna Spiegelman November 15, 2012 Abstract The %RELRISK9 macro obtains relative risk estimates using PROC GENMOD with the binomial

More information