Regression III: Lab 4


This lab will work through some model/variable selection problems, finite mixture models and missing data issues. You shouldn't feel obligated to work through this linearly; I would rather suggest that you start with the topics you're most interested in.

Model Selection

In class we looked at some data we used in the mixture model on democracy. Using the dataset indicated below, test two different models of liberal-conservative identification.

> library(foreign)
> dat <- read.dta('
> dat$libcpre_self_num <- as.numeric(dat$libcpre_self)
> dat$dem_agegrp_num <- as.numeric(dat$dem_agegrp)

Use the methods we discussed in class to adjudicate between the following two models.

Model 1 Estimate a linear model where you predict libcpre_self_num with indsocial, relig_chmember, dem_edugroup, dem_agegrp_num and gender_respondent. This model represents social determinants of self-identification along with some controls.

Model 2 Estimate a linear model where you predict libcpre_self_num with indspend, inc_incgroup_pre, dem_edugroup, dem_agegrp_num and gender_respondent. This model represents a more economically driven identification.

> m1 <- lm(libcpre_self_num ~ indsocial + relig_chmember + dem_edugroup +
+     dem_agegrp_num + gender_respondent, data=na.omit(dat))
> m2 <- lm(libcpre_self_num ~ indspend + inc_incgroup_pre + dem_edugroup +
+     dem_agegrp_num + gender_respondent, data=na.omit(dat))

Now you can do the testing. They could use whatever they like, but I'll look at AIC, BIC, Vuong and Clarke:

> AIC(m1)
[1]
> AIC(m2)
[1]
> BIC(m1)
[1]
> BIC(m2)
[1]
> library(games)
> clarke(m1, m2)

Clarke test for non-nested models

Model 1 log-likelihood:
Model 2 log-likelihood:
Observations: 2007
Test statistic: 1384 (69%)

Model 1 is preferred (p < 2e-16)

Now, use model averaging to evaluate the relative importance of the social and spending variables and to get model coefficients that incorporate selection uncertainty.

> library(MuMIn)
> mods <- list(m1, m2)
> summary(model.avg(m1, m2))

Call: model.avg(object = m1, m2)

Component model call:
lm(formula = <2 unique values>, data = na.omit(dat))

Component models:
   df logLik AICc delta weight

Term codes:
dem_agegrp_num  dem_edugroup  gender_respondent  inc_incgroup_pre  indsocial  indspend  relig_chmember

Model-averaged coefficients:
(full average)
                           Estimate Std. Error Adjusted SE z value Pr(>|z|)
(Intercept)                4.120e   e          e                   < 2e-16
indsocial                  1.212e   e          e                   < 2e-16
relig_chmemberNo           e        e          e                   e-05
dem_edugroupHS             1.492e   e          e
dem_edugroupHS but no BA/S 1.588e   e          e
dem_edugroupBA/S           1.374e   e          e
dem_edugroupGrad Deg       e        e          e
dem_agegrp_num             4.392e   e          e                   e-07
gender_respondent          e        e          e
indspend                   2.926e   e          e
inc_incgroup_pre           4.251e   e          e

(Intercept)                ***
indsocial                  ***
relig_chmemberNo           ***
dem_edugroupHS
dem_edugroupHS but no BA/S
dem_edugroupBA/S
dem_edugroupGrad Deg
dem_agegrp_num             ***
gender_respondent          **
indspend
inc_incgroup_pre

(conditional average)
                           Estimate Std. Error Adjusted SE z value Pr(>|z|)
(Intercept)                                                        < 2e-16
indsocial                                                          < 2e-16
relig_chmemberNo                                                   e-05
dem_edugroupHS
dem_edugroupHS but no BA/S
dem_edugroupBA/S
dem_edugroupGrad Deg
dem_agegrp_num                                                     e-07
gender_respondent
indspend                                                           < 2e-16
inc_incgroup_pre

(Intercept)                ***
indsocial                  ***
relig_chmemberNo           ***
dem_edugroupHS
dem_edugroupHS but no BA/S
dem_edugroupBA/S
dem_edugroupGrad Deg
dem_agegrp_num             ***
gender_respondent          **
indspend                   ***
inc_incgroup_pre
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Relative variable importance:
                     dem_agegrp_num dem_edugroup gender_respondent indsocial
Importance:
N containing models:
                     relig_chmember inc_incgroup_pre indspend
Importance:          1              <0.01            <0.01
N containing models:
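Two small additions to the comparisons above. First, the Vuong test is listed alongside AIC, BIC and Clarke but not shown; a minimal sketch, assuming the vuong() function that ships alongside clarke() in the games package (a positive statistic favors model 1, a negative one model 2):

> library(games)
> vuong(m1, m2)

Second, the component-model weights in the model-averaging output are Akaike weights, which you can reproduce by hand. A minimal sketch, assuming the table is based on AICc from MuMIn (the same arithmetic works for AIC or BIC):

> library(MuMIn)
> ic <- c(AICc(m1), AICc(m2))          # information criterion for each candidate model
> delta <- ic - min(ic)                # differences from the best-fitting model
> exp(-delta/2)/sum(exp(-delta/2))     # Akaike weights; these sum to 1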

Finite Mixture Models

Using the data above, estimate a finite mixture model where you assume that indsocial and indspend operationalize the two different theories (don't use relig_chmember and inc_incgroup_pre in the models yet). Evaluate the resulting model against the linear model where there is an interaction between indsocial and indspend, including the other controls.

Run an OLS regression of libcpre_self_num on indsocial (a composite of social policy attitudes) and indspend (a composite of spending policy attitudes), their interaction and the controls gender_respondent, dem_edugroup and dem_agegrp_num.

> library(flexmix)
> dat <- read.dta('
> dat$libcpre_self_num <- as.numeric(dat$libcpre_self)
> dat$dem_agegrp_num <- as.numeric(dat$dem_agegrp)
> mod <- lm(libcpre_self_num ~ indsocial*indspend + gender_respondent + dem_edugroup + dem_agegrp_num,
+     data=dat)
> library(DAMisc)
> DAintfun2(mod, c("indsocial", "indspend"), hist=TRUE, scale.hist=.3)

[Figure: conditional effect of INDSOCIAL across the range of INDSPEND, and conditional effect of INDSPEND across the range of INDSOCIAL]

Estimate a finite mixture model where indsocial is the variable of interest in one component and indspend is the variable of interest in the other. That is, set the indspend coefficient to zero in the model with indsocial and indsocial's coefficient to zero in the model with indspend. Fix the coefficients on the other regressors to be constant across the two components.
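As an aside, the conditional effects that DAintfun2 plots above can also be computed directly from coef(mod) and vcov(mod). A minimal sketch, assuming indsocial and indspend are numeric composites as described; the 25-point grid of indspend values is an illustrative choice:

> b <- coef(mod)
> V <- vcov(mod)
> z <- seq(min(dat$indspend, na.rm=TRUE), max(dat$indspend, na.rm=TRUE), length=25)
> eff <- b["indsocial"] + b["indsocial:indspend"]*z      # effect of indsocial at each value of indspend
> se <- sqrt(V["indsocial","indsocial"] + z^2*V["indsocial:indspend","indsocial:indspend"] +
+     2*z*V["indsocial","indsocial:indspend"])
> cbind(indspend=z, effect=eff, lower=eff - 1.96*se, upper=eff + 1.96*se)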

> model <- FLXMRglmfix(family = "gaussian", nested=list(k=c(1,1),
+     formula = c(~indsocial, ~indspend)),
+     fixed= ~ dem_agegrp_num + gender_respondent + dem_edugroup)
> out.a <- stepFlexmix(libcpre_self_num ~ 1, k=2, model=model, data=dat, nrep=20)
2 : * * * * * * * * * * * * * * * * * * * *
> mod.refit <- refit(out.a)

How do the two different models fit? Which do you think is better?

> fit1 <- predict(out.a)$Comp.1[[1]]
> fit2 <- predict(out.a)$Comp.2[[1]]
> post <- out.a@posterior$scaled
> predfix1 <- rowSums(cbind(fit1, fit2)*post)
> ## Correlation of fitted values and observed values from the mixture model
> cor(predfix1, dat$libcpre_self)^2
[1]
> ## Correlation using only the fits from the best-predicting theory
> predfix2 <- fit1
> predfix2[which(post[,2] > .5)] <- fit2[which(post[,2] > .5)]
> cor(predfix2, dat$libcpre_self)^2
     [,1]
[1,]
> ## Correlation of fitted values and observed values from the linear model
> cor(mod$fitted, dat$libcpre_self)^2
[1]

Consider that income (an economic predictor operationalized by inc_incgroup_pre) and church membership (a social predictor operationalized by relig_chmember) might tell us something about the probabilities of being in one or the other group. Incorporate that information and see whether, in fact, they do provide information about group membership.

> out.b <- stepFlexmix(libcpre_self ~ 1, k=2, model=model, data=dat, nrep=20,
+     concomitant = FLXPmultinom( ~ inc_incgroup_pre + relig_chmember))
2 : * * * * * * * * * * * * * * * * * * * *
> out.b.refit <- refit(out.b)
> out.b.refit@concomitant
$Comp.2
                     Estimate Std. Error z value Pr(>|z|)
(Intercept)                                              *
inc_incgroup_pre                                    e-07 ***
relig_chmember2. No

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
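To check whether income and church membership actually add information about group membership, you can also compare the two mixtures on an information criterion. A minimal sketch, assuming the logLik/BIC and summary methods that flexmix provides apply to the fitted objects returned above:

> BIC(out.a)      # mixture without concomitant variables
> BIC(out.b)      # mixture with the concomitant (membership) model
> summary(out.b)  # component sizes, posteriors and log-likelihood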

Missing Data and Multiple Imputation

Using lab3b_data.dta, do multiple imputation (with 5 imputations) on all of the variables in the dataset. You can download the data with:

> library(foreign)
> dat <- read.dta('
> dat$libcpre_self <- as.numeric(dat$libcpre_self)
> dat$dem_agegrp <- as.numeric(dat$dem_agegrp)

See how the coefficients in a model of libcpre_self on all the other variables in the dataset change from listwise deletion to multiple imputation.

> library(mice)
> library(mitools)
> mice.out <- mice(dat, printFlag=F)
 iter imp variable
  1   1  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1   2  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1   3  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1   4  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1   5  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   1  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   2  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   3  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   4  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   5  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   1  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   2  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   3  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   4  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   5  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   1  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   2  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   3  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   4  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   5  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   1  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   2  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   3  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   4  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   5  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
> mod.lm <- lm(libcpre_self ~ indsocial + indspend + dem_agegrp +
+     gender_respondent + dem_edugroup + inc_incgroup_pre + relig_chmember, data=dat)
> mod.mids <- lm.mids(libcpre_self ~ indsocial + indspend + dem_agegrp +
+     gender_respondent + dem_edugroup + inc_incgroup_pre + relig_chmember, data=mice.out)
> summary(mod.lm)

Call:
lm(formula = libcpre_self ~ indsocial + indspend + dem_agegrp +
    gender_respondent + dem_edugroup + inc_incgroup_pre + relig_chmember,
    data = dat)

Residuals:
   Min     1Q Median     3Q    Max

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)
(Intercept)                                             < 2e-16 ***
indsocial                                               < 2e-16 ***
indspend                                                   e-15 ***
dem_agegrp                                                 e-06 ***
gender_respondent                                               *
dem_edugroupHS
dem_edugroupHS but no BA/S
dem_edugroupBA/S
dem_edugroupGrad Deg                                            *
inc_incgroup_pre
relig_chmemberNo                                           e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error:  on 1996 degrees of freedom
  (3909 observations deleted due to missingness)
Multiple R-squared: ,  Adjusted R-squared:
F-statistic:  on 10 and 1996 DF,  p-value: < 2.2e-16

> summary(pool(mod.mids))
                   est se t df Pr(>|t|)
(Intercept)                       e+00
indsocial                         e+00
indspend                          e+00
dem_agegrp                        e-09
gender_respondent                 e-02
dem_edugroup                      e-02
dem_edugroup                      e-03
dem_edugroup                      e-03
dem_edugroup                      e-01
inc_incgroup_pre                  e-01
relig_chmember                    e-07
                   lo 95 hi 95 nmis fmi lambda
(Intercept)                     NA
indsocial
indspend
dem_agegrp
gender_respondent
dem_edugroup                    NA
dem_edugroup                    NA
dem_edugroup                    NA
dem_edugroup                    NA
inc_incgroup_pre
relig_chmember                  NA
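mitools is loaded above but never used; the same pooling can also be run through it. A minimal sketch, repeating the model formula above and assuming a mice version in which complete(..., action="all") returns the list of completed data sets:

> library(mitools)
> imps <- imputationList(complete(mice.out, action="all"))
> fits <- with(imps, lm(libcpre_self ~ indsocial + indspend + dem_agegrp +
+     gender_respondent + dem_edugroup + inc_incgroup_pre + relig_chmember))
> summary(MIcombine(fits))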

How do the results change if you use 10 imputations instead of 5?

> mice.out2 <- mice(dat, m=10, printFlag=F)
 iter imp variable
  1   1  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1   2  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1   3  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1   4  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1   5  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1   6  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1   7  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1   8  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1   9  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  1  10  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   1  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   2  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   3  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   4  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   5  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   6  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   7  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   8  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2   9  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  2  10  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   1  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   2  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   3  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   4  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   5  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   6  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   7  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   8  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3   9  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  3  10  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   1  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   2  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   3  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   4  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   5  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   6  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   7  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   8  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4   9  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  4  10  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   1  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   2  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   3  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   4  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   5  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   6  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   7  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   8  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5   9  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
  5  10  indsocial  indspend  dem_edugroup  dem_agegrp  libcpre_self  inc_incgroup_pre  relig_chmember
> mod.mids2 <- lm.mids(as.numeric(libcpre_self) ~ indsocial + indspend +
+     as.numeric(dem_agegrp) + gender_respondent + dem_edugroup + inc_incgroup_pre +
+     relig_chmember, data=mice.out2)
> summary(pool(mod.mids2))
                       est se t df
(Intercept)
indsocial
indspend
as.numeric(dem_agegrp)
gender_respondent
dem_edugroup
dem_edugroup
dem_edugroup
dem_edugroup
inc_incgroup_pre
relig_chmember
                       Pr(>|t|) lo 95 hi 95 nmis fmi
(Intercept)            e                    NA
indsocial              e
indspend               e
as.numeric(dem_agegrp) e                    NA
gender_respondent      e
dem_edugroup           e                    NA
dem_edugroup           e                    NA
dem_edugroup           e                    NA
dem_edugroup           e                    NA
inc_incgroup_pre       e
relig_chmember         e                    NA
                       lambda
(Intercept)
indsocial
indspend
as.numeric(dem_agegrp)
gender_respondent
dem_edugroup
dem_edugroup
dem_edugroup
dem_edugroup
inc_incgroup_pre
relig_chmember
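One way to see whether moving from 5 to 10 imputations matters is to put the two sets of pooled standard errors and fractions of missing information side by side. A minimal sketch, assuming the older mice layout shown above, where summary(pool(...)) returns a matrix with "se" and "fmi" columns and the two pooled fits have their terms in the same order:

> s5  <- summary(pool(mod.mids))     # pooled results from m = 5
> s10 <- summary(pool(mod.mids2))    # pooled results from m = 10
> cbind(se.m5 = s5[, "se"], se.m10 = s10[, "se"],
+       fmi.m5 = s5[, "fmi"], fmi.m10 = s10[, "fmi"])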

Consider two situations: one where non-responders are one standard deviation more liberal than responders, and one where they are one standard deviation more conservative than responders. Do your results change at all?

> library(sensmiceda)
> lrvals <- c(-1,1)*sd(dat$libcpre_self, na.rm=TRUE)
> out <- sens.est(mice.out, list(libcpre_self = lrvals))
libcpre_self pmm
Summary :
     Variable       Method SupPar
[1,] "libcpre_self" "pmm"  " "
libcpre_self pmm
Summary :
     Variable       Method SupPar
[1,] "libcpre_self" "pmm"  " "
> pool.dat <- sens.pool(mod.lm, out, mice.out)
Multiple imputation results:
      MIcombine.default(X[[i]], ...)
                  results se (lower upper) missInfo
(Intercept)                                       %
indsocial                                         %
indspend                                          %
dem_agegrp                                        %
gender_respondent                                 %
dem_edugroup                                      %
dem_edugroup                                      %
dem_edugroup                                      %
dem_edugroup                                      %
inc_incgroup_pre                                  %
relig_chmember                                    %
Multiple imputation results:
      MIcombine.default(X[[i]], ...)
                  results se (lower upper) missInfo
(Intercept)                                       %
indsocial                                         %
indspend                                          %
dem_agegrp                                        %
gender_respondent                                 %
dem_edugroup                                      %
dem_edugroup                                      %
dem_edugroup                                      %
dem_edugroup                                      %
inc_incgroup_pre                                  %
relig_chmember                                    %
Multiple imputation results:
      MIcombine.default(X[[i]], ...)
                  results se (lower upper) missInfo
(Intercept)                                       %
indsocial                                         %
indspend                                          %
dem_agegrp                                        %
gender_respondent                                 %
dem_edugroup                                      %
dem_edugroup                                      %
dem_edugroup                                      %
dem_edugroup                                      %
inc_incgroup_pre                                  %
relig_chmember                                    %
> plot(pool.dat)

[Figure: pooled coefficients with 95% confidence intervals, plotted in separate panels for the shifted imputations (labeled "libcpre_self: 1.47") and the original mice imputations]
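If the sens.est/sens.pool helpers are unavailable, the same one-standard-deviation shift can be built directly into mice through its post-processing hook, following the delta-adjustment pattern documented for mice's post argument. A minimal sketch; the object names and the single positive shift (the "more conservative" scenario) are illustrative:

> ini <- mice(dat, maxit=0)                        # dry run to extract default settings
> post <- ini$post
> delta <- sd(dat$libcpre_self, na.rm=TRUE)        # one-SD shift; use -delta for the "more liberal" case
> post["libcpre_self"] <- paste("imp[[j]][, i] <- imp[[j]][, i] +", delta)
> mice.shift <- mice(dat, post=post, printFlag=FALSE)
> fit.shift <- with(mice.shift, lm(libcpre_self ~ indsocial + indspend + dem_agegrp +
+     gender_respondent + dem_edugroup + inc_incgroup_pre + relig_chmember))
> summary(pool(fit.shift))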
