22s:152 Applied Linear Regression


Chapter 22: Model Selection

In model selection, the idea is to find the smallest set of variables that provides an adequate description of the data. We will consider the available explanatory variables as candidate variables (some candidates may be transformations of others).

Model selection can be challenging. If we have k candidate variables, there are potentially 2^k models to consider (i.e. each term being in or out of a given model). There are many methods for model selection, and we will only talk about a few in this class.

One way to avoid looking at all possible subsets (potentially a very large number of models) is to use a stepwise procedure. For example, consider a backward stepwise method:

1. Start with the largest model, containing all candidate variables.
2. Choose a measure that quantifies what makes a good model (R^2 is not a good choice; it will just choose the largest model every time).
3. Remove the term whose removal most improves the chosen measure.
4. Continue to remove terms one at a time while each removal still provides a better model.
5. When removal of the next term would give you a worse model, stop the procedure. You've found the best model (under this criterion and this search path).

The measure we use to make our choice should consider:

1. The number of explanatory variables in the model (we'll penalize models with too many).
2. The goodness of fit that the model provides.

These express our conflicting interests:
- To describe the data reasonably well (pushes toward more variables).
- To build a model simple enough to be interpretable (pushes toward fewer variables).

Some model selection measures (or criteria)

Adjusted R^2 (written R-bar^2):

    R-bar^2 = 1 - [RSS/(n - k - 1)] / [TSS/(n - 1)]

We prefer a model with a large R-bar^2.

Cross-validation criterion:

    CV = [ sum_{i=1}^{n} ( Yhat_(-i) - Y_i )^2 ] / n

where Yhat_(-i) is the prediction for observation i from the model fitted without using observation i. If you use a lot of parameters, you tend to over-fit the data, and the model will do poorly at predicting a new Y that is not in the model-fitting (or training) data set. We prefer a model with a small CV.
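For least-squares fits, the CV criterion above does not actually require refitting the model n times: the PRESS identity gives the deleted residuals from the ordinary residuals and the leverages, Y_i - Yhat_(-i) = e_i / (1 - h_ii). A minimal R sketch (the function name cv.criterion and the example formula are illustrative, not from a package):

## Leave-one-out cross-validation criterion for a fitted lm,
## using the PRESS shortcut e_i / (1 - h_ii) for the deleted residuals.
cv.criterion <- function(fit) {
  e <- residuals(fit)    # ordinary residuals
  h <- hatvalues(fit)    # leverages h_ii
  mean((e / (1 - h))^2)  # average squared deleted residual
}

## e.g. cv.criterion(lm(y ~ x1 + x2, data = dat))  -- smaller is better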

Akaike information criterion (AIC), assuming normal errors:

    AIC = n log(sigma-hat^2) + 2(k + 1)

(natural log). We prefer a model with a small AIC.

Bayesian information criterion (BIC), assuming normal errors:

    BIC = n log(sigma-hat^2) + (k + 1) log(n)

We prefer a model with a small BIC.

For both AIC and BIC, more parameters will provide a smaller sigma-hat^2, but the last term adds on a penalty related to the number of parameters in the model.

Choose a best model using AIC in a backward stepwise algorithm:

Example: Crime rate data set

Crime-related and demographic statistics for 47 US states in 1960. The data were collected from the FBI's Uniform Crime Report and other government agencies to determine how the variable crime rate depends on the other variables measured in the study.

VARIABLES
RATE: Crime rate as number of offenses reported to police per million population
Age: The number of males of age 14-24 per 1000 population
S: Indicator variable for Southern states (0 = No, 1 = Yes)
Ed: Mean number of years of schooling x 10 for persons of age 25 or older
Ex0: 1960 per capita expenditure on police by state and local government
Ex1: 1959 per capita expenditure on police by state and local government
LF: Labor force participation rate per 1000 civilian urban males age 14-24
M: The number of males per 1000 females
N: State population size in hundred thousands
NW: The number of non-whites per 1000 population
U1: Unemployment rate of urban males per 1000, age 14-24
U2: Unemployment rate of urban males per 1000, age 35-39
W: Median value of transferable goods and assets or family income in tens of $
Pov: The number of families per 1000 earning below 1/2 the median income

Use the step procedure in R to choose a good subset of predictors by subtracting terms one at a time.

> crime.data = read.delim("crime.txt", sep="\t", header=FALSE)
> dimnames(crime.data)[[2]] = c("RATE","Age","S","Ed","Ex0",
      "Ex1","LF","M","N","NW","U1","U2","W","Pov")
> attach(crime.data)
> head(crime.data)
  RATE Age S Ed Ex0 Ex1 LF M N NW U1 U2 W Pov
  ...

## Fit the model including all candidate variables:
> lm.full.out = lm(RATE ~ Age + S + Ed + Ex0 + Ex1 + LF + M + N
                          + NW + U1 + U2 + W + Pov)

## vif() is in the car library:
> vifs = vif(lm.full.out)
> round(vifs, 2)
  Age    S   Ed  Ex0  Ex1   LF    M    N   NW   U1   U2    W  Pov
  ...
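As a check on the formulas above, AIC and BIC can be computed by hand, taking sigma-hat^2 = RSS/n (which is what R's extractAIC(), used internally by step(), uses for linear models). A minimal sketch, assuming crime.data and lm.full.out exist as above:

n   <- nrow(crime.data)                 # 47 states
rss <- sum(residuals(lm.full.out)^2)    # residual sum of squares
k   <- length(coef(lm.full.out)) - 1    # number of predictors (13)

n * log(rss / n) + 2 * (k + 1)          # AIC, as defined above
n * log(rss / n) + (k + 1) * log(n)     # BIC, as defined above

extractAIC(lm.full.out)                 # (edf, AIC): same AIC value
extractAIC(lm.full.out, k = log(n))     # BIC version, as in step(..., k=log(n))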

## The starting AIC is reported on the Start line below; step() then
## removes variables one at a time. (At each step R also lists, best
## first, the AIC that would result from each single-term deletion,
## with <none> marking the current model; the numeric AIC values are
## elided here.)
> model.selection = step(lm.full.out)

Start:  AIC = ...
RATE ~ Age + S + Ed + Ex0 + Ex1 + LF + M + N + NW + U1 + U2 + W + Pov

## Dropping NW gives the lowest AIC, so remove NW and check if we
## should remove another.
Step:  AIC = ...
RATE ~ Age + S + Ed + Ex0 + Ex1 + LF + M + N + U1 + U2 + W + Pov

## Remove LF and check if we should remove another.
Step:  AIC = ...
RATE ~ Age + S + Ed + Ex0 + Ex1 + M + N + U1 + U2 + W + Pov

## Remove N and check if we should remove another.
Step:  AIC = ...
RATE ~ Age + S + Ed + Ex0 + Ex1 + M + U1 + U2 + W + Pov

## Remove S and check if we should remove another.
Step:  AIC = ...
RATE ~ Age + Ed + Ex0 + Ex1 + M + U1 + U2 + W + Pov

## Remove Ex1 and check if we should remove another.
Step:  AIC = ...
RATE ~ Age + Ed + Ex0 + M + U1 + U2 + W + Pov

## Remove M and check if we should remove another.
Step:  AIC = ...
RATE ~ Age + Ed + Ex0 + U1 + U2 + W + Pov

## Remove U1 and check if we should remove another.
Step:  AIC = ...
RATE ~ Age + Ed + Ex0 + U2 + W + Pov
(dropping any remaining term would increase the AIC, so <none> now
sits at the top of the deletion list)

#########################################
## Procedure stops because removing    ##
## any of the remaining variables      ##
## only increases AIC.                 ##
#########################################

## Get the output from the final chosen model:
> summary(model.selection)

Call:
lm(formula = RATE ~ Age + Ed + Ex0 + U2 + W + Pov)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)      ...        ...     ...  ...e-06 ***
Age              ...        ...     ...  ...     **
Ed               ...        ...     ...  ...     ***
Ex0              ...        ...     ...  ...e-07 ***
U2               ...        ...     ...  ...
W                ...        ...     ...  ...
Pov              ...        ...     ...  ...e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: ... on 40 degrees of freedom
Multiple R-squared: ...,  Adjusted R-squared: 0.71
F-statistic: ... on 6 and 40 DF,  p-value: 1.441e-10

> lm.out = lm(RATE ~ Age + Ed + Ex0 + U2 + W + Pov)
> vif(lm.out)
 Age   Ed  Ex0   U2    W  Pov
 ...

You can use the step function with the BIC instead through an option in the step() statement. Setting k = log(n) in the statement changes the criterion to the BIC rather than the AIC (even though some of the output still says AIC).

> model.selection.bic = step(lm.full.out, k=log(47))

##### similar output to previous example... #####

> summary(model.selection.bic)

Call:
lm(formula = RATE ~ Age + Ed + Ex0 + U2 + Pov)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)      ...        ...     ...  ...e-06 ***
Age              ...        ...     ...  ...     **
Ed               ...        ...     ...  ...     ***
Ex0              ...        ...     ...  ...e-11 ***
U2               ...        ...     ...  ...      *
Pov              ...        ...     ...  ...e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.3 on 41 degrees of freedom
Multiple R-squared: ...,  Adjusted R-squared: ...
F-statistic: ... on 5 and 41 DF,  p-value: 1.105e-...

BIC tends to favor smaller models than the AIC, since it has a heavier penalty for using more parameters. The only difference between the two chosen models in this example is that W (for wealth) is also removed from the BIC-chosen model.

You can also use the option direction = "forward" to build a best model, but starting with the full model is generally more reliable.
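A minimal sketch of the forward version, assuming crime.data and lm.full.out from above: start from the intercept-only model and give step() the full-model formula as the upper scope. The returned object's anova component records the sequence of steps taken (this works for the backward search too).

## Forward selection: begin with the intercept-only model and let
## step() add one term at a time, bounded above by the full model.
lm.null.out <- lm(RATE ~ 1, data = crime.data)
forward.selection <- step(lm.null.out,
                          scope = formula(lm.full.out),
                          direction = "forward")

## The step history (terms added and the AIC at each step):
forward.selection$anova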

We mentioned stepwise procedures as a way to get around looking at every single possible model (there are 2^k possibilities). What about actually considering every possible model? Is this feasible?... It depends on the total number of variables.

We can use the regsubsets function in the leaps library to consider the best model of each possible size (1 predictor, 2 predictors, 3 predictors, etc.).

> library(leaps)

## The . below means we'll use all the other variables
## besides RATE as predictors in the largest model.
## For each size of model, we'll only record the single
## best model (nbest=1), and we'll consider models up to
## a 13-variable model (nvmax=13).
> crime.subsets = regsubsets(RATE ~ ., nbest=1, nvmax=13, data=crime.data)

> summary(crime.subsets)
Subset selection object
Call: regsubsets.formula(RATE ~ ., nbest = 1, nvmax = 13, data = crime.data)
13 Variables (and intercept)
1 subsets of each size up to 13
Selection Algorithm: exhaustive
          Age S   Ed  Ex0 Ex1 LF  M   N   NW  U1  U2  W   Pov
1  ( 1 )  " " " " " " "*" " " " " " " " " " " " " " " " " " "
2  ( 1 )  " " " " " " "*" " " " " " " " " " " " " " " " " "*"
3  ( 1 )  " " " " "*" "*" " " " " " " " " " " " " " " " " "*"
4  ( 1 )  "*" " " "*" "*" " " " " " " " " " " " " " " " " "*"
5  ( 1 )  "*" " " "*" "*" " " " " " " " " " " " " "*" " " "*"
6  ( 1 )  "*" " " "*" "*" " " " " " " " " " " " " "*" "*" "*"
7  ( 1 )  "*" " " "*" "*" " " " " " " " " " " "*" "*" "*" "*"
8  ( 1 )  "*" " " "*" "*" " " " " "*" " " " " "*" "*" "*" "*"
9  ( 1 )  "*" " " "*" "*" "*" " " "*" " " " " "*" "*" "*" "*"
10 ( 1 )  "*" "*" "*" "*" "*" " " "*" " " " " "*" "*" "*" "*"
11 ( 1 )  "*" "*" "*" "*" "*" " " "*" "*" " " "*" "*" "*" "*"
12 ( 1 )  "*" "*" "*" "*" "*" "*" "*" "*" " " "*" "*" "*" "*"
13 ( 1 )  "*" "*" "*" "*" "*" "*" "*" "*" "*" "*" "*" "*" "*"

Since Ex0, the 1960 police expenditures, is chosen first and appears in every best model, it may be the most important explanatory variable of crime rate.

The 5-variable model matches the stepwise BIC best model we saw earlier, and the 6-variable model matches the stepwise AIC best model we saw earlier.

If you use regsubsets to record more than just the single best model of each size, you can see how different the BIC values are for the top X best models of each size in a visual plot...

## We'll keep the best 4 models of each size.
> crime.subsets.2 = regsubsets(RATE ~ ., nbest=4, nvmax=8, data=crime.data)

## The next line will give you the graphic for models of size 3 to 6.
## The subsets() function is in the car library.
> subsets(crime.subsets.2, min.size=3, max.size=6, legend=FALSE)

[Figure: BIC plotted against subset size for the best 4 models of sizes 3 through 6; each model is labeled by abbreviated variable names, e.g. A-Ed-E0-U2-P for Age + Ed + Ex0 + U2 + Pov.]

This may be useful if you're deciding between models with similar BIC values and some models seem better for you in terms of your research and which variables are included. The plot also shows the model with the smallest BIC (if you show all subset sizes).
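The summary object returned by summary(crime.subsets) also stores the criteria numerically, so you can pick out the best-by-BIC model without reading the plot. A minimal sketch, assuming crime.subsets from above:

## Numeric criteria for the best model of each size:
rs <- summary(crime.subsets)
names(rs)                 # includes "which", "rsq", "adjr2", "cp", "bic", ...

## Model size minimizing BIC, and the terms in that model:
best.size <- which.min(rs$bic)
rs$which[best.size, ]     # logical vector: which terms are included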

Here we keep track of the best 4 models of each size, up to a model with all 13 variables included. The graphic becomes a bit hard to read when you look at all recorded models, so subsetting the picture (as on the previous page) is useful.

> crime.subsets.3 = regsubsets(RATE ~ ., nbest=4, nvmax=13, data=crime.data)
> subsets(crime.subsets.3, legend=FALSE)

[Figure: BIC against subset size for the best 4 models of every size from 1 to 13 variables; with this many labeled models the plot is cluttered, which is why the subsetted version on the previous page is easier to read.]

AIC function

To compare model 1 to model 2 using AIC, you can just use the AIC function directly.

> model.1 = lm(RATE ~ Ex0 + Ex1 + LF + M + N)
> model.2 = lm(RATE ~ Age + S + Ex0 + Ex1 + U1 + Pov)

> AIC(model.1)
[1] ...
> AIC(model.2)
[1] ...

*smaller AIC is better.

To compare the two models using BIC...

> AIC(model.1, k=log(47))
[1] ...
> AIC(model.2, k=log(47))
[1] ...

*smaller BIC is better.
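One caution, not in the original notes but standard R behavior: AIC() is computed from the full normal log-likelihood, while step() uses extractAIC(), which drops additive constants that are the same for every model fit to the same data. The two scales differ by a constant depending only on n, so compare models within one function rather than mixing the two; differences between models agree:

## Same difference between models on either scale (up to rounding):
AIC(model.1) - AIC(model.2)
extractAIC(model.1)[2] - extractAIC(model.2)[2]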
