Lecture A. Exhaustive and non-exhaustive algorithms without resampling. Samuel Müller and Garth Tarr
2 "Doubt is not a pleasant mental state, but certainty is a ridiculous one." Voltaire ( ) Necessity to select models is great despite philosophical issues Myriad of model/variable selection methods based on loglikelihood, RSS, prediction errors, resampling, etc 2 / 42
3 Some motivating questions
- How to practically select model(s)?
- How to cope with an ever-increasing number of observed variables?
- How to visualise the model building process?
- How to assess the stability of a selected model?
4 Topics
- Selecting models
- Stepwise model selection
- Information criteria, including AIC and BIC
- Exhaustive and non-exhaustive searches
- Regularization methods
5 Body fat data
- N = 252 measurements of men (available in R package mfp)
- Training sample size n = 128 (available in R package mplot)
- Response variable: body fat percentage (Bodyfat)
- 13 explanatory variables (plus an Id column, removed below)

data("bodyfat", package = "mplot")
dim(bodyfat)
## [1] 128  15
names(bodyfat)
##  [1] "Id"      "Bodyfat" "Age"     "Weight"  "Height"  "Neck"
##  [7] "Chest"   "Abdo"    "Hip"     "Thigh"   "Knee"    "Ankle"
## [13] "Bic"     "Fore"    "Wrist"

Source: Johnson (1996)
6 Body fat: Boxplots
bfat = bodyfat[, -1]  # remove the Id column
boxplot(bfat, horizontal = TRUE, las = 1)
7 pairs(bfat)
8 Body fat data: Full model and power set
names(bfat)
##  [1] "Bodyfat" "Age"     "Weight"  "Height"  "Neck"    "Chest"
##  [7] "Abdo"    "Hip"     "Thigh"   "Knee"    "Ankle"   "Bic"
## [13] "Fore"    "Wrist"

Full model:
Bodyfat = β_1 + β_2 Age + β_3 Weight + β_4 Height + β_5 Neck + β_6 Chest + β_7 Abdo + β_8 Hip + β_9 Thigh + β_10 Knee + β_11 Ankle + β_12 Bic + β_13 Fore + β_14 Wrist + ϵ

Power set: here, the intercept is not subject to selection. Therefore p − 1 = 13, and thus the number of possible regression models is M = #A = 2^13 = 8192.
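As a quick sanity check on the size of the search space (a sketch assuming bfat as constructed above), the model count can be computed directly:

2^(ncol(bfat) - 1)  # drop the response column: 13 candidate variables
## [1] 8192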
9 Body fat: Best bivariate fitted model
M0 = lm(Bodyfat ~ Abdo, data = bfat)
summary(M0)
## Call:
## lm(formula = Bodyfat ~ Abdo, data = bfat)
## (Coefficient estimates omitted; both the intercept and Abdo have
## p-values < 2e-16. Residual standard error on 126 degrees of freedom;
## F-statistic on 1 and 126 DF, p-value < 2.2e-16.)
10 Body fat: Full fitted model
M1 = lm(Bodyfat ~ ., data = bfat)
summary(M1)
## Call:
## lm(formula = Bodyfat ~ ., data = bfat)
## (Coefficient estimates omitted; among the 13 predictors only Abdo is
## strongly significant, with a p-value of order 1e-13.)
11 The AIC and BIC
AIC is the most widely known and used model selection method (of course this does not imply it is the best/recommended method to use).

AIC (Akaike, 1973): AIC = −2 logLik + 2 p
BIC (Schwarz, 1978): BIC = −2 logLik + log(n) p

The smaller the AIC/BIC, the better the model. These and many other criteria choose models by minimizing an expression that can be written as Loss + Penalty.
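As a sketch of how these criteria relate to R's built-ins, the Loss + Penalty form can be computed by hand from logLik() for the full model M1 fitted above (note that R counts the error variance as an estimated parameter, so p here is the number of coefficients plus one):

ll = logLik(M1)      # log-likelihood of the fitted model
p = attr(ll, "df")   # number of estimated parameters (coefficients + sigma)
n = nobs(M1)
c(byHand = -2 * as.numeric(ll) + 2 * p, builtIn = AIC(M1))
c(byHand = -2 * as.numeric(ll) + log(n) * p, builtIn = BIC(M1))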
12 Stepwise variable selection
Stepwise, forward and backward:
1. Start with some model, typically the null model (with no explanatory variables) or the full model (with all variables)
2. For each variable in the current model, investigate the effect of removing it
3. Remove the least informative variable, unless this variable is nonetheless supplying significant information about the response
4. For each variable not in the current model, investigate the effect of including it
5. Include the most statistically significant variable not currently in the model (unless no significant variable exists)
6. Go to step 2. Stop only when steps 2-5 produce no change
13 Body fat: Forward search using BIC
M0 = lm(Bodyfat ~ 1, data = bfat)  # Null model
M1 = lm(Bodyfat ~ ., data = bfat)  # Full model
step.fwd.bic = step(M0, scope = list(lower = M0, upper = M1),
                    direction = "forward", trace = FALSE,
                    k = log(128))  # BIC: k = log(n)
# summary(step.fwd.bic)
round(summary(step.fwd.bic)$coef, 3)
## (Coefficient table for (Intercept), Abdo and Weight; values omitted.)

The best model has two features: Abdo and Weight. A backward search using BIC gives the same model (confirm yourself).
14 Body fat: Backward search using AIC
M0 = lm(Bodyfat ~ 1, data = bfat)  # Null model
M1 = lm(Bodyfat ~ ., data = bfat)  # Full model
step.bwd.aic = step(M1, scope = list(lower = M0, upper = M1),
                    direction = "backward", trace = FALSE, k = 2)
round(summary(step.bwd.aic)$coef, 3)
## (Coefficient table for (Intercept), Weight, Neck, Abdo and Fore; values omitted.)

The best model has four variables. Using AIC gives a larger model because 2 < log(128) ≈ 4.85. k = 2 is the default in step().
15 Body fat: Stepwise using AIC
M0 = lm(Bodyfat ~ 1, data = bfat)  # Null model
M1 = lm(Bodyfat ~ ., data = bfat)  # Full model
step.aic = step(M1, scope = list(lower = M0, upper = M1), trace = FALSE)
round(summary(step.aic)$coef, 3)
## (Coefficient table for (Intercept), Weight, Neck, Abdo and Fore; values omitted.)

Here, the default stepwise search gives the same model as the backward search using AIC. This does not hold in general.
16 Criticism of stepwise procedures
Never run (p-value based) automatic stepwise procedures on their own!

"Stepwise regression is probably the most abused computerized statistical technique ever devised. If you think you need stepwise regression to solve a particular problem you have, it is almost certain you do not. Professional statisticians rarely use automated stepwise regression." Wilkinson (1998)

See Wiegand (2010) for a more recent simulation study and review of the performance of stepwise procedures.
17 Exhaustive search
- Exhaustive search is the only technique guaranteed to find the predictor variable subset with the best evaluation criterion
- Since we look over the whole model space, we can identify the best model(s) at each model size
- Sometimes known as best subsets model selection
- The loss component is (typically) the residual sum of squares
- Main drawback: exhaustive searching is computationally intensive
18 The leaps package
- The leaps package performs regression subset selection for linear models, including exhaustive searches (Lumley and Miller, 2009)
- Main function: regsubsets() performs an exhaustive search of the model space to find the best subsets of variables for predicting y in linear regression
- Uses an efficient branch-and-bound algorithm
- Alternative function: leaps() is just a compatibility wrapper for regsubsets()
19 The regsubsets function
Input: full model and various optional parameters.
Output: a regsubsets object, but you interact with it using the summary function, which gives:
- which: a logical matrix indicating which elements are in each model
- rsq: the R-squared for each model
- rss: residual sum of squares for each model
- adjr2: adjusted R-squared
- cp: Mallows' Cp
- bic: Bayesian information criterion
- outmat: a version of the which component that is formatted for printing
20 Body fat: Exhaustive search
require(leaps)
rgsbst.out = regsubsets(Bodyfat ~ ., data = bfat,
                        nbest = 1,       # 1 best model for each dimension
                        nvmax = NULL,    # NULL for no limit on number of variables
                        force.in = NULL, force.out = NULL,
                        method = "exhaustive")
rgsbst.out
## Subset selection object
## Call: regsubsets.formula(Bodyfat ~ ., data = bfat, nbest = 1, nvmax = NULL,
##     force.in = NULL, force.out = NULL, method = "exhaustive")
## 13 Variables (and intercept)
##        Forced in Forced out
## Age        FALSE      FALSE
## Weight     FALSE      FALSE
## Height     FALSE      FALSE
## Neck       FALSE      FALSE
## Chest      FALSE      FALSE
## Abdo       FALSE      FALSE
## Hip        FALSE      FALSE
## Thigh      FALSE      FALSE
## Knee       FALSE      FALSE
## Ankle      FALSE      FALSE
## Bic        FALSE      FALSE
## Fore       FALSE      FALSE
## Wrist      FALSE      FALSE
## 1 subsets of each size up to 13
## Selection Algorithm: exhaustive
21 Body fat: Exhaustive search
summary.out = summary(rgsbst.out)
names(summary.out)
## [1] "which"  "rsq"    "rss"    "adjr2"  "cp"     "bic"    "outmat" "obj"
as.data.frame(summary.out$outmat)
## (The printed matrix has one row per model size, 1 to 13, with a "*"
## marking each variable included in the best model of that size.)
22 Body fat: Exhaustive search and Cp
summary.out$cp
## (13 Cp values, one per model size; values omitted.)
(id = which.min(summary.out$cp))
## [1] 4
summary.out$which[id, ]
## (Intercept)         Age      Weight      Height        Neck       Chest        Abdo
##        TRUE       FALSE        TRUE       FALSE        TRUE       FALSE        TRUE
##         Hip       Thigh        Knee       Ankle         Bic        Fore       Wrist
##       FALSE       FALSE       FALSE       FALSE       FALSE        TRUE       FALSE
names(which(summary.out$which[id, ] == TRUE))
## [1] "(Intercept)" "Weight"      "Neck"        "Abdo"        "Fore"
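The winning subset can then be refitted as an ordinary lm() object; a minimal sketch reusing the objects above (reformulate() is base R):

best.vars = names(which(summary.out$which[id, -1]))  # drop the intercept column
best.fit = lm(reformulate(best.vars, response = "Bodyfat"), data = bfat)
round(summary(best.fit)$coef, 3)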
23 Body fat: Exhaustive search and Cp
plot(rgsbst.out, scale = "Cp", main = "Mallows' Cp")
24 Body fat: Exhaustive search and Cp
plot(summary.out$cp ~ rowSums(summary.out$which),
     ylab = "Cp", xlab = "Number of parameters")
abline(v = rowSums(summary.out$which)[id], col = gray(0.5))
25 Regularisation methods with glmnet
Regularisation methods shrink estimated regression coefficients by imposing a penalty on their sizes. There are different choices for the penalty, and the penalty choice drives the properties of the method. In particular, glmnet implements:
- Lasso regression, using an L1 penalty
- Ridge regression, using an L2 penalty
- Elastic-net, using both an L1 and an L2 penalty
- Adaptive lasso, which takes advantage of feature weights
26 Lasso is least squares with L1 penalty
Simultaneously estimate and select coefficients through

β̂_lasso = β̂_lasso(λ) = argmin_{β ∈ R^p} { (y − Xβ)^T (y − Xβ) + λ ‖β‖_1 },

where the positive tuning parameter λ controls the shrinkage. The optimisation problem above is equivalent to solving

minimise RSS(β) = (y − Xβ)^T (y − Xβ) subject to ∑_{j=1}^p |β_j| ≤ t.
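To see why the L1 penalty can produce exact zeros, consider the special case of an orthonormal design (X^T X = I), where the penalised problem above separates by coordinate and has a closed-form soft-thresholding solution; a small sketch (the threshold λ/2 matches the unscaled RSS used above):

soft = function(b.ls, lambda) sign(b.ls) * pmax(abs(b.ls) - lambda / 2, 0)
soft(c(-3, -0.2, 0.5, 2), lambda = 1)  # small coefficients zeroed, large ones shrunk
## [1] -2.5  0.0  0.0  1.5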
27 Example: Diabetes
We consider the diabetes data and we only analyse main effects. The data from the lars package is the same as in the mplot package but is structured and scaled differently.

data("diabetes", package = "lars")
names(diabetes)  # list; slot x has 10 main effects
## [1] "x"  "y"  "x2"
dim(diabetes)
## [1] 442   3
x = diabetes$x
y = diabetes$y
28 Example: Diabetes
class(x) = NULL  # remove the AsIs class; otherwise a single boxplot is drawn
boxplot(x, cex.axis = 2)
29 Example: Diabetes
lm.fit = lm(y ~ x)  # least squares fit for the full model first
summary(lm.fit)
## Call:
## lm(formula = y ~ x)
## (Coefficient estimates omitted; the intercept, xsex, xbmi, xmap and xltg
## are strongly significant.)
30 Fitting a lasso regression
Lasso regression with the glmnet() function in library(glmnet) takes the design matrix as its first argument and the response vector as its second argument, i.e.

library(glmnet)
lasso.fit = glmnet(x, y)
31 Comments
lasso.fit is an object of class glmnet that contains all the relevant information of the fitted model for further use. Various methods are provided for the object, such as coef, plot, predict and print, that extract and present useful object information. The additional material shows that cross-validation can be used to choose a good value for λ using the function cv.glmnet.

set.seed(1)  # to get a reproducible lambda.min value
lasso.cv = cv.glmnet(x, y)
lasso.cv$lambda.min
## (value omitted)
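The cross-validated fit can then be queried directly at the selected penalty; both coef() and predict() accept s = "lambda.min" (or the more conservative s = "lambda.1se") for cv.glmnet objects:

coef(lasso.cv, s = "lambda.min")                      # coefficients at the CV-selected lambda
predict(lasso.cv, newx = x[1:5, ], s = "lambda.min")  # predictions for the first 5 observations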
32 Lasso regression coefficients
Clearly the regression coefficients are smaller for the lasso regression (this slide) than for least squares (next slide).

round(coef(lasso.fit, s = c(5, 1, 0.5, 0)), 2)
## 11 x 4 sparse Matrix of class "dgCMatrix"
## (Rows: (Intercept), age, sex, bmi, map, tc, ldl, hdl, tch, ltg, glu;
## one column per value of s; values omitted.)
33 Lasso regression vs least squares
round(coef(lm.fit), 2)
## (Least squares coefficients for the intercept and the 10 predictors; values omitted.)
beta.lasso = coef(lasso.fit, s = 1)
sum(abs(beta.lasso[-1]))
## (L1 norm of the lasso coefficients; value omitted.)
beta.ls = coef(lm.fit)
sum(abs(beta.ls[-1]))
## (L1 norm of the least squares coefficients; value omitted.)
34 Lasso coefficient plot
plot(lasso.fit, label = TRUE, cex.axis = 1.5, cex.lab = 1.5)
35 We observe that the lasso:
- Can shrink coefficients exactly to zero
- Simultaneously estimates and selects coefficients
- Paths are not necessarily monotone (e.g. the coefficients for hdl in the above table for different values of s)
36 Further remarks on lasso
- lasso is an acronym for least absolute shrinkage and selection operator
- Theoretically, when the tuning parameter λ = 0 there is no bias and β̂_lasso = β̂_LS (for p < n); however, glmnet approximates the solution
- When λ = ∞, we get β̂_lasso = 0, an estimator with zero variance
- For λ in between, we are balancing bias and variance
- As λ increases, more coefficients are set to zero (fewer variables are selected), and among the nonzero coefficients, more shrinkage is employed
37 Why does the lasso give zero coefficients?
This figure is taken from James, Witten, Hastie, and Tibshirani (2013), "An Introduction to Statistical Learning, with applications in R", with permission from the authors.
38 Ridge is least squares with L2 penalty
As with the lasso, ridge regression shrinks the estimated regression coefficients by imposing a penalty on their sizes:

minimise RSS(β) = (y − Xβ)^T (y − Xβ) subject to ∑_{j=1}^p β_j^2 ≤ t,

where the positive tuning parameter λ controls the shrinkage. However, unlike the lasso, ridge regression does not shrink the parameters all the way to zero. The ridge coefficients have a closed form solution (even when n < p):

β̂_ridge(λ) = (X^T X + λI)^{-1} X^T y
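A minimal sketch of this closed-form solution, assuming standardised predictors and a centred response so that no intercept is needed (glmnet standardises internally and scales λ differently, so the numbers will not match the cv.glmnet output below exactly; lambda = 10 is an arbitrary illustrative value):

ridge.coef = function(X, y, lambda)
  solve(t(X) %*% X + lambda * diag(ncol(X)), t(X) %*% y)
ridge.coef(scale(x), y - mean(y), lambda = 10)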
39 Fitting a ridge regression
Ridge regression with the glmnet() function, by changing one of its default options to alpha = 0:

ridge.fit = glmnet(x, y, alpha = 0)
ridge.cv = cv.glmnet(x, y, alpha = 0)
ridge.cv$lambda.min
## (value omitted)

Comments:
- In general, the optimal λ for ridge is very different from the optimal λ for the lasso
- The scale of β_j^2 and |β_j| is very different
- α is known as the elastic-net mixing parameter
40 Elastic-net: best of both worlds?
Minimising the (penalised) log-likelihood in linear regression with iid errors is equivalent to minimising the (penalised) RSS. More generally, the elastic-net minimises the following objective function over β_0 and β, over a grid of values of λ covering the entire range:

(1/N) ∑_{i=1}^N w_i l(y_i, β_0 + β^T x_i) + λ [ (1 − α) ∑_{j=1}^p β_j^2 + α ∑_{j=1}^p |β_j| ]

This is general enough to include logistic regression, generalised linear models and Cox regression, where l(y_i, ·) is the log-likelihood contribution of observation i and w_i is an optional observation weight.
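A sketch of an elastic-net fit using the mixing parameter directly (alpha = 1 recovers the lasso, alpha = 0 the ridge fit; alpha = 0.5 below is an arbitrary illustrative choice):

enet.fit = glmnet(x, y, alpha = 0.5)
enet.cv = cv.glmnet(x, y, alpha = 0.5)
coef(enet.cv, s = "lambda.min")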
41 References
Akaike, H. (1973). "Information theory and an extension of the maximum likelihood principle". In: Second International Symposium on Information Theory. Ed. by B. Petrov and F. Csaki. Budapest: Akademiai Kiado, pp. 267-281.
James, G., D. Witten, T. Hastie, and R. Tibshirani (2013). An Introduction to Statistical Learning with Applications in R. New York, NY: Springer.
Johnson, R. W. (1996). "Fitting Percentage of Body Fat to Simple Body Measurements". In: Journal of Statistics Education 4.1.
Lumley, T. and A. Miller (2009). leaps: regression subset selection. R package version 2.9.
Schwarz, G. (1978). "Estimating the Dimension of a Model". In: The Annals of Statistics 6.2, pp. 461-464.
Wiegand, R. E. (2010). "Performance of using multiple stepwise algorithms for variable selection". In: Statistics in Medicine 29.15.
Wilkinson, L. (1998). SYSTAT 8 statistics manual. SPSS Inc.
42 devtools::session_info(include_base = FALSE)
## (Session info output: R version 3.5.x on macOS Mojave, x86_64, darwin;
## attached packages include glmnet, leaps, mplot, Matrix, foreach, knitr
## and xaringan, with versions as on CRAN at the time of compilation.)