Lecture A. Exhaustive and non-exhaustive algorithms without resampling. Samuel Müller and Garth Tarr


1 Lecture A: Exhaustive and non-exhaustive algorithms without resampling. Samuel Müller and Garth Tarr

2 "Doubt is not a pleasant mental state, but certainty is a ridiculous one." Voltaire ( ) Necessity to select models is great despite philosophical issues Myriad of model/variable selection methods based on loglikelihood, RSS, prediction errors, resampling, etc 2 / 42

3 Some motivating questions

How to practically select model(s)?
How to cope with an ever-increasing number of observed variables?
How to visualise the model building process?
How to assess the stability of a selected model?

4 Topics

Selecting models
Stepwise model selection
Information criteria including AIC and BIC
Exhaustive and non-exhaustive searches
Regularization methods

5 Body fat data

N = 252 measurements of men (available in R package mfp)
Training sample size n = 128 (available in R package mplot)
Response variable: body fat percentage (Bodyfat)
13 or 14 explanatory variables

data("bodyfat", package = "mplot")
dim(bodyfat)
## [1] 128  15
names(bodyfat)
##  [1] "Id"      "Bodyfat" "Age"     "Weight"  "Height"  "Neck"
##  [7] "Chest"   "Abdo"    "Hip"     "Thigh"   "Knee"    "Ankle"
## [13] "Bic"     "Fore"    "Wrist"

Source: Johnson (1996)

6 Body fat: Boxplots

bfat = bodyfat[, -1]   # remove the Id column
boxplot(bfat, horizontal = TRUE, las = 1)

7 pairs(bfat)

8 Body fat data: Full model and power set

names(bfat)
##  [1] "Bodyfat" "Age"    "Weight" "Height" "Neck"   "Chest"
##  [7] "Abdo"   "Hip"    "Thigh"  "Knee"   "Ankle"  "Bic"
## [13] "Fore"   "Wrist"

Full model:
Bodyfat = β1 + β2 Age + β3 Weight + β4 Height + β5 Neck + β6 Chest + β7 Abdo + β8 Hip + β9 Thigh + β10 Knee + β11 Ankle + β12 Bic + β13 Fore + β14 Wrist + ε

Power set: here, the intercept is not subject to selection, so p - 1 = 13 and thus the number of possible regression models is M = #A = 2^13 = 8192.
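
A quick check of this count in R (a minimal sketch; bfat is the data frame created on slide 6):

ncol(bfat) - 1        # 13 candidate predictors (response column excluded)
2^(ncol(bfat) - 1)    # each predictor is either in or out of the model
## [1] 8192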

9 Body fat: Best bivariate fitted model

M0 = lm(Bodyfat ~ Abdo, data = bfat)
summary(M0)
##
## Call:
## lm(formula = Bodyfat ~ Abdo, data = bfat)
##
## Residuals:
##    Min     1Q Median     3Q    Max
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)                              <2e-16 ***
## Abdo                                     <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: on 126 degrees of freedom
## Multiple R-squared: , Adjusted R-squared:
## F-statistic: on 1 and 126 DF, p-value: < 2.2e-16

10 Body fat: Full fitted model

M1 = lm(Bodyfat ~ ., data = bfat)
summary(M1)
##
## Call:
## lm(formula = Bodyfat ~ ., data = bfat)
##
## Residuals:
##    Min     1Q Median     3Q    Max
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)
## Age
## Weight
## Height
## Neck
## Chest
## Abdo                                       e-13 ***
## Hip
## Thigh
## Knee
## Ankle
## Bic
## Fore
## Wrist

11 The AIC and BIC

AIC is the most widely known and used model selection method (of course this does not imply it is the best/recommended method to use)

AIC (Akaike, 1973): AIC = -2 logLik + 2 p
BIC (Schwarz, 1978): BIC = -2 logLik + log(n) p

The smaller the AIC/BIC, the better the model
These and many other criteria choose models by minimizing an expression that can be written as Loss + Penalty
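
These definitions can be checked directly in R. A minimal sketch using the bivariate model M0 from slide 9; note that for lm objects R's logLik() counts the error variance as an additional parameter, so AIC() and BIC() penalise df = p + 1 rather than p:

ll = logLik(M0)
df = attr(ll, "df")   # number of estimated parameters (p coefficients + sigma^2)
n = nobs(M0)
c(-2 * as.numeric(ll) + 2 * df, AIC(M0))        # manual AIC vs AIC()
c(-2 * as.numeric(ll) + log(n) * df, BIC(M0))   # manual BIC vs BIC()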

12 Stepwise variable selection

Stepwise, forward and backward:
1. Start with some model, typically the null model (with no explanatory variables) or the full model (with all variables)
2. For each variable in the current model, investigate the effect of removing it
3. Remove the least informative variable, unless this variable is nonetheless supplying significant information about the response
4. For each variable not in the current model, investigate the effect of including it
5. Include the most statistically significant variable not currently in the model (unless no significant variable exists)
6. Go to step 2. Stop only if no change in steps 2-5
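
One pass of steps 2-5 can be inspected by hand with drop1() and add1() from base R (a minimal sketch; the three-variable starting model is an arbitrary choice for illustration):

cur = lm(Bodyfat ~ Weight + Height + Abdo, data = bfat)
drop1(cur, test = "F")   # step 2: effect of removing each current variable
add1(cur, scope = Bodyfat ~ Age + Weight + Height + Neck + Chest + Abdo +
       Hip + Thigh + Knee + Ankle + Bic + Fore + Wrist,
     test = "F")         # step 4: effect of adding each remaining variable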

13 Body fat: Forward search using BIC

M0 = lm(Bodyfat ~ 1, data = bfat)   # Null model
M1 = lm(Bodyfat ~ ., data = bfat)   # Full model
step.fwd.bic = step(M0, scope = list(lower = M0, upper = M1),
                    direction = "forward", trace = FALSE,
                    k = log(128))   # BIC: k = log(n)
# summary(step.fwd.bic)
round(summary(step.fwd.bic)$coef, 3)
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)
## Abdo
## Weight

Best model has two features: Abdo and Weight
Backward search using BIC gives the same model (confirm yourself)

14 Body fat: Backward search using AIC

M0 = lm(Bodyfat ~ 1, data = bfat)   # Null model
M1 = lm(Bodyfat ~ ., data = bfat)   # Full model
step.bwd.aic = step(M1, scope = list(lower = M0, upper = M1),
                    direction = "backward", trace = FALSE, k = 2)
round(summary(step.bwd.aic)$coef, 3)
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)
## Weight
## Neck
## Abdo
## Fore

Best model has four variables
Using AIC gives a larger model because 2 < log(128) = 4.85
k = 2 is the default in step()

15 Body fat: Stepwise using AIC

M0 = lm(Bodyfat ~ 1, data = bfat)   # Null model
M1 = lm(Bodyfat ~ ., data = bfat)   # Full model
step.aic = step(M1, scope = list(lower = M0, upper = M1), trace = FALSE)
round(summary(step.aic)$coef, 3)
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)
## Weight
## Neck
## Abdo
## Fore

Here, default stepwise gives the same model as backward using AIC
This does not hold in general

16 Criticism of stepwise procedures

Never run (p-value based) automatic stepwise procedures on their own!

"Stepwise regression is probably the most abused computerized statistical technique ever devised. If you think you need stepwise regression to solve a particular problem you have, it is almost certain you do not. Professional statisticians rarely use automated stepwise regression." Wilkinson (1998)

See Wiegand (2010) for a more recent simulation study and review of the performance of stepwise procedures

17 Exhaustive search

Exhaustive search is the only technique guaranteed to find the predictor variable subset with the best evaluation criterion
Since we look over the whole model space, we can identify the best model(s) at each model size
Sometimes known as best subsets model selection
Loss component is (typically) the residual sum of squares
Main drawback: exhaustive searching is computationally intensive

18 The leaps package

The leaps package performs regression subset selection, including exhaustive searches, for linear models (Lumley and Miller, 2009)
Main function: regsubsets() performs an exhaustive search of the model space to find the best subsets of variables for predicting y in linear regression
Uses an efficient branch-and-bound algorithm
Alternative function: leaps() is just a compatibility wrapper for regsubsets()

19 The regsubsets function

Input: full model and various optional parameters.
Output: a regsubsets object, but you interact with it using the summary function, which gives:

which    A logical matrix indicating which elements are in each model
rsq      The r-squared for each model
rss      Residual sum of squares for each model
adjr2    Adjusted r-squared
cp       Mallows' Cp
bic      Bayesian information criterion
outmat   A version of the which component that is formatted for printing

20 Body fat: Exhaustive search

require(leaps)
rgsbst.out = regsubsets(Bodyfat ~ ., data = bfat,
                        nbest = 1,       # 1 best model for each dimension
                        nvmax = NULL,    # NULL for no limit on number of variables
                        force.in = NULL, force.out = NULL,
                        method = "exhaustive")
rgsbst.out
## Subset selection object
## Call: regsubsets.formula(Bodyfat ~ ., data = bfat, nbest = 1, nvmax = NULL,
##     force.in = NULL, force.out = NULL, method = "exhaustive")
## 13 Variables (and intercept)
##        Forced in Forced out
## Age        FALSE      FALSE
## Weight     FALSE      FALSE
## Height     FALSE      FALSE
## Neck       FALSE      FALSE
## Chest      FALSE      FALSE
## Abdo       FALSE      FALSE
## Hip        FALSE      FALSE
## Thigh      FALSE      FALSE
## Knee       FALSE      FALSE
## Ankle      FALSE      FALSE
## Bic        FALSE      FALSE
## Fore       FALSE      FALSE
## Wrist      FALSE      FALSE
## 1 subsets of each size up to 13
## Selection Algorithm: exhaustive

21 Body fat: Exhaustive search

summary.out = summary(rgsbst.out)
names(summary.out)
## [1] "which"  "rsq"    "rss"    "adjr2"  "cp"     "bic"    "outmat" "obj"
as.data.frame(summary.out$outmat)
##          Age Weight Height Neck Chest Abdo Hip Thigh Knee Ankle Bic Fore Wrist
## 1  ( 1 ) *
## 2  ( 1 ) * *
## 3  ( 1 ) * * *
## 4  ( 1 ) * * * *
## 5  ( 1 ) * * * * *
## 6  ( 1 ) * * * * * *
## 7  ( 1 ) * * * * * * *
## 8  ( 1 ) * * * * * * * *
## 9  ( 1 ) * * * * * * * * *
## 10 ( 1 ) * * * * * * * * * *
## 11 ( 1 ) * * * * * * * * * * *
## 12 ( 1 ) * * * * * * * * * * * *
## 13 ( 1 ) * * * * * * * * * * * * *

22 Body fat: Exhaustive search and Cp

summary.out$cp
## [1]
## [8]
(id = which.min(summary.out$cp))
## [1] 4
summary.out$which[id, ]
## (Intercept)         Age      Weight      Height        Neck       Chest        Abdo
##        TRUE       FALSE        TRUE       FALSE        TRUE       FALSE        TRUE
##         Hip       Thigh        Knee       Ankle         Bic        Fore       Wrist
##       FALSE       FALSE       FALSE       FALSE       FALSE        TRUE       FALSE
names(which(summary.out$which[id, ] == TRUE))
## [1] "(Intercept)" "Weight"      "Neck"        "Abdo"        "Fore"
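
The Cp values reported above can be reproduced from the stored residual sums of squares (a minimal sketch, assuming the usual Mallows' Cp definition with the error variance estimated from the full model M1 of slide 13):

sigma2.full = summary(M1)$sigma^2
n = nrow(bfat)
p = rowSums(summary.out$which)   # parameters per model, including the intercept
cp.manual = summary.out$rss / sigma2.full - n + 2 * p
all.equal(as.numeric(cp.manual), summary.out$cp)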

23 Body fat: Exhaustive search and Cp

plot(rgsbst.out, scale = "Cp", main = "Mallows' Cp")

24 Body fat: Exhaustive search and Cp

plot(summary.out$cp ~ rowSums(summary.out$which),
     ylab = "Cp", xlab = "Number of parameters")
abline(v = rowSums(summary.out$which)[id], col = gray(0.5))

25 Regularisation methods with glmnet

Regularisation methods shrink estimated regression coefficients by imposing a penalty on their sizes
There are different choices for the penalty
The penalty choice drives the properties of the method

In particular, glmnet implements:
Lasso regression, using an L1 penalty
Ridge regression, using an L2 penalty
Elastic-net, using both an L1 and an L2 penalty
Adaptive lasso, which takes advantage of feature weights

26 Lasso is least squares with L1 penalty

Simultaneously estimate and select coefficients through

β̂_lasso = β̂_lasso(λ) = argmin_{β ∈ R^p} { (y - Xβ)^T (y - Xβ) + λ ||β||_1 }

The positive tuning parameter λ controls the shrinkage
The optimisation problem above is equivalent to solving

minimise RSS(β) = (y - Xβ)^T (y - Xβ) subject to Σ_{j=1}^p |β_j| ≤ t
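
For intuition about how the L1 penalty acts: in the special case of an orthonormal design (X^T X = I) the lasso solution has a closed form, with each least-squares coefficient soft-thresholded. A minimal sketch (an illustration only, not how glmnet computes its estimates, which uses coordinate descent):

soft.threshold = function(beta.ls, lambda) {
  # lasso solution when X'X = I: shrink towards zero by lambda/2, then truncate
  sign(beta.ls) * pmax(abs(beta.ls) - lambda / 2, 0)
}
soft.threshold(c(-3, -0.2, 0.5, 2), lambda = 1)
## [1] -2.5  0.0  0.0  1.5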

27 Example: Diabetes

We consider the diabetes data and only analyse main effects
The data from the lars package is the same as in the mplot package, but is structured and scaled differently

data("diabetes", package = "lars")
names(diabetes)   # list; slot x has 10 main effects
## [1] "x"  "y"  "x2"
dim(diabetes)
## [1] 442   3
x = diabetes$x
y = diabetes$y

28 Example: Diabetes

class(x) = NULL   # remove the AsIs class; otherwise a single boxplot is drawn
boxplot(x, cex.axis = 2)

29 Example: Diabetes

lm.fit = lm(y ~ x)   # least squares fit for the full model first
summary(lm.fit)
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
##    Min     1Q Median     3Q    Max
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)                             < 2e-16 ***
## xage
## xsex                                            ***
## xbmi                                       e-14 ***
## xmap                                       e-06 ***
## xtc
## xldl
## xhdl
## xtch
## xltg                                       e-05 ***
## xglu

30 Fitting a lasso regression

Lasso regression with the glmnet() function in library(glmnet) has as a first argument the design matrix and as a second argument the response vector, i.e.

library(glmnet)
lasso.fit = glmnet(x, y)

31 Comments

lasso.fit is an object of class "glmnet" that contains all the relevant information of the fitted model for further use
Various methods are provided for the object, such as coef, plot, predict and print, that extract and present useful object information
The additional material shows that cross-validation can be used to choose a good value for λ through the function cv.glmnet

set.seed(1)   # to get a reproducible lambda.min value
lasso.cv = cv.glmnet(x, y)
lasso.cv$lambda.min
## [1]
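
Predictions at the cross-validated penalty are then obtained with predict(), passing the penalty level through the s argument (a minimal sketch):

head(predict(lasso.fit, newx = x, s = lasso.cv$lambda.min))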

32 Lasso regression coefficients

Clearly the regression coefficients are smaller for the lasso regression (this slide) than for least squares (next slide)

round(coef(lasso.fit, s = c(5, 1, 0.5, 0)), 2)
## 11 x 4 sparse Matrix of class "dgCMatrix"
##
## (Intercept)
## age
## sex
## bmi
## map
## tc
## ldl
## hdl
## tch
## ltg
## glu

33 Lasso regression vs least squares

round(coef(lm.fit), 2)
## (Intercept)        xage        xsex        xbmi        xmap
##
##         xtc        xldl        xhdl        xtch        xltg
##
##        xglu
##
beta.lasso = coef(lasso.fit, s = 1)
sum(abs(beta.lasso[-1]))
## [1]
beta.ls = coef(lm.fit)
sum(abs(beta.ls[-1]))
## [1]

34 Lasso coefficient plot

plot(lasso.fit, label = TRUE, cex.axis = 1.5, cex.lab = 1.5)

35 We observe that the lasso

Can shrink coefficients exactly to zero
Simultaneously estimates and selects coefficients
Paths are not necessarily monotone (e.g. the coefficients for hdl in the above table for different values of s)

36 Further remarks on lasso

lasso is an acronym for least absolute shrinkage and selection operator
Theoretically, when the tuning parameter λ = 0, there is no bias and β̂_lasso = β̂_LS (when p ≤ n); however, glmnet approximates the solution
When λ = ∞, we get β̂_lasso = 0, an estimator with zero variance
For λ in between, we are balancing bias and variance
As λ increases, more coefficients are set to zero (fewer variables are selected), and among the nonzero coefficients, more shrinkage is employed
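
The growth of the selected set as λ decreases can be read off the fitted object: glmnet stores the number of nonzero coefficients at each λ in the df component. A minimal sketch:

head(data.frame(lambda = lasso.fit$lambda, nonzero = lasso.fit$df))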

37 Why does the lasso give zero coefficients?

This figure is taken from James, Witten, Hastie, and Tibshirani (2013), "An Introduction to Statistical Learning, with applications in R", with permission from the authors.

38 Ridge is least squares with L2 penalty

Like the lasso, ridge regression shrinks the estimated regression coefficients by imposing a penalty on their sizes:

minimise RSS(β) = (y - Xβ)^T (y - Xβ) subject to Σ_{j=1}^p β_j^2 ≤ t,

where the positive tuning parameter λ controls the shrinkage
However, unlike the lasso, ridge regression does not shrink the parameters all the way to zero
Ridge coefficients have a closed form solution (even when n < p):

β̂_ridge(λ) = (X^T X + λI)^{-1} X^T y
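
The closed form is easy to compute directly. A minimal sketch (the columns of the lars diabetes x are already centred, so centring y deals with the intercept; this will not match glmnet at the same nominal λ because glmnet standardises and scales the penalty differently):

ridge.beta = function(X, y, lambda) {
  # beta_ridge(lambda) = (X'X + lambda I)^{-1} X'y
  solve(t(X) %*% X + lambda * diag(ncol(X)), t(X) %*% y)
}
round(ridge.beta(x, y - mean(y), lambda = 1), 2)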

39 Fitting a ridge regression

Ridge regression with the glmnet() function: change one of its default options to alpha = 0

ridge.fit = glmnet(x, y, alpha = 0)
ridge.cv = cv.glmnet(x, y, alpha = 0)
ridge.cv$lambda.min
## [1]

Comments

In general, the optimal λ for ridge is very different from the optimal λ for the lasso
The scale of β_j^2 and |β_j| is very different
α is known as the elastic-net mixing parameter

40 Elastic-net: best of both worlds?

Minimising the (penalised) negative log-likelihood in linear regression with iid errors is equivalent to minimising the (penalised) RSS
More generally, the elastic-net minimises the following objective function over β_0 and β, over a grid of values of λ covering the entire range:

-(1/N) Σ_{i=1}^N w_i l(y_i, β_0 + β^T x_i) + λ [ (1 - α) Σ_{j=1}^p β_j^2 + α Σ_{j=1}^p |β_j| ]

This is general enough to include logistic regression, generalised linear models and Cox regression, where l(y_i, ·) is the log-likelihood of observation i, with an optional observation weight w_i
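
A minimal sketch of an elastic-net fit on the diabetes data; the value alpha = 0.5, an arbitrary illustrative choice, mixes the two penalties equally:

enet.fit = glmnet(x, y, alpha = 0.5)
enet.cv = cv.glmnet(x, y, alpha = 0.5)
round(coef(enet.fit, s = enet.cv$lambda.min), 2)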

41 References

Akaike, H. (1973). "Information theory and an extension of the maximum likelihood principle". In: Second International Symposium on Information Theory. Ed. by B. Petrov and F. Csaki. Budapest: Akademiai Kiado, pp. 267-281.
James, G., D. Witten, T. Hastie, and R. Tibshirani (2013). An Introduction to Statistical Learning with Applications in R. New York, NY: Springer.
Johnson, R. W. (1996). "Fitting Percentage of Body Fat to Simple Body Measurements". In: Journal of Statistics Education 4.1.
Lumley, T. and A. Miller (2009). leaps: regression subset selection. R package version 2.9.
Schwarz, G. (1978). "Estimating the Dimension of a Model". In: The Annals of Statistics 6.2, pp. 461-464.
Wiegand, R. E. (2010). "Performance of using multiple stepwise algorithms for variable selection". In: Statistics in Medicine 29.15.
Wilkinson, L. (1998). SYSTAT 8 Statistics Manual. SPSS Inc.

42 devtools::session_info(include_base = FALSE)

## Session info: R version, platform (x86_64, darwin, macOS Mojave) and the
## packages used to compile these slides (attached: foreach, glmnet,
## knitcitations, knitr, leaps, Matrix, mplot) with their versions;
## output abbreviated.
