Lecture 17: Smoothing splines, Local Regression, and GAMs

Lecture 17: Smoothing splines, Local Regression, and GAMs
Reading: Sections
STATS 202: Data mining and analysis
November 6, 2017

Cubic splines

Define a set of knots ξ_1 < ξ_2 < ... < ξ_K. We want the function Y = f(X) to:

1. Be a cubic polynomial between every pair of knots ξ_i, ξ_{i+1}.
2. Be continuous at each knot.
3. Have continuous first and second derivatives at each knot.

It turns out that we can write f in terms of K + 3 basis functions:

    f(X) = β_0 + β_1 X + β_2 X^2 + β_3 X^3 + β_4 h(X, ξ_1) + ... + β_{K+3} h(X, ξ_K),

where

    h(x, ξ) = (x − ξ)^3 if x > ξ, and 0 otherwise.
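
To make the basis concrete, here is a minimal sketch (Python; the data and variable names are illustrative, not from the course) that builds this truncated power basis and fits the coefficients by least squares:

```python
import numpy as np

def truncated_power_basis(x, knots):
    # Columns: 1, x, x^2, x^3, h(x, xi_1), ..., h(x, xi_K)
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.where(x > xi, (x - xi) ** 3, 0.0) for xi in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

knots = np.quantile(x, [0.25, 0.50, 0.75])    # K = 3 knots at quantiles of X
B = truncated_power_basis(x, knots)           # n x (K + 4) design matrix
beta, *_ = np.linalg.lstsq(B, y, rcond=None)  # least-squares coefficients
fitted = B @ beta
```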

Natural cubic splines

A spline which is linear, instead of cubic, for X < ξ_1 and X > ξ_K.

[Figure: cubic spline and natural cubic spline fits of Wage on Age.]

The predictions are more stable for extreme values of X.

Choosing the number and locations of knots

The locations of the knots are typically quantiles of X. The number of knots, K, is chosen by cross-validation.

[Figure: cross-validated mean squared error against the degrees of freedom of a natural spline and of a cubic spline.]
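
A sketch of this selection process, assuming scikit-learn (≥ 1.0) is available. `SplineTransformer` places knots at quantiles of X when `knots="quantile"`; note it uses an equivalent B-spline basis rather than the truncated power basis above:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, (300, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=300)

# Cubic spline basis with knots at quantiles of X, fit by least squares.
model = make_pipeline(SplineTransformer(degree=3, knots="quantile"),
                      LinearRegression())
# Choose the number of knots by 10-fold cross-validated MSE.
search = GridSearchCV(model,
                      {"splinetransformer__n_knots": list(range(3, 13))},
                      scoring="neg_mean_squared_error", cv=10)
search.fit(X, y)
print(search.best_params_)
```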

Smoothing splines

Find the function f which minimizes

    ∑_{i=1}^n (y_i − f(x_i))^2 + λ ∫ f″(x)^2 dx.

The first term is the RSS of the model; the second is a penalty for the roughness of the function.

Facts:
- The minimizer f̂ is a natural cubic spline, with knots at each sample point x_1, ..., x_n.
- Obtaining f̂ is similar to a Ridge regression.
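
As an aside, SciPy ships a routine for exactly this penalized problem; a sketch assuming SciPy ≥ 1.10 (where `make_smoothing_spline` was introduced), with `lam` playing the role of λ:

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 10, 200))   # abscissas must be increasing
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

# Minimizes sum_i (y_i - f(x_i))^2 + lam * integral of f''(t)^2 dt.
spline = make_smoothing_spline(x, y, lam=1.0)
grid = np.linspace(0, 10, 500)
smoothed = spline(grid)                # evaluate the fitted natural cubic spline
```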

Regression splines vs. Smoothing splines

Cubic regression splines:
- Fix the locations of K knots at quantiles of X; the number of knots K < n.
- Find the natural cubic spline f̂ which minimizes the RSS, ∑_{i=1}^n (y_i − f(x_i))^2.
- Choose K by cross-validation.

Smoothing splines:
- Put n knots at x_1, ..., x_n. We could find a cubic spline which makes the RSS = 0: overfitting!
- Instead, we obtain the fitted values f̂(x_1), ..., f̂(x_n) through an algorithm similar to Ridge regression.
- The function f̂ is the only natural cubic spline that has these fitted values.

Deriving a smoothing spline

1. Show that if you fix the values f(x_1), ..., f(x_n), the roughness ∫ f″(x)^2 dx is minimized by a natural cubic spline (Problem 5.7 in ESL).

2. Deduce that the solution to the smoothing spline problem is a natural cubic spline, which can be written in terms of its basis functions:

    f(x) = β_0 + β_1 f_1(x) + ... + β_{n+3} f_{n+3}(x).

Deriving a smoothing spline

3. Letting N be the matrix with entries N(i, j) = f_j(x_i), we can write the objective function as

    (y − Nβ)^T (y − Nβ) + λ β^T Ω_N β,    where Ω_N(i, j) = ∫ f_i″(t) f_j″(t) dt.

4. By simple calculus, the coefficients β̂ which minimize this objective are

    β̂ = (N^T N + λ Ω_N)^{−1} N^T y.

Deriving a smoothing spline

5. Note that the predicted values are a linear function of the observed values:

    ŷ = N (N^T N + λ Ω_N)^{−1} N^T y = S_λ y.

6. The degrees of freedom for a smoothing spline are

    Trace(S_λ) = S_λ(1, 1) + S_λ(2, 2) + ... + S_λ(n, n).
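
A small numerical sketch of steps 3 through 6. The basis `N` and penalty `Omega` below are illustrative stand-ins (a polynomial basis with a diagonal penalty), not the actual natural cubic spline basis and its second-derivative penalty; the linear algebra is the same:

```python
import numpy as np

def smoother_matrix(N, Omega, lam):
    # S_lam = N (N^T N + lam * Omega)^{-1} N^T
    return N @ np.linalg.solve(N.T @ N + lam * Omega, N.T)

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, 50))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

N = np.vander(x, 8, increasing=True)             # illustrative basis matrix
Omega = np.diag(np.arange(8, dtype=float) ** 3)  # illustrative roughness penalty

for lam in (1e-4, 1e-1):
    S = smoother_matrix(N, Omega, lam)
    y_hat = S @ y              # step 5: fitted values are linear in y
    df = np.trace(S)           # step 6: effective degrees of freedom
    print(f"lam = {lam:g}: effective df = {df:.2f}")
```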

Choosing the regularization parameter λ

We typically choose λ through cross-validation. Fortunately, we can solve the problem for every value of λ with the same complexity as diagonalizing an n × n matrix (just like in Ridge regression).

There is a shortcut for LOOCV:

    RSS_loocv(λ) = ∑_{i=1}^n (y_i − f̂_λ^{(−i)}(x_i))^2 = ∑_{i=1}^n [ (y_i − f̂_λ(x_i)) / (1 − S_λ(i, i)) ]^2.
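
For a fixed basis and penalty this shortcut is exact, and it is easy to verify numerically; a self-contained sketch (same illustrative basis and penalty as in the previous block) compares it against brute-force leave-one-out refits:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 40
x = np.sort(rng.uniform(0, 1, n))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=n)
N = np.vander(x, 6, increasing=True)             # illustrative basis
Omega = np.diag(np.arange(6, dtype=float) ** 3)  # illustrative penalty
lam = 1e-3

S = N @ np.linalg.solve(N.T @ N + lam * Omega, N.T)
shortcut = np.sum(((y - S @ y) / (1 - np.diag(S))) ** 2)

brute = 0.0                    # refit n times, leaving one point out each time
for i in range(n):
    keep = np.arange(n) != i
    beta = np.linalg.solve(N[keep].T @ N[keep] + lam * Omega,
                           N[keep].T @ y[keep])
    brute += (y[i] - N[i] @ beta) ** 2

print(shortcut, brute)         # the two agree up to rounding error
```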

Choosing the regularization parameter λ

[Figure: smoothing-spline fits of Wage on Age, including one whose 6.8 effective degrees of freedom were selected by LOOCV.]

Local linear regression

Idea: at each point, use a regression function fit only to the nearest neighbors of that point. This generalizes KNN regression, which is a form of local constant regression.

The span is the fraction of training samples used in each regression.

Local linear regression

To predict the regression function f at an input x:

1. Assign a weight K_i to each training point x_i, such that:
   - K_i = 0 unless x_i is one of the k nearest neighbors of x;
   - K_i decreases as the distance d(x, x_i) increases.

2. Perform a weighted least squares regression; i.e. find (β_0, β_1) which minimize

       ∑_{i=1}^n K_i (y_i − β_0 − β_1 x_i)^2.

3. Predict f̂(x) = β̂_0 + β̂_1 x.

A sketch of this procedure appears below.
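
A minimal sketch of the three steps. The tricube weight function is an assumption borrowed from loess; the slide only requires weights that vanish outside the k nearest neighbors and decay with distance:

```python
import numpy as np

def local_linear(x0, x, y, span=0.5):
    # Step 1: weights that are zero outside the k nearest neighbors of x0
    # and decay with distance (tricube kernel).
    k = int(np.ceil(span * len(x)))
    d = np.abs(x - x0)
    u = np.clip(d / np.sort(d)[k - 1], 0.0, 1.0)
    w = (1 - u**3) ** 3
    # Step 2: weighted least squares for (beta_0, beta_1).
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    # Step 3: predict at x0.
    return beta[0] + beta[1] * x0

rng = np.random.default_rng(5)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)
grid = np.linspace(0, 10, 100)
fit = np.array([local_linear(x0, x, y, span=0.2) for x0 in grid])
```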

Local linear regression

[Figure: local linear regression fits of Wage on Age with span 0.2 (16.4 degrees of freedom) and span 0.7 (5.3 degrees of freedom).]

The span, k/n, is chosen by cross-validation.

Generalized Additive Models (GAMs)

An extension of non-linear models to multiple predictors: instead of the linear model

    wage = β_0 + β_1 · year + β_2 · age + β_3 · education + ε,

we fit

    wage = β_0 + f_1(year) + f_2(age) + f_3(education) + ε.

The functions f_1, ..., f_p can be polynomials, natural splines, smoothing splines, local regressions, etc.

Fitting a GAM

If the functions f_1, ..., f_p have a basis representation (natural cubic splines, polynomials, step functions), we can simply use least squares to fit

    wage = β_0 + f_1(year) + f_2(age) + f_3(education) + ε.
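
A sketch of this least-squares approach, assuming scikit-learn; the data are synthetic stand-ins for the Wage variables, and each predictor gets its own basis expansion inside one design matrix that is fit jointly:

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, SplineTransformer

rng = np.random.default_rng(6)
n = 500
year = rng.integers(2003, 2010, n).astype(float)
age = rng.uniform(18, 80, n)
edu = rng.integers(0, 5, n).astype(float)        # coded education level
wage = 0.1 * year + np.sin(age / 10) + edu + rng.normal(size=n)
X = np.column_stack([year, age, edu])

# One basis expansion per predictor, then a single least-squares fit:
# wage = beta_0 + f_1(year) + f_2(age) + f_3(education) + noise.
gam = make_pipeline(
    ColumnTransformer([
        ("f1_year", SplineTransformer(n_knots=4), [0]),  # B-spline basis in year
        ("f2_age", SplineTransformer(n_knots=5), [1]),   # B-spline basis in age
        ("f3_edu", OneHotEncoder(), [2]),                # step function in education
    ]),
    LinearRegression(),
)
gam.fit(X, wage)
```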

Fitting a GAM

Otherwise, we can use backfitting:

1. Keep f_2, ..., f_p fixed, and fit f_1 using the partial residuals
   y_i − β_0 − f_2(x_i2) − ... − f_p(x_ip)
   as the response.

2. Keep f_1, f_3, ..., f_p fixed, and fit f_2 using the partial residuals
   y_i − β_0 − f_1(x_i1) − f_3(x_i3) − ... − f_p(x_ip)
   as the response.

3. Continue through f_3, ..., f_p, then iterate until the fits stabilize.

This works for smoothing splines and local regression; a sketch follows this list.
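
A minimal backfitting sketch. The inner smoother here is a cubic polynomial fit of the partial residuals, a deliberately simple stand-in; a smoothing spline or local regression smoother could be dropped in unchanged:

```python
import numpy as np

def smooth(xj, r):
    # Stand-in 1-D smoother: cubic polynomial fit to the partial residuals.
    return np.polyval(np.polyfit(xj, r, 3), xj)

def backfit(X, y, n_iter=20):
    n, p = X.shape
    beta0 = y.mean()
    f = np.zeros((n, p))                 # current fitted values of f_1, ..., f_p
    for _ in range(n_iter):
        for j in range(p):
            # Partial residuals: subtract the intercept and all other f_k.
            r = y - beta0 - (f.sum(axis=1) - f[:, j])
            f[:, j] = smooth(X[:, j], r)
            f[:, j] -= f[:, j].mean()    # center each f_j for identifiability
    return beta0, f

rng = np.random.default_rng(7)
X = rng.uniform(-2, 2, (400, 3))
y = 1 + np.sin(X[:, 0]) + X[:, 1]**2 + 0.5 * X[:, 2] + rng.normal(scale=0.3, size=400)
beta0, f_hat = backfit(X, y)
```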

Properties of GAMs

- GAMs are a step from linear regression toward a fully nonparametric method.
- The only constraint is additivity. This can be partially addressed by adding key interaction variables X_i X_j.
- We can report degrees of freedom for most non-linear functions.
- As in linear regression, we can examine the significance of each of the variables.

Example: Regression for Wage

[Figure: fitted functions f_1(year), f_2(age), and f_3(education), with education levels <HS, HS, <Coll, Coll, >Coll.]

- year: natural spline with df = 4.
- age: natural spline with df = 5.
- education: step function.

Example: Regression for Wage

[Figure: fitted functions f_1(year), f_2(age), and f_3(education), as above.]

- year: smoothing spline with df = 4.
- age: smoothing spline with df = 5.
- education: step function.

GAMs for classification

We can model the log-odds in a classification problem using a GAM:

    log [ P(Y = 1 | X) / P(Y = 0 | X) ] = β_0 + f_1(X_1) + ... + f_p(X_p).

The fitting algorithm is a version of backfitting, but we won't discuss the details.
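
For completeness, a sketch of fitting such a model with the third-party pygam package (an assumption: pygam is not part of the course material, and its backfitting-style internals stay hidden behind the API); the data are synthetic stand-ins:

```python
import numpy as np
from pygam import LogisticGAM, f, s   # third-party: pip install pygam

rng = np.random.default_rng(8)
n = 500
X = np.column_stack([
    rng.uniform(2003, 2009, n),       # year
    rng.uniform(18, 80, n),           # age
    rng.integers(0, 5, n),            # education level, coded 0..4
])
logit = -4 + 0.05 * (X[:, 1] - 40) + 0.5 * X[:, 2]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# log-odds = beta_0 + f_1(year) + f_2(age) + f_3(education)
gam = LogisticGAM(s(0) + s(1) + f(2)).fit(X, y)
probs = gam.predict_proba(X)          # estimated P(Y = 1 | X)
```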

Example: Classification for Wage > 250

[Figure: fitted functions f_1(year), f_2(age), and f_3(education) on the logit scale.]

- year: linear.
- age: smoothing spline with df = 5.
- education: step function.

Example: Classification for Wage > 250

[Figure: the same fits after excluding samples with education < HS.]

- year: linear.
- age: smoothing spline with df = 5.
- education: step function.
- Samples with education < HS are excluded.
