Spline Models. Introduction to CS and NCS. Regression splines. Smoothing splines

Size: px
Start display at page:

Download "Spline Models. Introduction to CS and NCS. Regression splines. Smoothing splines"

Transcription

1 Spline Models Introduction to CS and NCS Regression splines Smoothing splines 3

2 Cubic Splines a knots: a< 1 < 2 < < m <b A function g defined on [a, b] is a cubic spline w.r.t knots { i } m i=1 if: 1) g is a cubic polynomial in each of the m +1intervals, g(x) =d i x 3 + c i x 2 + b i x + a i, x 2 [ i, i+1 ] where i =0:m, 0 = a and m+1 = b; 2) g is continuous up to the 2nd derivative: since g is continuous up to the 2nd derivative for any point inside an interval, it su ces to check g (0,1,2) ( + i )=g(0,1,2) ( i ),i=1:m. a From now on, x 2 R is one-dimensional. 4

3 How many free parameters we need to represent g? m +4. We need 4 parameters (d 1,c i,b i,a i ) for each of the (m + 1) intervals, but we also have 3 constraints at each of the m knots, so 4(m + 1) 3m = m +4. 5

4 Suppose the knots { i } m i=1 are given. If g 1 (x) and g 2 (x) are two cubic splines, so is a 1 g 1 (x)+a 2 g 2 (x), wherea 1 and a 2 are two constants. That is, for a set of given knots, the corresponding cubic splines form a linear space (of functions) with dim (m + 4). 6

5 A set of basis functions for cubic splines (wrt knots { i } m i=1 ) is given by h 0 (x) = 1; h 1 (x) =x; h 2 (x) =x 2 ; h 3 (x) =x 3 ; h i+3 (x) =(x i ) 3 +, i =1, 2,..., m. That is, any cubic spline f(x) can be uniquely expressed as f(x) = 0 + m+3 X i=1 jh j (x). Of course, there are many other choices of the basis functions. For example, R uses the B-splines basis functions. 7

6 Natural Cubic Splines (NCS) A cubic spline on [a, b] is a NCS if its second and third derivatives are zero at a and b. That is, a NCS is linear in the two extreme intervals [a, 1 ] and [ m,b]. Note that the linear function in two extreme intervals are totally determined by their neighboring intervals. The degree of freedom of NCS s with m knots is m. For a curve estimation problem with data (x i,y i ) n i=1,ifweputnknots at the n data points (assumed to be unique), then we obtain a smooth curve (using NCS) passing through all y s. 8

7 Regression Splines A basis expansion approach: g(x) = 1 h 1 (x)+ 2 h 2 (x)+ + p h p (x), where p = m +4for regression with cubic splines and p = m for NCS. Represent the model on the observed n data points using matrix notation, ˆ = arg min ky F k 2, 9

8 where 0 y 1 y 2 y n 1 C A n 1 = 0 h 1 (x 1 ) h 2 (x 1 ) h p (x 1 ) h 1 (x 2 ) h 2 (x 2 ) h p (x 2 ) h 1 (x n ) h 2 (x n ) h p (x n ) 1 C A n p 0 1 p 1 C A p 1 We can obtain the design matrix F by commands bs or ns in R, andthen call the regression function lm. Use K-fold CV to select the number of knots. 10

9 Understand how R counts the degree-of-feedom. To generate a cubic spline basis for a given set of x i s, you can use the command bs. You can tell R the location of knots. Or you can tell R the df. Recall that a cubic spline with m knots has m +4df, so we need m = df 4 knots. By default, R puts knots at the 1/(m + 1),..., m/(m + 1) quantiles of x 1:n. 11

10 How R counts the df is a little confusing. The df in command bs actually means the number of columns of the design matrix returned by bs.soifthe intercept is not included in the design matrix (which is the default), then the df in command bs is equal to the real df minus 1. So the following three design matrices (the first two are of n 5 and the last one is of n 6) correspond to the same regression model with cubic splines of df 6. > bs(x, knots=quantile(x, c(1/3, 2/3))); > bs(x, df=5); > bs(x, df=6, intercept=true); 12

11 To generate a NCS basis for a given set of x i s, use the command ns. Recall that the linear functions in the two extreme intervals are totally determined by the other cubic splines. So data points in the two extreme intervals (i.e., outside the two boundary knots) are wasted since they do not a ect the fitting. Therefore, by default, R puts the two boundary knots as the min and max of x i s. You can tell R the location of knots, which are the interior knots. Recall that a NCS with m knots has m df. So the df is equal to the number of (interior) knots plus 2, where 2 means the two boundary knots. 13

12 Or you can tell R the df. If intercept = TRUE, thenweneedm = df 2 knots, otherwise we need m = df 1 knots. Again, by default, R puts knots at the 1/(m + 1),..., m/(m + 1) quantiles of x 1:n. The following three design matrices (the first two are of n 3 and the last one is of n 4) correspond to the same regression model with NCS of df 4. > ns(x, knots=quantile(x, c(1/3, 2/3))); > ns(x, df=3); > ns(x, df=4, intercept=true); 14

13 Choice of Knots Location of knots: to simplify this problem, we ignore the selection of locations by default, the knots are located at the quantiles of x i s. Number of knots: can be formulated as a variable selection problem (an easier version, since there are just p models, not 2 p ). AIC/BIC/R 2 adj m-fold CV (cross-validation) 15

14 Summary: Regression Splines Use LS to fit a spline model: Specify the DF a p, and then fit a regression model with a design matrix of p columns (including the intercept). How to do it in R? How to select the number/location of knots? a Not the polynomial degree, but the DF of the spline, related to the number of knots. 16

15 Smoothing Splines In Regression Splines (let s use NCS), we need to choose the number and the location of knots. What s a Smoothing Spline? Startwithaneasy but horrible solution: put knots at all the observed data points (x 1,...,x n ): y n 1 = F n n n 1. Instead of selecting knots, let s do ridge-type shrinkage ( will be defined later): min h ky F k 2 + t i, where the tuning parameter is often chosen by CV or GCV. Next we ll see how smoothing splines are derived from a di erent aspect. 19

16 Roughness Penalty Approach Let S[a, b] be the space of all smooth functions defined on [a, b]. Among all the functions in S[a, b], look for the minimizer of the following penalized residual sum of squares RSS(g, )= nx [y i g(x i )] 2 + i=1 Z b a [g 00 (x)] 2 dx, (1) where is a smoothing parameter. Theorem. ĝ = arg min RSS(g, ) is a NCS with knots at the n data points x 1,...,x n (x i 6= x j ). 20

17 (WLOG, assume n 2.) Let g be a function on [a, b] and g be a NCS with g(x i )= g(x i ), i =1:n. Does such g exist? Then Z g 00 2 Z g 00 2 ( ) with equality only if g g. PROOF :Leth(x) =g(x) g(x). Soh(x i )=0for i =1,...,n. Then ( ) holds true because Z g 002 = Z g Z Z +2 g 00 h 00 {z } =0 h

18 Smoothing Splines Write g(x) = P n i=1 ih i (x) where h i s are basis functions for NCS with knots at x 1,...,x n. nx [y i g(x i )] 2 =(y F ) t (y F ), i=1 where F n n with F ij = h j (x i ). Z b a g 00(x) 2 dx = Z h X = X i,j i i j Z ih 00 i (x)i 2dx h 00 i (x)h 00 j (x)dx = t, where n n with ij = R b a h00 i (x)h00 j (x)dx. 22

19 So and the solution is RSS(, )=(y F ) t (y F )+ t, ˆ = arg min RSS(, ) = (F t F + ) 1 F t y 23

20 Demmler & Reinsch (1975): a basis with double orthogonality property, i.e. F t F = I, = diag(d i ), where d 1 = d 2 =0(Why?). Using this basis, we have ˆ = (F t F + ) 1 F t y = (I + diag(d i )) 1 F t y, i.e., ˆi = 1 1+ d i ˆ(LS) i. 24

21 Smoother matrix S ŷ = Fˆ = F(F t F + ) 1 F t y = S y. Using D&R basis, S 1 = Fdiag F t. 1+ d i So columns of F are the eigen-vectors of S, which does not depend on. E ective df of a smoothing spline: df ( )=trs = nx i= d i. 25

22 Choice of Leave-one-out CV CV( ) = 1 n = 1 n nx i=1 nx i=1 [ y i ĝ [ i] (x i )] 2 2 yi ĝ(x i ). 1 S (i, i) Generalized CV nx yi ĝ(x i ) 2 GCV( )= 1 n i=1 1 1 n trs 26

23 Summary: Smoothing Splines Start with a model with the maximum complexity: NCS with knots at n (unique) x points. Fit a Ridge Regression model on the data. If we parameterize the NCS function space by the DR basis, then the design matrix is orthogonal and the corresponding coe cient is penalized di erently for each basis: no penalty for the two linear basis functions, higher penalty for wigglier basis functions. How to do it in R? How to select the tuning parameter or equivalently the df? What if we have collected two obs at the same location x? 27

24 Weighted Smoothing Splines Suppose the first two obs have the same x value, i.e., Then (x 1,y 1 ), (x 2,y 2 ), where x 1 = x 2. y1 g(x 1 ) 2 + y2 g(x 1 ) 2 = 2X i=1 yi y 1 + y y 1 + y 2 2 g(x 1 ) 2 = y 1 y 1 + y y 1 + y y2 y 1 + y 2 2 g(x 1 ) 2 2 So we can replace the first two obs by one, (x 1, y 1+y 2 2 ), and its weight is 2 while the weights for other obs are 1. 28

Moving Beyond Linearity

Moving Beyond Linearity Moving Beyond Linearity The truth is never linear! 1/23 Moving Beyond Linearity The truth is never linear! r almost never! 1/23 Moving Beyond Linearity The truth is never linear! r almost never! But often

More information

Moving Beyond Linearity

Moving Beyond Linearity Moving Beyond Linearity Basic non-linear models one input feature: polynomial regression step functions splines smoothing splines local regression. more features: generalized additive models. Polynomial

More information

This is called a linear basis expansion, and h m is the mth basis function For example if X is one-dimensional: f (X) = β 0 + β 1 X + β 2 X 2, or

This is called a linear basis expansion, and h m is the mth basis function For example if X is one-dimensional: f (X) = β 0 + β 1 X + β 2 X 2, or STA 450/4000 S: February 2 2005 Flexible modelling using basis expansions (Chapter 5) Linear regression: y = Xβ + ɛ, ɛ (0, σ 2 ) Smooth regression: y = f (X) + ɛ: f (X) = E(Y X) to be specified Flexible

More information

Splines and penalized regression

Splines and penalized regression Splines and penalized regression November 23 Introduction We are discussing ways to estimate the regression function f, where E(y x) = f(x) One approach is of course to assume that f has a certain shape,

More information

Last time... Bias-Variance decomposition. This week

Last time... Bias-Variance decomposition. This week Machine learning, pattern recognition and statistical data modelling Lecture 4. Going nonlinear: basis expansions and splines Last time... Coryn Bailer-Jones linear regression methods for high dimensional

More information

Lecture 17: Smoothing splines, Local Regression, and GAMs

Lecture 17: Smoothing splines, Local Regression, and GAMs Lecture 17: Smoothing splines, Local Regression, and GAMs Reading: Sections 7.5-7 STATS 202: Data mining and analysis November 6, 2017 1 / 24 Cubic splines Define a set of knots ξ 1 < ξ 2 < < ξ K. We want

More information

Splines. Patrick Breheny. November 20. Introduction Regression splines (parametric) Smoothing splines (nonparametric)

Splines. Patrick Breheny. November 20. Introduction Regression splines (parametric) Smoothing splines (nonparametric) Splines Patrick Breheny November 20 Patrick Breheny STA 621: Nonparametric Statistics 1/46 Introduction Introduction Problems with polynomial bases We are discussing ways to estimate the regression function

More information

The theory of the linear model 41. Theorem 2.5. Under the strong assumptions A3 and A5 and the hypothesis that

The theory of the linear model 41. Theorem 2.5. Under the strong assumptions A3 and A5 and the hypothesis that The theory of the linear model 41 Theorem 2.5. Under the strong assumptions A3 and A5 and the hypothesis that E(Y X) =X 0 b 0 0 the F-test statistic follows an F-distribution with (p p 0, n p) degrees

More information

Lecture 16: High-dimensional regression, non-linear regression

Lecture 16: High-dimensional regression, non-linear regression Lecture 16: High-dimensional regression, non-linear regression Reading: Sections 6.4, 7.1 STATS 202: Data mining and analysis November 3, 2017 1 / 17 High-dimensional regression Most of the methods we

More information

A popular method for moving beyond linearity. 2. Basis expansion and regularization 1. Examples of transformations. Piecewise-polynomials and splines

A popular method for moving beyond linearity. 2. Basis expansion and regularization 1. Examples of transformations. Piecewise-polynomials and splines A popular method for moving beyond linearity 2. Basis expansion and regularization 1 Idea: Augment the vector inputs x with additional variables which are transformation of x use linear models in this

More information

Chapter 5: Basis Expansion and Regularization

Chapter 5: Basis Expansion and Regularization Chapter 5: Basis Expansion and Regularization DD3364 April 1, 2012 Introduction Main idea Moving beyond linearity Augment the vector of inputs X with additional variables. These are transformations of

More information

1D Regression. i.i.d. with mean 0. Univariate Linear Regression: fit by least squares. Minimize: to get. The set of all possible functions is...

1D Regression. i.i.d. with mean 0. Univariate Linear Regression: fit by least squares. Minimize: to get. The set of all possible functions is... 1D Regression i.i.d. with mean 0. Univariate Linear Regression: fit by least squares. Minimize: to get. The set of all possible functions is... 1 Non-linear problems What if the underlying function is

More information

Four equations are necessary to evaluate these coefficients. Eqn

Four equations are necessary to evaluate these coefficients. Eqn 1.2 Splines 11 A spline function is a piecewise defined function with certain smoothness conditions [Cheney]. A wide variety of functions is potentially possible; polynomial functions are almost exclusively

More information

Generalized Additive Models

Generalized Additive Models :p Texts in Statistical Science Generalized Additive Models An Introduction with R Simon N. Wood Contents Preface XV 1 Linear Models 1 1.1 A simple linear model 2 Simple least squares estimation 3 1.1.1

More information

Abstract R-splines are introduced as splines t with a polynomial null space plus the sum of radial basis functions. Thin plate splines are a special c

Abstract R-splines are introduced as splines t with a polynomial null space plus the sum of radial basis functions. Thin plate splines are a special c R-Splines for Response Surface Modeling July 12, 2000 Sarah W. Hardy North Carolina State University Douglas W. Nychka National Center for Atmospheric Research Raleigh, NC 27695-8203 Boulder, CO 80307

More information

Curve fitting using linear models

Curve fitting using linear models Curve fitting using linear models Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark September 28, 2012 1 / 12 Outline for today linear models and basis functions polynomial regression

More information

Computational Physics PHYS 420

Computational Physics PHYS 420 Computational Physics PHYS 420 Dr Richard H. Cyburt Assistant Professor of Physics My office: 402c in the Science Building My phone: (304) 384-6006 My email: rcyburt@concord.edu My webpage: www.concord.edu/rcyburt

More information

Generalized Additive Model

Generalized Additive Model Generalized Additive Model by Huimin Liu Department of Mathematics and Statistics University of Minnesota Duluth, Duluth, MN 55812 December 2008 Table of Contents Abstract... 2 Chapter 1 Introduction 1.1

More information

ME 261: Numerical Analysis Lecture-12: Numerical Interpolation

ME 261: Numerical Analysis Lecture-12: Numerical Interpolation 1 ME 261: Numerical Analysis Lecture-12: Numerical Interpolation Md. Tanver Hossain Department of Mechanical Engineering, BUET http://tantusher.buet.ac.bd 2 Inverse Interpolation Problem : Given a table

More information

Introduction to Functions

Introduction to Functions Introduction to Functions Motivation to use function notation For the line y = 2x, when x=1, y=2; when x=2, y=4; when x=3, y=6;... We can see the relationship: the y value is always twice of the x value.

More information

Doubly Cyclic Smoothing Splines and Analysis of Seasonal Daily Pattern of CO2 Concentration in Antarctica

Doubly Cyclic Smoothing Splines and Analysis of Seasonal Daily Pattern of CO2 Concentration in Antarctica Boston-Keio Workshop 2016. Doubly Cyclic Smoothing Splines and Analysis of Seasonal Daily Pattern of CO2 Concentration in Antarctica... Mihoko Minami Keio University, Japan August 15, 2016 Joint work with

More information

February 2017 (1/20) 2 Piecewise Polynomial Interpolation 2.2 (Natural) Cubic Splines. MA378/531 Numerical Analysis II ( NA2 )

February 2017 (1/20) 2 Piecewise Polynomial Interpolation 2.2 (Natural) Cubic Splines. MA378/531 Numerical Analysis II ( NA2 ) f f f f f (/2).9.8.7.6.5.4.3.2. S Knots.7.6.5.4.3.2. 5 5.2.8.6.4.2 S Knots.2 5 5.9.8.7.6.5.4.3.2..9.8.7.6.5.4.3.2. S Knots 5 5 S Knots 5 5 5 5.35.3.25.2.5..5 5 5.6.5.4.3.2. 5 5 4 x 3 3.5 3 2.5 2.5.5 5

More information

Justify all your answers and write down all important steps. Unsupported answers will be disregarded.

Justify all your answers and write down all important steps. Unsupported answers will be disregarded. Numerical Analysis FMN011 2017/05/30 The exam lasts 5 hours and has 15 questions. A minimum of 35 points out of the total 70 are required to get a passing grade. These points will be added to those you

More information

GAMs semi-parametric GLMs. Simon Wood Mathematical Sciences, University of Bath, U.K.

GAMs semi-parametric GLMs. Simon Wood Mathematical Sciences, University of Bath, U.K. GAMs semi-parametric GLMs Simon Wood Mathematical Sciences, University of Bath, U.K. Generalized linear models, GLM 1. A GLM models a univariate response, y i as g{e(y i )} = X i β where y i Exponential

More information

STAT 705 Introduction to generalized additive models

STAT 705 Introduction to generalized additive models STAT 705 Introduction to generalized additive models Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 22 Generalized additive models Consider a linear

More information

A comparison of spline methods in R for building explanatory models

A comparison of spline methods in R for building explanatory models A comparison of spline methods in R for building explanatory models Aris Perperoglou on behalf of TG2 STRATOS Initiative, University of Essex ISCB2017 Aris Perperoglou Spline Methods in R ISCB2017 1 /

More information

15.10 Curve Interpolation using Uniform Cubic B-Spline Curves. CS Dept, UK

15.10 Curve Interpolation using Uniform Cubic B-Spline Curves. CS Dept, UK 1 An analysis of the problem: To get the curve constructed, how many knots are needed? Consider the following case: So, to interpolate (n +1) data points, one needs (n +7) knots,, for a uniform cubic B-spline

More information

Mar. 20 Math 2335 sec 001 Spring 2014

Mar. 20 Math 2335 sec 001 Spring 2014 Mar. 20 Math 2335 sec 001 Spring 2014 Chebyshev Polynomials Definition: For an integer n 0 define the function ( ) T n (x) = cos n cos 1 (x), 1 x 1. It can be shown that T n is a polynomial of degree n.

More information

A review of spline function selection procedures in R

A review of spline function selection procedures in R Matthias Schmid Department of Medical Biometry, Informatics and Epidemiology University of Bonn joint work with Aris Perperoglou on behalf of TG2 of the STRATOS Initiative September 1, 2016 Introduction

More information

GENREG DID THAT? Clay Barker Research Statistician Developer JMP Division, SAS Institute

GENREG DID THAT? Clay Barker Research Statistician Developer JMP Division, SAS Institute GENREG DID THAT? Clay Barker Research Statistician Developer JMP Division, SAS Institute GENREG WHAT IS IT? The Generalized Regression platform was introduced in JMP Pro 11 and got much better in version

More information

Lecture 8. Divided Differences,Least-Squares Approximations. Ceng375 Numerical Computations at December 9, 2010

Lecture 8. Divided Differences,Least-Squares Approximations. Ceng375 Numerical Computations at December 9, 2010 Lecture 8, Ceng375 Numerical Computations at December 9, 2010 Computer Engineering Department Çankaya University 8.1 Contents 1 2 3 8.2 : These provide a more efficient way to construct an interpolating

More information

An Overview MARS is an adaptive procedure for regression, and is well suited for high dimensional problems.

An Overview MARS is an adaptive procedure for regression, and is well suited for high dimensional problems. Data mining and Machine Learning, Mar 24, 2008 MARS: Multivariate Adaptive Regression Splines (Friedman 1991, Friedman and Silverman 1989) An Overview MARS is an adaptive procedure for regression, and

More information

Interpolation & Polynomial Approximation. Cubic Spline Interpolation II

Interpolation & Polynomial Approximation. Cubic Spline Interpolation II Interpolation & Polynomial Approximation Cubic Spline Interpolation II Numerical Analysis (9th Edition) R L Burden & J D Faires Beamer Presentation Slides prepared by John Carroll Dublin City University

More information

Nonparametric Risk Attribution for Factor Models of Portfolios. October 3, 2017 Kellie Ottoboni

Nonparametric Risk Attribution for Factor Models of Portfolios. October 3, 2017 Kellie Ottoboni Nonparametric Risk Attribution for Factor Models of Portfolios October 3, 2017 Kellie Ottoboni Outline The problem Page 3 Additive model of returns Page 7 Euler s formula for risk decomposition Page 11

More information

Lecture 13: Model selection and regularization

Lecture 13: Model selection and regularization Lecture 13: Model selection and regularization Reading: Sections 6.1-6.2.1 STATS 202: Data mining and analysis October 23, 2017 1 / 17 What do we know so far In linear regression, adding predictors always

More information

Notes on point set topology, Fall 2010

Notes on point set topology, Fall 2010 Notes on point set topology, Fall 2010 Stephan Stolz September 3, 2010 Contents 1 Pointset Topology 1 1.1 Metric spaces and topological spaces...................... 1 1.2 Constructions with topological

More information

Section 2.3: Simple Linear Regression: Predictions and Inference

Section 2.3: Simple Linear Regression: Predictions and Inference Section 2.3: Simple Linear Regression: Predictions and Inference Jared S. Murray The University of Texas at Austin McCombs School of Business Suggested reading: OpenIntro Statistics, Chapter 7.4 1 Simple

More information

Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Knowledge Discovery and Data Mining Basis Functions Tom Kelsey School of Computer Science University of St Andrews http://www.cs.st-andrews.ac.uk/~tom/ tom@cs.st-andrews.ac.uk Tom Kelsey ID5059-02-BF 2015-02-04

More information

Interpolation by Spline Functions

Interpolation by Spline Functions Interpolation by Spline Functions Com S 477/577 Sep 0 007 High-degree polynomials tend to have large oscillations which are not the characteristics of the original data. To yield smooth interpolating curves

More information

Spline Curves. Spline Curves. Prof. Dr. Hans Hagen Algorithmic Geometry WS 2013/2014 1

Spline Curves. Spline Curves. Prof. Dr. Hans Hagen Algorithmic Geometry WS 2013/2014 1 Spline Curves Prof. Dr. Hans Hagen Algorithmic Geometry WS 2013/2014 1 Problem: In the previous chapter, we have seen that interpolating polynomials, especially those of high degree, tend to produce strong

More information

Chapter 3 Numerical Methods

Chapter 3 Numerical Methods Chapter 3 Numerical Methods Part 1 3.1 Linearization and Optimization of Functions of Vectors 1 Problem Notation 2 Outline 3.1.1 Linearization 3.1.2 Optimization of Objective Functions 3.1.3 Constrained

More information

See the course website for important information about collaboration and late policies, as well as where and when to turn in assignments.

See the course website for important information about collaboration and late policies, as well as where and when to turn in assignments. COS Homework # Due Tuesday, February rd See the course website for important information about collaboration and late policies, as well as where and when to turn in assignments. Data files The questions

More information

= f (a, b) + (hf x + kf y ) (a,b) +

= f (a, b) + (hf x + kf y ) (a,b) + Chapter 14 Multiple Integrals 1 Double Integrals, Iterated Integrals, Cross-sections 2 Double Integrals over more general regions, Definition, Evaluation of Double Integrals, Properties of Double Integrals

More information

Cubic Splines and Matlab

Cubic Splines and Matlab Cubic Splines and Matlab October 7, 2006 1 Introduction In this section, we introduce the concept of the cubic spline, and how they are implemented in Matlab. Of particular importance are the new Matlab

More information

Linear Methods for Regression and Shrinkage Methods

Linear Methods for Regression and Shrinkage Methods Linear Methods for Regression and Shrinkage Methods Reference: The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, J. Friedman, Springer 1 Linear Regression Models Least Squares Input vectors

More information

Overfitting. Machine Learning CSE546 Carlos Guestrin University of Washington. October 2, Bias-Variance Tradeoff

Overfitting. Machine Learning CSE546 Carlos Guestrin University of Washington. October 2, Bias-Variance Tradeoff Overfitting Machine Learning CSE546 Carlos Guestrin University of Washington October 2, 2013 1 Bias-Variance Tradeoff Choice of hypothesis class introduces learning bias More complex class less bias More

More information

Model selection and validation 1: Cross-validation

Model selection and validation 1: Cross-validation Model selection and validation 1: Cross-validation Ryan Tibshirani Data Mining: 36-462/36-662 March 26 2013 Optional reading: ISL 2.2, 5.1, ESL 7.4, 7.10 1 Reminder: modern regression techniques Over the

More information

I How does the formulation (5) serve the purpose of the composite parameterization

I How does the formulation (5) serve the purpose of the composite parameterization Supplemental Material to Identifying Alzheimer s Disease-Related Brain Regions from Multi-Modality Neuroimaging Data using Sparse Composite Linear Discrimination Analysis I How does the formulation (5)

More information

Nonparametric regression using kernel and spline methods

Nonparametric regression using kernel and spline methods Nonparametric regression using kernel and spline methods Jean D. Opsomer F. Jay Breidt March 3, 016 1 The statistical model When applying nonparametric regression methods, the researcher is interested

More information

Piecewise polynomial interpolation

Piecewise polynomial interpolation Chapter 2 Piecewise polynomial interpolation In ection.6., and in Lab, we learned that it is not a good idea to interpolate unctions by a highorder polynomials at equally spaced points. However, it transpires

More information

B-Spline Polynomials. B-Spline Polynomials. Uniform Cubic B-Spline Curves CS 460. Computer Graphics

B-Spline Polynomials. B-Spline Polynomials. Uniform Cubic B-Spline Curves CS 460. Computer Graphics CS 460 B-Spline Polynomials Computer Graphics Professor Richard Eckert March 24, 2004 B-Spline Polynomials Want local control Smoother curves B-spline curves: Segmented approximating curve 4 control points

More information

Cross-validation and the Bootstrap

Cross-validation and the Bootstrap Cross-validation and the Bootstrap In the section we discuss two resampling methods: cross-validation and the bootstrap. 1/44 Cross-validation and the Bootstrap In the section we discuss two resampling

More information

8 Piecewise Polynomial Interpolation

8 Piecewise Polynomial Interpolation Applied Math Notes by R. J. LeVeque 8 Piecewise Polynomial Interpolation 8. Pitfalls of high order interpolation Suppose we know the value of a function at several points on an interval and we wish to

More information

Lecture 7: Splines and Generalized Additive Models

Lecture 7: Splines and Generalized Additive Models Lecture 7: and Generalized Additive Models Computational Statistics Thierry Denœux April, 2016 Introduction Overview Introduction Simple approaches Polynomials Step functions Regression splines Natural

More information

Topics in Machine Learning

Topics in Machine Learning Topics in Machine Learning Gilad Lerman School of Mathematics University of Minnesota Text/slides stolen from G. James, D. Witten, T. Hastie, R. Tibshirani and A. Ng Machine Learning - Motivation Arthur

More information

Lecture 24: Generalized Additive Models Stat 704: Data Analysis I, Fall 2010

Lecture 24: Generalized Additive Models Stat 704: Data Analysis I, Fall 2010 Lecture 24: Generalized Additive Models Stat 704: Data Analysis I, Fall 2010 Tim Hanson, Ph.D. University of South Carolina T. Hanson (USC) Stat 704: Data Analysis I, Fall 2010 1 / 26 Additive predictors

More information

Polynomials tend to oscillate (wiggle) a lot, even when our true function does not.

Polynomials tend to oscillate (wiggle) a lot, even when our true function does not. AMSC/CMSC 460 Computational Methods, Fall 2007 UNIT 2: Spline Approximations Dianne P O Leary c 2001, 2002, 2007 Piecewise polynomial interpolation Piecewise polynomial interpolation Read: Chapter 3 Skip:

More information

davidr Cornell University

davidr Cornell University 1 NONPARAMETRIC RANDOM EFFECTS MODELS AND LIKELIHOOD RATIO TESTS Oct 11, 2002 David Ruppert Cornell University www.orie.cornell.edu/ davidr (These transparencies and preprints available link to Recent

More information

Objects 2: Curves & Splines Christian Miller CS Fall 2011

Objects 2: Curves & Splines Christian Miller CS Fall 2011 Objects 2: Curves & Splines Christian Miller CS 354 - Fall 2011 Parametric curves Curves that are defined by an equation and a parameter t Usually t [0, 1], and curve is finite Can be discretized at arbitrary

More information

NAG Library Function Document nag_smooth_spline_fit (g10abc)

NAG Library Function Document nag_smooth_spline_fit (g10abc) 1 Purpose NAG Library Function Document nag_smooth_spline_fit () nag_smooth_spline_fit () fits a cubic smoothing spline for a given smoothing parameter. 2 Specification #include #include

More information

Nonparametric Regression

Nonparametric Regression Nonparametric Regression John Fox Department of Sociology McMaster University 1280 Main Street West Hamilton, Ontario Canada L8S 4M4 jfox@mcmaster.ca February 2004 Abstract Nonparametric regression analysis

More information

A toolbox of smooths. Simon Wood Mathematical Sciences, University of Bath, U.K.

A toolbox of smooths. Simon Wood Mathematical Sciences, University of Bath, U.K. A toolbo of smooths Simon Wood Mathematical Sciences, University of Bath, U.K. Smooths for semi-parametric GLMs To build adequate semi-parametric GLMs requires that we use functions with appropriate properties.

More information

Soft Threshold Estimation for Varying{coecient Models 2 ations of certain basis functions (e.g. wavelets). These functions are assumed to be smooth an

Soft Threshold Estimation for Varying{coecient Models 2 ations of certain basis functions (e.g. wavelets). These functions are assumed to be smooth an Soft Threshold Estimation for Varying{coecient Models Artur Klinger, Universitat Munchen ABSTRACT: An alternative penalized likelihood estimator for varying{coecient regression in generalized linear models

More information

Package orthogonalsplinebasis

Package orthogonalsplinebasis Type Package Package orthogonalsplinebasis Title Orthogonal B-Spline Basis Functions Version 0.1.6 Date 2015-03-30 Author Andrew Redd Depends methods, stats, graphics March 31, 2015 Maintainer Andrew Redd

More information

Preface to the Second Edition. Preface to the First Edition. 1 Introduction 1

Preface to the Second Edition. Preface to the First Edition. 1 Introduction 1 Preface to the Second Edition Preface to the First Edition vii xi 1 Introduction 1 2 Overview of Supervised Learning 9 2.1 Introduction... 9 2.2 Variable Types and Terminology... 9 2.3 Two Simple Approaches

More information

2 Regression Splines and Regression Smoothers

2 Regression Splines and Regression Smoothers 2 Regression Splines and Regression Smoothers 2.1 Introduction This chapter launches a more detailed examination of statistical learning within a regression framework. Once again, the focus is on conditional

More information

Multivariable Regression Modelling

Multivariable Regression Modelling Multivariable Regression Modelling A review of available spline packages in R. Aris Perperoglou for TG2 ISCB 2015 Aris Perperoglou for TG2 Multivariable Regression Modelling ISCB 2015 1 / 41 TG2 Members

More information

8.1 Smoothing by Penalizing Curve Flexibility

8.1 Smoothing by Penalizing Curve Flexibility Copyright c Cosma Rohilla Shalizi; do not distribute without permission updates at http://www.stat.cmu.edu/~cshalizi/adafaepov/ Chapter 8 Splines 8.1 Smoothing by Penalizing Curve Flexibility Let s go

More information

A MATRIX FORMULATION OF THE CUBIC BÉZIER CURVE

A MATRIX FORMULATION OF THE CUBIC BÉZIER CURVE Geometric Modeling Notes A MATRIX FORMULATION OF THE CUBIC BÉZIER CURVE Kenneth I. Joy Institute for Data Analysis and Visualization Department of Computer Science University of California, Davis Overview

More information

Topics in Machine Learning-EE 5359 Model Assessment and Selection

Topics in Machine Learning-EE 5359 Model Assessment and Selection Topics in Machine Learning-EE 5359 Model Assessment and Selection Ioannis D. Schizas Electrical Engineering Department University of Texas at Arlington 1 Training and Generalization Training stage: Utilizing

More information

NAG Library Function Document nag_smooth_spline_fit (g10abc)

NAG Library Function Document nag_smooth_spline_fit (g10abc) g10 Smoothing in Statistics g10abc NAG Library Function Document nag_smooth_spline_fit (g10abc) 1 Purpose nag_smooth_spline_fit (g10abc) fits a cubic smoothing spline for a given smoothing parameter. 2

More information

Cubic spline interpolation

Cubic spline interpolation Cubic spline interpolation In the following, we want to derive the collocation matrix for cubic spline interpolation. Let us assume that we have equidistant knots. To fulfill the Schoenberg-Whitney condition

More information

Linear Model Selection and Regularization. especially usefull in high dimensions p>>100.

Linear Model Selection and Regularization. especially usefull in high dimensions p>>100. Linear Model Selection and Regularization especially usefull in high dimensions p>>100. 1 Why Linear Model Regularization? Linear models are simple, BUT consider p>>n, we have more features than data records

More information

The pspline Package. August 4, Author S original by Jim Ramsey R port by Brian Ripley

The pspline Package. August 4, Author S original by Jim Ramsey R port by Brian Ripley The pspline Package August 4, 2004 Version 1.0-8 Date 2004-08-04 Title Penalized Smoothing Splines Author S original by Jim Ramsey . R port by Brian Ripley .

More information

Numerical Methods 5633

Numerical Methods 5633 Numerical Methods 5633 Lecture 3 Marina Krstic Marinkovic mmarina@maths.tcd.ie School of Mathematics Trinity College Dublin Marina Krstic Marinkovic 1 / 15 5633-Numerical Methods Organisational Assignment

More information

Asymptotic Error Analysis

Asymptotic Error Analysis Asymptotic Error Analysis Brian Wetton Mathematics Department, UBC www.math.ubc.ca/ wetton PIMS YRC, June 3, 2014 Outline Overview Some History Romberg Integration Cubic Splines - Periodic Case More History:

More information

Curves and Surfaces. CS475 / 675, Fall Siddhartha Chaudhuri

Curves and Surfaces. CS475 / 675, Fall Siddhartha Chaudhuri Curves and Surfaces CS475 / 675, Fall 26 Siddhartha Chaudhuri Klein bottle: surface, no edges (Möbius strip: Inductiveload@Wikipedia) Möbius strip: surface, edge Curves and Surfaces Curve: D set Surface:

More information

CSE446: Linear Regression. Spring 2017

CSE446: Linear Regression. Spring 2017 CSE446: Linear Regression Spring 2017 Ali Farhadi Slides adapted from Carlos Guestrin and Luke Zettlemoyer Prediction of continuous variables Billionaire says: Wait, that s not what I meant! You say: Chill

More information

5.5 Regression Estimation

5.5 Regression Estimation 5.5 Regression Estimation Assume a SRS of n pairs (x, y ),..., (x n, y n ) is selected from a population of N pairs of (x, y) data. The goal of regression estimation is to take advantage of a linear relationship

More information

Important Properties of B-spline Basis Functions

Important Properties of B-spline Basis Functions Important Properties of B-spline Basis Functions P2.1 N i,p (u) = 0 if u is outside the interval [u i, u i+p+1 ) (local support property). For example, note that N 1,3 is a combination of N 1,0, N 2,0,

More information

Fall CSCI 420: Computer Graphics. 4.2 Splines. Hao Li.

Fall CSCI 420: Computer Graphics. 4.2 Splines. Hao Li. Fall 2014 CSCI 420: Computer Graphics 4.2 Splines Hao Li http://cs420.hao-li.com 1 Roller coaster Next programming assignment involves creating a 3D roller coaster animation We must model the 3D curve

More information

Rational Bezier Surface

Rational Bezier Surface Rational Bezier Surface The perspective projection of a 4-dimensional polynomial Bezier surface, S w n ( u, v) B i n i 0 m j 0, u ( ) B j m, v ( ) P w ij ME525x NURBS Curve and Surface Modeling Page 97

More information

GAMs, GAMMs and other penalized GLMs using mgcv in R. Simon Wood Mathematical Sciences, University of Bath, U.K.

GAMs, GAMMs and other penalized GLMs using mgcv in R. Simon Wood Mathematical Sciences, University of Bath, U.K. GAMs, GAMMs and other penalied GLMs using mgcv in R Simon Wood Mathematical Sciences, University of Bath, U.K. Simple eample Consider a very simple dataset relating the timber volume of cherry trees to

More information

In what follows, we will focus on Voronoi diagrams in Euclidean space. Later, we will generalize to other distance spaces.

In what follows, we will focus on Voronoi diagrams in Euclidean space. Later, we will generalize to other distance spaces. Voronoi Diagrams 4 A city builds a set of post offices, and now needs to determine which houses will be served by which office. It would be wasteful for a postman to go out of their way to make a delivery

More information

Nonparametric Approaches to Regression

Nonparametric Approaches to Regression Nonparametric Approaches to Regression In traditional nonparametric regression, we assume very little about the functional form of the mean response function. In particular, we assume the model where m(xi)

More information

3 Nonlinear Regression

3 Nonlinear Regression CSC 4 / CSC D / CSC C 3 Sometimes linear models are not sufficient to capture the real-world phenomena, and thus nonlinear models are necessary. In regression, all such models will have the same basic

More information

6 Model selection and kernels

6 Model selection and kernels 6. Bias-Variance Dilemma Esercizio 6. While you fit a Linear Model to your data set. You are thinking about changing the Linear Model to a Quadratic one (i.e., a Linear Model with quadratic features φ(x)

More information

Quadratic and cubic b-splines by generalizing higher-order voronoi diagrams

Quadratic and cubic b-splines by generalizing higher-order voronoi diagrams Quadratic and cubic b-splines by generalizing higher-order voronoi diagrams Yuanxin Liu and Jack Snoeyink Joshua Levine April 18, 2007 Computer Science and Engineering, The Ohio State University 1 / 24

More information

Hw 4 Due Feb 22. D(fg) x y z (

Hw 4 Due Feb 22. D(fg) x y z ( Hw 4 Due Feb 22 2.2 Exercise 7,8,10,12,15,18,28,35,36,46 2.3 Exercise 3,11,39,40,47(b) 2.4 Exercise 6,7 Use both the direct method and product rule to calculate where f(x, y, z) = 3x, g(x, y, z) = ( 1

More information

Interior Point I. Lab 21. Introduction

Interior Point I. Lab 21. Introduction Lab 21 Interior Point I Lab Objective: For decades after its invention, the Simplex algorithm was the only competitive method for linear programming. The past 30 years, however, have seen the discovery

More information

Linear Penalized Spline Model Estimation Using Ranked Set Sampling Technique

Linear Penalized Spline Model Estimation Using Ranked Set Sampling Technique Linear Penalized Spline Model Estimation Using Ranked Set Sampling Technique Al Kadiri M. A. Abstract Benefits of using Ranked Set Sampling (RSS) rather than Simple Random Sampling (SRS) are indeed significant

More information

Assignment 4 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran

Assignment 4 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran Assignment 4 (Sol.) Introduction to Data Analytics Prof. andan Sudarsanam & Prof. B. Ravindran 1. Which among the following techniques can be used to aid decision making when those decisions depend upon

More information

Statistical Modeling with Spline Functions Methodology and Theory

Statistical Modeling with Spline Functions Methodology and Theory This is page 1 Printer: Opaque this Statistical Modeling with Spline Functions Methodology and Theory Mark H Hansen University of California at Los Angeles Jianhua Z Huang University of Pennsylvania Charles

More information

Lecture 03 Linear classification methods II

Lecture 03 Linear classification methods II Lecture 03 Linear classification methods II 25 January 2016 Taylor B. Arnold Yale Statistics STAT 365/665 1/35 Office hours Taylor Arnold Mondays, 13:00-14:15, HH 24, Office 203 (by appointment) Yu Lu

More information

Stat 8053, Fall 2013: Additive Models

Stat 8053, Fall 2013: Additive Models Stat 853, Fall 213: Additive Models We will only use the package mgcv for fitting additive and later generalized additive models. The best reference is S. N. Wood (26), Generalized Additive Models, An

More information

MEI Desmos Tasks for AS Pure

MEI Desmos Tasks for AS Pure Task 1: Coordinate Geometry Intersection of a line and a curve 1. Add a quadratic curve, e.g. y = x² 4x + 1 2. Add a line, e.g. y = x 3 3. Select the points of intersection of the line and the curve. What

More information

Goals of the Lecture. SOC6078 Advanced Statistics: 9. Generalized Additive Models. Limitations of the Multiple Nonparametric Models (2)

Goals of the Lecture. SOC6078 Advanced Statistics: 9. Generalized Additive Models. Limitations of the Multiple Nonparametric Models (2) SOC6078 Advanced Statistics: 9. Generalized Additive Models Robert Andersen Department of Sociology University of Toronto Goals of the Lecture Introduce Additive Models Explain how they extend from simple

More information

Lecture VIII. Global Approximation Methods: I

Lecture VIII. Global Approximation Methods: I Lecture VIII Global Approximation Methods: I Gianluca Violante New York University Quantitative Macroeconomics G. Violante, Global Methods p. 1 /29 Global function approximation Global methods: function

More information

Generalized additive models I

Generalized additive models I I Patrick Breheny October 6 Patrick Breheny BST 764: Applied Statistical Modeling 1/18 Introduction Thus far, we have discussed nonparametric regression involving a single covariate In practice, we often

More information

Remark. Jacobs University Visualization and Computer Graphics Lab : ESM4A - Numerical Methods 331

Remark. Jacobs University Visualization and Computer Graphics Lab : ESM4A - Numerical Methods 331 Remark Reconsidering the motivating example, we observe that the derivatives are typically not given by the problem specification. However, they can be estimated in a pre-processing step. A good estimate

More information