Moving Beyond Linearity


The truth is never linear! Or almost never! But often the linearity assumption is good enough. When it's not... polynomials, step functions, splines, local regression, and generalized additive models offer a lot of flexibility, without losing the ease and interpretability of linear models.

Polynomial Regression

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3 + \cdots + \beta_d x_i^d + \epsilon_i$$

[Figure: degree-4 polynomial fit on the Wage data. Left panel: Wage vs. Age; right panel: Pr(Wage > 250 | Age).]

Details

Create new variables $X_1 = X$, $X_2 = X^2$, etc., and then treat as multiple linear regression.

Not really interested in the coefficients; more interested in the fitted function values at any value $x_0$:
$$\hat f(x_0) = \hat\beta_0 + \hat\beta_1 x_0 + \hat\beta_2 x_0^2 + \hat\beta_3 x_0^3 + \hat\beta_4 x_0^4.$$

Since $\hat f(x_0)$ is a linear function of the $\hat\beta_\ell$, we can get a simple expression for the pointwise variance $\operatorname{Var}[\hat f(x_0)]$ at any value $x_0$. In the figure we have computed the fit and pointwise standard errors on a grid of values for $x_0$. We show $\hat f(x_0) \pm 2\,\operatorname{se}[\hat f(x_0)]$.

We either fix the degree $d$ at some reasonably low value, or else use cross-validation to choose $d$.
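
As a quick illustration, here is a minimal R sketch of this workflow, assuming the Wage data from the ISLR package; the fit and the plus/minus 2 SE band are computed on a grid of age values:

library(ISLR)  # provides the Wage data (wage, age)

# Degree-4 polynomial regression; poly() builds the basis for us
fit <- lm(wage ~ poly(age, 4), data = Wage)

# Fitted values and pointwise standard errors on a grid of ages
age.grid <- seq(from = min(Wage$age), to = max(Wage$age))
pred <- predict(fit, newdata = list(age = age.grid), se = TRUE)

# f_hat(x0) +/- 2 * se[f_hat(x0)]
se.bands <- cbind(pred$fit - 2 * pred$se.fit, pred$fit + 2 * pred$se.fit)

plot(Wage$age, Wage$wage, col = "darkgrey")
lines(age.grid, pred$fit, lwd = 2)
matlines(age.grid, se.bands, lty = 2)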

Details continued

Logistic regression follows naturally. For example, in the figure we model
$$\Pr(y_i > 250 \mid x_i) = \frac{\exp(\beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_d x_i^d)}{1 + \exp(\beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_d x_i^d)}.$$

To get confidence intervals, compute upper and lower bounds on the logit scale, and then invert to get them on the probability scale.

Can do separately on several variables: just stack the variables into one matrix, and separate out the pieces afterwards (see GAMs later).

Caveat: polynomials have notorious tail behavior, which is very bad for extrapolation.

Can fit using y ~ poly(x, degree = 3) in formula.
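
A minimal sketch of the corresponding logistic fit in R (again assuming the ISLR Wage data), with confidence bounds computed on the logit scale and then inverted:

library(ISLR)

fit <- glm(I(wage > 250) ~ poly(age, 4), data = Wage, family = binomial)

age.grid <- seq(from = min(Wage$age), to = max(Wage$age))
pred <- predict(fit, newdata = list(age = age.grid), se = TRUE)  # logit scale by default

# Bounds on the logit scale, then invert with the logistic function
logit.bands <- cbind(pred$fit - 2 * pred$se.fit, pred$fit + 2 * pred$se.fit)
prob.fit <- plogis(pred$fit)      # exp(z) / (1 + exp(z))
prob.bands <- plogis(logit.bands)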

Bad tail behavior for polynomial regression

Consider the function known as Runge's example (left panel),
$$f(x) = \frac{1}{1 + x^2},$$
interpolated using a 15th-degree polynomial (right panel). It turns out that high-order interpolation using a global polynomial is often dangerous.
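
A small R sketch reproducing the phenomenon; evaluating on [-5, 5] is my choice, made so the boundary oscillations are easy to see:

runge <- function(x) 1 / (1 + x^2)

# Interpolate at 16 equally spaced points with a degree-15 polynomial
x <- seq(-5, 5, length.out = 16)
y <- runge(x)
fit <- lm(y ~ poly(x, 15))  # orthogonal polynomial basis for numerical stability

# Compare the true function and the interpolant on a fine grid
grid <- seq(-5, 5, length.out = 501)
yhat <- predict(fit, newdata = data.frame(x = grid))
plot(grid, runge(grid), type = "l", ylim = range(yhat))
lines(grid, yhat, lty = 2)  # large oscillations appear near the boundary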

Step Functions

Another way of creating transformations of a variable: cut the variable into distinct regions.
$$C_1(X) = I(X < 35), \quad C_2(X) = I(35 \le X < 50), \quad \ldots, \quad C_3(X) = I(X \ge 65)$$

[Figure: piecewise-constant fit. Left panel: Wage vs. Age; right panel: Pr(Wage > 250 | Age).]

Step functions

Notice that for any value of $X$, $C_0(X) + C_1(X) + \cdots + C_K(X) = 1$, since $X$ must be in exactly one of the $K + 1$ intervals.

We then use least squares to fit a linear model using $C_1(X), C_2(X), \ldots, C_K(X)$ as predictors:
$$y_i = \beta_0 + \beta_1 C_1(x_i) + \cdots + \beta_K C_K(x_i) + \epsilon_i$$

For logistic regression,
$$\Pr(y_i > 250 \mid x_i) = \frac{\exp(\beta_0 + \beta_1 C_1(x_i) + \cdots + \beta_K C_K(x_i))}{1 + \exp(\beta_0 + \beta_1 C_1(x_i) + \cdots + \beta_K C_K(x_i))}$$

Step functions continued

Easy to work with. Creates a series of dummy variables representing each group.

Useful way of creating interactions that are easy to interpret. For example, the interaction effect of Year and Age,
$$I(\text{Year} < 2005) \cdot \text{Age}, \qquad I(\text{Year} \ge 2005) \cdot \text{Age},$$
would allow for different linear functions in each age category.

In R: I(year < 2005) or cut(age, c(18, 25, 40, 65, 90)).

Choice of cutpoints or knots can be problematic. For creating nonlinearities, smoother alternatives such as splines are available.
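
A minimal sketch of a step-function fit in R, assuming the ISLR Wage data and using the cutpoints from the slide:

library(ISLR)

# cut() bins age into intervals; lm() expands the factor into dummy variables
fit <- lm(wage ~ cut(age, c(18, 25, 40, 65, 90), include.lowest = TRUE),
          data = Wage)
summary(fit)  # one coefficient per interval above the baseline group

# An interpretable interaction: separate Age slopes before and after 2005
fit2 <- lm(wage ~ I(year < 2005) * age, data = Wage)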

Basis functions

Polynomial and piecewise-constant regression models are in fact special cases of a basis function approach:
$$y_i = \beta_0 + \beta_1 b_1(x_i) + \beta_2 b_2(x_i) + \cdots + \beta_K b_K(x_i) + \epsilon_i$$
where the basis functions $b_1(\cdot), b_2(\cdot), \ldots, b_K(\cdot)$ are fixed and known.

For polynomial regression, $b_j(x_i) = x_i^j$. For piecewise-constant functions, $b_j(x_i) = I(c_j \le x_i < c_{j+1})$.

Moving Beyond Linearity

Piecewise Polynomials

Instead of a single polynomial in $X$ over its whole domain, we can rather use different polynomials in regions defined by knots. E.g. (see figure)
$$y_i = \begin{cases} \beta_{01} + \beta_{11} x_i + \beta_{21} x_i^2 + \beta_{31} x_i^3 + \epsilon_i & \text{if } x_i < c; \\ \beta_{02} + \beta_{12} x_i + \beta_{22} x_i^2 + \beta_{32} x_i^3 + \epsilon_i & \text{if } x_i \ge c. \end{cases}$$

Better to add constraints to the polynomials, e.g. continuity. Splines have the maximum amount of continuity.

[Figure: Wage vs. Age with four fits: piecewise cubic, continuous piecewise cubic, cubic spline, and linear spline.]

Linear Splines

A linear spline with knots at $\xi_k$, $k = 1, \ldots, K$, is a piecewise linear polynomial that is continuous at each knot. We can represent this model as
$$y_i = \beta_0 + \beta_1 b_1(x_i) + \beta_2 b_2(x_i) + \cdots + \beta_{K+1} b_{K+1}(x_i) + \epsilon_i,$$
where the $b_k$ are basis functions:
$$b_1(x_i) = x_i, \qquad b_{k+1}(x_i) = (x_i - \xi_k)_+, \quad k = 1, \ldots, K.$$
Here $(\cdot)_+$ denotes the positive part; i.e.
$$(x_i - \xi_k)_+ = \begin{cases} x_i - \xi_k & \text{if } x_i > \xi_k \\ 0 & \text{otherwise.} \end{cases}$$

[Figure: a linear spline f(x) and one of its truncated-line basis functions b(x).]

Cubic Splines

A cubic spline with knots at $\xi_k$, $k = 1, \ldots, K$, is a piecewise cubic polynomial with continuous derivatives up to order 2 at each knot. Again we can represent this model with truncated power basis functions:
$$y_i = \beta_0 + \beta_1 b_1(x_i) + \beta_2 b_2(x_i) + \cdots + \beta_{K+3} b_{K+3}(x_i) + \epsilon_i,$$
$$b_1(x_i) = x_i, \quad b_2(x_i) = x_i^2, \quad b_3(x_i) = x_i^3, \quad b_{k+3}(x_i) = (x_i - \xi_k)_+^3, \quad k = 1, \ldots, K,$$
where
$$(x_i - \xi_k)_+^3 = \begin{cases} (x_i - \xi_k)^3 & \text{if } x_i > \xi_k \\ 0 & \text{otherwise.} \end{cases}$$
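
To make the basis concrete, here is a small R sketch that fits a cubic spline by building the truncated power basis by hand, assuming the ISLR Wage data; the knot locations at ages 25, 40, and 60 are illustrative choices of mine:

library(ISLR)

# Truncated cubic basis function: (x - knot)^3 for x > knot, else 0
tpb <- function(x, knot) pmax(x - knot, 0)^3

knots <- c(25, 40, 60)
X <- with(Wage, cbind(age, age^2, age^3,
                      sapply(knots, function(k) tpb(age, k))))
fit <- lm(wage ~ X, data = Wage)  # K + 4 = 7 coefficients, counting the intercept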

[Figure: a cubic spline f(x) and one of its truncated cubic basis functions b(x).]

More for spline

Simple linear model: $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$, with design matrix $X$ whose $i$th row is $(1, x_i)$.

Quadratic model (polynomial): $y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \epsilon_i$, with design matrix $X$ whose $i$th row is $(1, x_i, x_i^2)$.


More for spline

Broken stick model: $y_i = \beta_0 + \beta_1 x_i + \beta_{11}(x_i - 0.6)_+ + \epsilon_i$

Whip model: $y_i = \beta_0 + \beta_1 x_i + \beta_{11}(x_i - 0.5)_+ + \beta_{12}(x_i - 0.55)_+ + \cdots + \beta_{1k}(x_i - 0.95)_+ + \epsilon_i$, with one truncated-line column per knot in the design matrix $X$.

Spline model for $f$:
$$f(x) = \beta_0 + \beta_1 x + \sum_{k=1}^{K} b_k (x - \xi_k)_+$$

Splines

[Figures: examples of linear splines, cubic splines, and B-spline basis functions.]

Natural cubic spline

X consists of 50 points drawn at random from U[0, 1], with an assumed error model with constant variance. The linear and cubic polynomial fits have two and four degrees of freedom, respectively, while the cubic spline and natural cubic spline each have six degrees of freedom. The cubic spline has two knots at 0.33 and 0.66, while the natural spline has boundary knots at 0.1 and 0.9, and four interior knots uniformly spaced between them.

Natural Cubic Splines

A natural cubic spline extrapolates linearly beyond the boundary knots. This adds $4 = 2 \times 2$ extra constraints, and allows us to put more internal knots for the same degrees of freedom as a regular cubic spline.

[Figure: Wage vs. Age, natural cubic spline vs. cubic spline.]

Fitting splines in R is easy: bs(x, ...) for splines of any degree, and ns(x, ...) for natural cubic splines, in package splines.

[Figure: natural cubic spline fits. Left panel: Wage vs. Age; right panel: Pr(Wage > 250 | Age).]
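
A minimal sketch, assuming the ISLR Wage data; the knot locations and df value are illustrative:

library(ISLR)
library(splines)

# Cubic spline with three interior knots
fit.bs <- lm(wage ~ bs(age, knots = c(25, 40, 60)), data = Wage)

# Natural cubic spline with complexity specified via df instead of knots
fit.ns <- lm(wage ~ ns(age, df = 4), data = Wage)

age.grid <- seq(min(Wage$age), max(Wage$age))
pred <- predict(fit.ns, newdata = list(age = age.grid), se = TRUE)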

Knot placement

One strategy is to decide $K$, the number of knots, and then place them at appropriate quantiles of the observed $X$.

A cubic spline with $K$ knots has $K + 4$ parameters or degrees of freedom. A natural spline with $K$ knots has $K$ degrees of freedom.

[Figure: comparison of a degree-14 polynomial and a natural cubic spline, each with 15 df: ns(age, df = 14) vs. poly(age, deg = 14).]

Smoothing spline

In the last section, we created splines by specifying a set of knots, producing a sequence of basis functions, and then using least squares to estimate the spline coefficients. Now we introduce a somewhat different approach: the smoothing spline.

In fitting a smooth curve to a set of data, we want to find a function $g(x)$ that fits the observed data well; that is, we want $\mathrm{RSS} = \sum_{i=1}^n (y_i - g(x_i))^2$ to be small. However, if we don't put any constraints on $g$, then we can make RSS zero by choosing $g$ to interpolate all of the $y_i$: overfit, too rough. How might we ensure that $g$ is smooth?

Smoothing Splines

This section is a little bit mathematical. Consider this criterion for fitting a smooth function $g(x)$ to some data:
$$\underset{g \in \mathcal{S}}{\text{minimize}} \;\; \sum_{i=1}^n (y_i - g(x_i))^2 + \lambda \int g''(t)^2 \, dt$$

The first term is RSS, and tries to make $g(x)$ match the data at each $x_i$.

The second term is a roughness penalty and controls how wiggly $g(x)$ is. It is modulated by the tuning parameter $\lambda \ge 0$.

The smaller $\lambda$, the more wiggly the function, eventually interpolating the $y_i$ when $\lambda = 0$.

As $\lambda \to \infty$, the function $g(x)$ becomes linear.

Smoothing Splines continued

The solution is a natural cubic spline, with a knot at every unique value of $x_i$. The roughness penalty still controls the roughness via $\lambda$.

Some details:

Smoothing splines avoid the knot-selection issue, leaving a single $\lambda$ to be chosen.

The algorithmic details are too complex to describe here. In R, the function smooth.spline() will fit a smoothing spline.

The vector of $n$ fitted values can be written as $\hat{g}_\lambda = S_\lambda y$, where $S_\lambda$ is an $n \times n$ matrix (determined by the $x_i$ and $\lambda$). The effective degrees of freedom are given by
$$\mathrm{df}_\lambda = \sum_{i=1}^n \{S_\lambda\}_{ii}.$$

Smoothing spline

It can be shown that the smoothing-spline criterion has an explicit, finite-dimensional, unique minimizer which is a natural cubic spline with knots at the unique values of the $x_i$, $i = 1, \ldots, N$. We can write $f(x)$ as
$$f(x) = \sum_{j=1}^N N_j(x)\, \theta_j,$$
where the $N_j(x)$ are an $N$-dimensional set of basis functions. The criterion thus reduces to
$$\mathrm{RSS}(\theta, \lambda) = (y - N\theta)^T (y - N\theta) + \lambda\, \theta^T \Omega_N \theta,$$
where $\{N\}_{ij} = N_j(x_i)$ and $\{\Omega_N\}_{jk} = \int N_j''(t)\, N_k''(t)\, dt$, with solution $\hat\theta = (N^T N + \lambda \Omega_N)^{-1} N^T y$.


Smoothing spline

Reinsch form:
$$\hat{y} = N(N^T N + \lambda \Omega_N)^{-1} N^T y = (I + \lambda N^{-T} \Omega_N N^{-1})^{-1} y = (I + \lambda K)^{-1} y \qquad (6)$$
where $K = N^{-T} \Omega_N N^{-1}$ is known as the penalty matrix.

Smoothing spline

The eigendecomposition of $S_\lambda = (I + \lambda K)^{-1}$ is
$$S_\lambda = \sum_{k=1}^N \rho_k(\lambda)\, u_k u_k^T, \qquad \rho_k(\lambda) = \frac{1}{1 + \lambda d_k},$$
with $d_k$ the corresponding eigenvalue of $K$.


Smoothing Splines continued: choosing λ

We can specify df rather than λ! In R: smooth.spline(age, wage, df = 10).

The leave-one-out (LOO) cross-validated error is given by
$$\mathrm{RSS}_{cv}(\lambda) = \sum_{i=1}^n \big(y_i - \hat{g}_\lambda^{(-i)}(x_i)\big)^2 = \sum_{i=1}^n \left[ \frac{y_i - \hat{g}_\lambda(x_i)}{1 - \{S_\lambda\}_{ii}} \right]^2.$$

In R: smooth.spline(age, wage).
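
A minimal sketch, assuming the ISLR Wage data; cv = TRUE requests ordinary leave-one-out CV rather than smooth.spline's GCV default:

library(ISLR)

fit.df <- smooth.spline(Wage$age, Wage$wage, df = 10)    # fix the effective df
fit.cv <- smooth.spline(Wage$age, Wage$wage, cv = TRUE)  # choose lambda by LOO CV
fit.cv$df  # effective degrees of freedom selected by CV (about 6.8 here)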

[Figure: smoothing spline fits to Wage vs. Age, one at a fixed number of degrees of freedom and one with 6.8 degrees of freedom chosen by LOO CV.]

Local Regression

With a sliding weight function, we fit separate linear fits over the range of $X$ by weighted least squares. See the text for more details, and the loess() function in R.


Local regression

Locally weighted regression solves a separate weighted least squares problem at each target point $x_0$:
$$\min_{\alpha(x_0),\, \beta(x_0)} \; \sum_{i=1}^N K_\lambda(x_0, x_i)\, [y_i - \alpha(x_0) - \beta(x_0)\, x_i]^2.$$
The estimate is then $\hat f(x_0) = \hat\alpha(x_0) + \hat\beta(x_0)\, x_0$.

Define the vector function $b(x)^T = (1, x)$, let $B$ be the $N \times 2$ regression matrix with $i$th row $b(x_i)^T$, and let $W(x_0)$ be the $N \times N$ diagonal matrix with $i$th diagonal element $K_\lambda(x_0, x_i)$; then
$$\hat f(x_0) = b(x_0)^T \big(B^T W(x_0) B\big)^{-1} B^T W(x_0)\, y.$$
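
A minimal sketch using loess(), assuming the ISLR Wage data; span controls the fraction of points in each local neighborhood, and degree = 1 requests local linear (rather than loess's default local quadratic) fits:

library(ISLR)

fit <- loess(wage ~ age, data = Wage, span = 0.5, degree = 1)
age.grid <- seq(min(Wage$age), max(Wage$age))
plot(Wage$age, Wage$wage, col = "darkgrey")
lines(age.grid, predict(fit, data.frame(age = age.grid)), lwd = 2)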

Multidimensional spline

[Figures: examples of multidimensional spline fits.]

Generalized Additive Models

Allows for flexible nonlinearities in several variables, but retains the additive structure of linear models:
$$y_i = \beta_0 + f_1(x_{i1}) + f_2(x_{i2}) + \cdots + f_p(x_{ip}) + \epsilon_i.$$

[Figure: fitted functions f1(year), f2(age), and f3(education) for the Wage data; education coded <HS, HS, <Coll, Coll, >Coll.]

Fitting a GAM

If the functions $f_j$ have a basis representation, we can simply use least squares:
- Natural cubic splines
- Polynomials
- Step functions

wage $= \beta_0 + f_1(\text{year}) + f_2(\text{age}) + f_3(\text{education}) + \epsilon$

Fitting a GAM

Otherwise, we can use backfitting (a minimal code sketch follows this list):

1. Keep $f_2, \ldots, f_p$ fixed, and fit $f_1$ using the partial residuals
$y_i - \beta_0 - f_2(x_{i2}) - \cdots - f_p(x_{ip})$
as the response.

2. Keep $f_1, f_3, \ldots, f_p$ fixed, and fit $f_2$ using the partial residuals
$y_i - \beta_0 - f_1(x_{i1}) - f_3(x_{i3}) - \cdots - f_p(x_{ip})$
as the response.

3. Continue through $f_3, \ldots, f_p$, and iterate until the fits stabilize.

This works for smoothing splines and local regression.
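
Here is a toy backfitting loop in R with two predictors, using smoothing splines as the smoother; the function name backfit2 and the iteration count are mine, and this is an illustrative sketch rather than a production fitter:

# Toy backfitting with two predictors, smoothing splines as the smoother
backfit2 <- function(y, x1, x2, n.iter = 20) {
  beta0 <- mean(y)
  f1 <- f2 <- rep(0, length(y))
  for (it in 1:n.iter) {
    # fit f1 to the partial residuals, holding f2 fixed
    f1 <- predict(smooth.spline(x1, y - beta0 - f2), x1)$y
    f1 <- f1 - mean(f1)  # center so beta0 stays identifiable
    # fit f2 to the partial residuals, holding f1 fixed
    f2 <- predict(smooth.spline(x2, y - beta0 - f1), x2)$y
    f2 <- f2 - mean(f2)
  }
  list(beta0 = beta0, f1 = f1, f2 = f2)
}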

GAM details

Can fit a GAM simply using, e.g., natural splines:
lm(wage ~ ns(year, df = 5) + ns(age, df = 5) + education)

Coefficients are not that interesting; the fitted functions are. The previous plot was produced using plot.gam.

Can mix terms (some linear, some nonlinear) and use anova() to compare models.

Can use smoothing splines or local regression as well:
gam(wage ~ s(year, df = 5) + lo(age, span = .5) + education)

GAMs are additive, although low-order interactions can be included in a natural way using, e.g., bivariate smoothers or interactions of the form ns(age, df = 5):ns(year, df = 5).
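
A minimal sketch of the anova() comparison, assuming the ISLR Wage data and the gam package:

library(ISLR)
library(gam)

# Three nested models for the year term: omitted, linear, smooth
gam1 <- gam(wage ~ s(age, df = 5) + education, data = Wage)
gam2 <- gam(wage ~ year + s(age, df = 5) + education, data = Wage)
gam3 <- gam(wage ~ s(year, df = 4) + s(age, df = 5) + education, data = Wage)
anova(gam1, gam2, gam3, test = "F")

plot(gam3, se = TRUE)  # one panel per fitted function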

GAMs for classification

$$\log \frac{p(X)}{1 - p(X)} = \beta_0 + f_1(X_1) + f_2(X_2) + \cdots + f_p(X_p).$$

[Figure: fitted functions f1(year), f2(age), and f3(education) for the logistic GAM on the Wage data; education coded HS, <Coll, Coll, >Coll.]

gam(I(wage > 250) ~ year + s(age, df = 5) + education, family = binomial)


Nonparametric Beta regression

Fractional data restricted to the standard unit interval (0, 1), often with a highly skewed distribution, are commonly encountered. The linear regression model cannot guarantee that the fitted or predicted values will fall into the unit interval (0, 1).

To overcome such issues, one possible approach is to first transform the response so that it can take values in $(-\infty, +\infty)$, and then apply a regression model to the transformed response. Such an approach still has shortcomings: the coefficients cannot easily be interpreted in terms of the original response, and the fractional response is generally asymmetric and highly skewed.

One appealing model is beta regression, in which the response variable is assumed to follow a beta distribution on the unit interval (0, 1).

Nonparametric Beta regression

Let $(y_i, X_i^\top)^\top$, $i = 1, \ldots, n$, be vectors that are independent and identically distributed as $(y, X)$, where $y$ is a response variable restricted to the unit interval (0, 1) and $X_i = (x_{i1}, \ldots, x_{ip})^\top \in \mathbb{R}^p$ is the $i$th observation of the $p$ covariates, which are assumed to be fixed and known.

The beta density is
$$f(y; \mu, \phi) = \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\, \Gamma((1-\mu)\phi)}\, y^{\mu\phi - 1} (1 - y)^{(1-\mu)\phi - 1}, \qquad (1)$$
where $\mu \in (0, 1)$ is the mean of $y$, $\phi > 0$ is a precision parameter, and $\Gamma(\cdot)$ is the gamma function. The variance of $y$ is $\operatorname{var}(y) = \mu(1 - \mu)/(1 + \phi)$.

Linear beta regression model:
$$g(\mu_i) = \sum_{j=1}^p x_{ij}\, \beta_j, \qquad (2)$$
where $g(\cdot)$ is a strictly monotonic and twice-differentiable link function that maps (0, 1) into $\mathbb{R}$.

Nonparametric Beta regression

We propose nonparametric additive beta regression:
$$g(\mu_i) = \sum_{j=1}^p f_j(x_{ij}), \qquad (3)$$
where $\mu_i = E(y_i)$, $i = 1, \ldots, n$; $g(\cdot)$ is a strictly monotonic and twice-differentiable link function that maps (0, 1) into $\mathbb{R}$, and we use the logit link $g(\mu) = \log\{\mu/(1 - \mu)\}$ in this study. The $f_j$'s are unknown smooth functions to be estimated, and we suppose that some of them are zero.

Using B-splines to approximate each of the unknown functions $f_j$, we have the approximation
$$f_j(x) \approx f_{nj}(x) = \sum_{k=1}^{m_n} \gamma_{jk}\, \phi_k(x)$$
with $m_n$ coefficients $\gamma_{jk}$, $k = 1, \ldots, m_n$, so that
$$g(\mu_i) \approx \gamma_0 + \sum_{j=1}^p \sum_{k=1}^{m_n} \gamma_{jk}\, \phi_k(x_{ij}). \qquad (4)$$
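
For readers who want to try an additive beta regression in practice, here is a hedged stand-in using the mgcv package, whose betar() family implements beta regression with a logit link; note this uses off-the-shelf penalized smoothing, not the group-penalized B-spline estimator proposed here, and the data frame dat and covariate names x1, x2 are hypothetical:

library(mgcv)

# dat: hypothetical data frame with response y in (0, 1) and covariates x1, x2
# s() terms give penalized-spline estimates of the smooth functions f_j
fit <- gam(y ~ s(x1) + s(x2), family = betar(link = "logit"), data = dat)
summary(fit)
plot(fit, pages = 1)  # estimated smooth functions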

Nonparametric Beta regression

The objective function:
$$L(\gamma, \phi) = l(\gamma, \phi) - n \sum_{j=1}^p w_j(m_n)\, P_\lambda(\|\gamma_j\|_2), \qquad (5)$$
where $l(\gamma, \phi) = \sum_{i=1}^n l_i(\mu_i, \phi)$ is the log-likelihood function, with
$$l_i(\mu_i, \phi) = \ln \Gamma(\phi) - \ln \Gamma(\mu_i \phi) - \ln \Gamma((1 - \mu_i)\phi) + (\mu_i \phi - 1) \ln y_i + \{(1 - \mu_i)\phi - 1\} \ln(1 - y_i);$$
$P_\lambda(\cdot)$ is a penalty function, and $w_j(\cdot)$ rescales the penalty with respect to the dimensionality of the parameter vector $\gamma_j$. The penalized likelihood estimators are then defined as
$$(\hat\gamma_n, \hat\phi) = \operatorname*{argmax}_{\gamma, \phi} L(\gamma, \phi).$$

Nonparametric Beta regression

Theorem (Estimation consistency). Define $\hat F_T = \{j : \|\hat\gamma_{nj}\|_2 \ne 0,\ 1 \le j \le p\}$, and let $|M|$ denote the cardinality of any set $M \subset \{1, \ldots, p\}$. Under conditions (C1)-(C4), we can obtain the following:
(i) With probability converging to 1, $|\hat F_T| \le M_1 |F_T| = M_1 q$ for a finite constant $M_1 > 1$.
(ii) If $\max_j \{P'_\lambda(\|\gamma_j\|_2)\}^2\, m_n / n^2 \to 0$ as $n \to \infty$, then $P(F_T \subset \hat F_T) \to 1$.
(iii) $\sum_{j=1}^p \|\hat\gamma_{nj} - \gamma_j\|_2^2 = O_p(m_n^{-2d+1}) + O_p\big(4 \max_j \{P'_\lambda(\|\gamma_j\|_2)\}^2\, m_n^2 / n^2\big)$.

Theorem (Selection consistency). Under conditions (C1)-(C5), we have:
(i) $P(\hat\gamma_{n, F_T^c} = 0) \to 1$.

Define $\hat f_j(x) = \sum_{k=1}^{m_n} \hat\gamma_{jk}\, \phi_k(x)$. Under conditions (C1)-(C5), we have:
(i) $P(\|\hat f_j\|_2 > 0 \text{ for } j \in F_T, \text{ and } \|\hat f_j\|_2 = 0 \text{ for } j \in F/F_T) \to 1$.
(ii) $\sum_{j=1}^p \|\hat f_j - f_j\|_2^2 = O_p(m_n^{-2d}) + O_p\big(4 m_n (\max_j \{P'_\lambda(\|\gamma_j\|_2)\})^2 / n^2\big)$.
