Semiparametric Tools: Generalized Additive Models


1 Semiparametric Tools: Generalized Additive Models Jamie Monogan Washington University November 8, 2010 Jamie Monogan (WUStL) Generalized Additive Models November 8, / 33

2 Regression Splines
Choose breakpoints (also called knots). This is the key tradeoff: more knots mean more flexibility, but the fit can be more compute-intensive and sometimes too wavy.
Strategies:
1. cardinal knots: uniform over the range of the X data,
2. at quantiles,
3. adaptive (complex),
4. at selected $X_i$.
Effects:
1. bad choices can be dramatic,
2. bad choices can miss important features.
Setup: interior knots given by $\xi_1 < \xi_2 < \cdots < \xi_S$ over the range $(X_{(1)}, X_{(n)})$, along with the boundary knots $\xi_0 < X_{(1)}$ and $X_{(n)} < \xi_{S+1}$.

3 Regression Splines
We must choose basis functions that join smoothly at the knots.
Truncated power series, with $S$ knots:
$$s(x) = \delta_0 + \delta_1 x + \delta_2 x^2 + \delta_3 x^3 + \sum_{i=1}^{S} \delta_{3+i} (x - \xi_i)^3_+$$
where $(x - \xi_i)^3_+$ means we include only positive terms, else zero:
$$(x - \xi_i)^3_+ = \max[0, (x - \xi_i)^3]$$

4 Regression Splines
So $s(x)$ is a linear weighted combination of the $S + 4$ functions:
$$P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = x^2, \quad P_3(x) = x^3,$$
$$P_{S_1}(x) = (x - \xi_1)^3_+, \quad P_{S_2}(x) = (x - \xi_2)^3_+, \quad P_{S_3}(x) = (x - \xi_3)^3_+, \quad \ldots$$
meaning that the function is linear in these $S + 4$ parameters and can be estimated with OLS.
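The truncated power basis and its OLS fit can be sketched in a few lines of R. This is only an illustration: the simulated data, the knot locations, and the helper name tpower_basis are assumptions and do not come from the slides.

```r
# Hypothetical helper: truncated power basis for a cubic regression spline.
tpower_basis <- function(x, knots) {
  poly_part  <- cbind(1, x, x^2, x^3)                        # P_0, ..., P_3
  trunc_part <- sapply(knots, function(k) pmax(0, x - k)^3)  # (x - xi_i)^3_+
  cbind(poly_part, trunc_part)
}

set.seed(1)
x <- seq(-10, 10, length = 100)
y <- cos(x) + rnorm(length(x), sd = 0.3)    # simulated data (assumption)
knots <- quantile(x, c(0.25, 0.50, 0.75))   # S = 3 interior knots at quantiles

B   <- tpower_basis(x, knots)               # n x (S + 4) design matrix
fit <- lm(y ~ B - 1)                        # OLS; the constant is already a column of B
length(coef(fit))                           # number of estimated coefficients, S + 4
```

The truncated power basis is easy to read but can be poorly conditioned with many knots, which is one reason B-spline parameterizations are usually preferred in practice.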

5 Cubic Regression Splines
Cubic splines add the condition that at the boundary knots $s''(x)$ and $s'''(x)$ are both zero, so $s(x)$ is linear (but not necessarily flat) in the intervals $[\xi_0, \xi_1]$ and $[\xi_S, \xi_{S+1}]$.
Now estimate by applying the function:
1. Regress $y_i$ on $f(x_i)$ using OLS, say for $S = 3$:
$$\hat{y}_i = f(x_i) = \delta_0 + \delta_1 x_i + \delta_2 x_i^2 + \delta_3 x_i^3 + \delta_4 (x_i - \xi_1)^3_+ + \delta_5 (x_i - \xi_2)^3_+ + \delta_6 (x_i - \xi_3)^3_+$$
2. This obtains 7 coefficient estimates: $\hat{\delta}_i$, $i = 0, \ldots, 6$.
Extension: B-splines, different parameterizations, ...

6 Cubic Regression Splines
The R function for cubic splines is smooth.spline:
    x1 <- seq(-10, 10, length = 100)
    y1 <- cos(x1) / (rt(length(x1), 50) * 0.5)
    par(mfrow = c(1, 3), oma = c(3, 3, 3, 3), mar = c(2, 0, 2, 0), bg = "whitesmoke")
    # Panel 1: every x value is a knot
    plot(x1, y1, pch = 3, ylim = range(y1) * 1.2, col = "slateblue")
    spline.out <- smooth.spline(x1, y1, all.knots = TRUE)
    lines(spline.out$x, spline.out$y, col = "forest green")
    mtext("All X Are Knots", outer = FALSE, side = 3, cex = 1.1, line = 1.5)
    # Panel 2: 4 knots
    plot(x1, y1, pch = 3, ylim = range(y1) * 1.2, yaxt = "n", col = "slateblue")
    spline.out <- smooth.spline(x1, y1, all.knots = FALSE, nknots = 4)
    lines(spline.out$x, spline.out$y, col = "forest green")
    mtext("4 Knots", outer = FALSE, side = 3, cex = 1.1, line = 1.5)
    # Panel 3: 20 knots
    plot(x1, y1, pch = 3, ylim = range(y1) * 1.2, yaxt = "n", col = "slateblue")
    spline.out <- smooth.spline(x1, y1, all.knots = FALSE, nknots = 20)
    lines(spline.out$x, spline.out$y, col = "forest green")
    mtext("20 Knots", outer = FALSE, side = 3, cex = 1.1, line = 1.5)

7 Cubic Regression Splines
[Figure: three panels of spline fits titled "All X Are Knots", "4 Knots", and "20 Knots".]

8 Penalized Splines
The goal is:
$$\min \sum_{i=1}^{n} (y_i - f(x_i))^2 + \lambda \int_a^b (f''(t))^2 \, dt$$
where $\lambda$ is a fixed constant, $a < x_{(1)}$, and $x_{(n)} < b$.
The idea is that the first term promotes fit and the second term penalizes overfitting.
The minimizing function $f(x)$ has an explicit and unique form: a cubic spline with knots at all of the $x_i$.
If $\lambda = \infty$, then we get linear regression.
As $\lambda$ approaches 0, we get closer to interpolation.
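The two limiting cases can be checked numerically with smooth.spline, which accepts the penalty weight directly through its lambda argument. The data and the two lambda values below are assumptions chosen only to sit near the two extremes.

```r
set.seed(2)
x <- seq(0, 1, length = 60)
y <- sin(2 * pi * x) + rnorm(length(x), sd = 0.2)   # simulated data (assumption)

rough  <- smooth.spline(x, y, lambda = 1e-12, all.knots = TRUE)  # lambda near 0
smooth <- smooth.spline(x, y, lambda = 100,   all.knots = TRUE)  # very large lambda

# Near-interpolation: the fitted values almost reproduce the data.
max(abs(predict(rough, x)$y - y))
# Heavy penalty: the fit is close to the least-squares line.
max(abs(predict(smooth, x)$y - fitted(lm(y ~ x))))
# Equivalent degrees of freedom: close to n in the first case, close to 2 in the second.
c(rough$df, smooth$df)
```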

9 Estimating Penalized Splines
For regression-style models, another way of notating the function to be minimized is:
$$\| y - X\beta \|^2 + \lambda \int_a^b (f''(t))^2 \, dt.$$
Because $f(t)$ is linear in the parameters by design, the penalty can be written as a quadratic form:
$$\int_a^b (f''(t))^2 \, dt = \beta' S \beta,$$
where $S$ is a matrix of known coefficients determined by the form of $f(t)$.

10 Estimating Penalized Splines
So the penalized least squares estimator is given by:
$$\hat{\beta} = (X'X + \lambda S)^{-1} X'y,$$
with the associated hat (influence) matrix:
$$A = X (X'X + \lambda S)^{-1} X'.$$
The matrix $A$ also gives the effective degrees of freedom (edf) for the smoothed fit by its trace.
The $\max[\mathrm{tr}(A)]$ is the number of parameters minus the number of constraints, and the $\min[\mathrm{tr}(A)]$ is the max minus the rank of the $S$ matrix.
As the smoothing parameter $\lambda$ goes from zero to infinity, the edf moves downward from the maximum to the minimum.
We can use edf for hypothesis testing between two models.
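A minimal numerical sketch of the estimator and of edf as the trace of the hat matrix follows. It assumes a cubic B-spline basis from the splines package and uses a second-difference penalty as a stand-in for S; the slides define S through the integrated squared second derivative, so this matrix, the simulated data, and the helper name pen_fit are illustrative assumptions.

```r
library(splines)

set.seed(3)
n <- 100
x <- sort(runif(n))
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)    # simulated data (assumption)

X <- bs(x, df = 12, intercept = TRUE)        # n x 12 cubic B-spline basis
D <- diff(diag(ncol(X)), differences = 2)    # second-difference operator
S <- crossprod(D)                            # stand-in penalty matrix, rank 10

# Hypothetical helper: penalized least squares fit for one value of lambda.
pen_fit <- function(lambda) {
  M        <- solve(crossprod(X) + lambda * S)   # (X'X + lambda S)^{-1}
  beta_hat <- M %*% crossprod(X, y)              # penalized LS estimator
  A        <- X %*% M %*% t(X)                   # hat (influence) matrix
  list(beta = beta_hat, edf = sum(diag(A)))
}

lambdas <- c(0, 0.01, 1, 100, 1e8)
edf <- sapply(lambdas, function(l) pen_fit(l)$edf)
round(edf, 2)   # falls from 12 (no penalty) toward 12 - rank(S) = 2
```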

11 Penalized Splines Code
The R function for penalized splines is also smooth.spline:
    nonlin.mat <- read.table("gam.test.dat", header = TRUE)
    attach(nonlin.mat)
    postscript("class.stat.comp/cognitive2i.ps")
    par(mfrow = c(1, 3), oma = c(3, 3, 3, 3), mar = c(2, 0, 2, 0))
    # Panel 1: small smoothing parameter, close to interpolation
    plot(x1, y1, pch = 3, ylim = range(y1) * 1.2, col = "slateblue")
    spline.out <- smooth.spline(x1, y1, spar = 0.1, all.knots = TRUE)
    lines(spline.out$x, spline.out$y, col = "maroon2")
    mtext("Interpolation", outer = FALSE, side = 3, cex = 1.1, line = 1.5)

12 Penalized Splines
    # Panel 2: smoothing parameter 0.5
    plot(x1, y1, pch = 3, ylim = range(y1) * 1.2, yaxt = "n", col = "slateblue")
    spline.out <- smooth.spline(x1, y1, spar = 0.5)
    lines(spline.out$x, spline.out$y, col = "maroon2")
    mtext("Parameter: 0.5", outer = FALSE, side = 3, cex = 1.1, line = 1.5)
    # Panel 3: smoothing parameter 0.95
    plot(x1, y1, pch = 3, ylim = range(y1) * 1.2, yaxt = "n", col = "slateblue")
    spline.out <- smooth.spline(x1, y1, spar = 0.95)
    lines(spline.out$x, spline.out$y, col = "maroon2")
    mtext("Parameter: 0.95", outer = FALSE, side = 3, cex = 1.1, line = 1.5)
    dev.off()
    detach(nonlin.mat)

13 Penalized Splines
[Figure: three panels of penalized spline fits titled "Interpolation", "Parameter: 0.5", and a third "Parameter:" panel.]

14 Thin Plate Splines
Some disadvantages of standard spline approaches:
1. the user must stipulate knots,
2. bases are given for only one variable,
3. the criteria for choosing bases are unclear.
We want: knot-free spline bases over any number of explanatory variables that have optimal properties.
Thin plate splines (Duchon 1977; Wahba 1990; Green & Silverman 1994; Wood 2006, Chapter 4) are a good solution to this problem.
General strategy: penalize with derivative functions, to produce a smooth function according to
$$y_i = g(x_i) + \epsilon_i$$
where $\epsilon_i$ is a random error term with good properties and $x_i$ is a $d$-length explanatory variable vector.
The method automatically calculates how much weight to give the conflicting goals of following the data and making the fit as smooth as possible by putting the tradeoff into an explicit function.

15 Thin Plate Splines
Objective: find a function $f$ that minimizes
$$\| y - f \|^2 + \lambda J_{md}(f)$$
where:
$y$ is the $n$-length outcome variable vector,
$f = (f(x_1), f(x_2), \ldots, f(x_n))'$,
$\lambda$ is a smoothing parameter,
$J_{md}$ is a penalty term based on the curviness of the smooth.
This is very much in line with the spline technology that we have been studying, except that more of the decisions will be automatic rather than human.

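16 Thin Plate Splines Penalty Function
The penalty is defined with the following function:
$$J_{md} = \int \cdots \int_{\mathbb{R}^d} \sum_{\eta_1 + \cdots + \eta_d = m} \frac{m!}{\eta_1! \cdots \eta_d!} \left( \frac{\partial^m f}{\partial x_1^{\eta_1} \cdots \partial x_d^{\eta_d}} \right)^2 dx_1 \cdots dx_d$$
where $2m > d + 1$.
For instance, when $d = 2$ and $m = 2$ (the cross term has $\eta_1 = \eta_2 = 1$):
$$J_{22} = \int \int \left[ \left( \frac{\partial^2 f}{\partial x_1^2} \right)^2 + 2 \left( \frac{\partial^2 f}{\partial x_1 \partial x_2} \right)^2 + \left( \frac{\partial^2 f}{\partial x_2^2} \right)^2 \right] dx_1 \, dx_2.$$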

17 Thin Plate Splines
One function that minimizes $\| y - f \|^2 + \lambda J_{md}(f)$ is:
$$\hat{f}(x) = \sum_{i=1}^{n} \delta_i \, \eta_{md}(\| x - x_i \|) + \sum_{j=1}^{M} \alpha_j \phi_j(x),$$
where:
$\delta$ and $\alpha$ are coefficient vectors to be estimated.
$\delta$ has the linear constraint $T'\delta = 0$, with the matrix values $T_{ij} = \phi_j(x_i)$.
The $\phi_j$ are linearly independent polynomials spanning the space of polynomials in $\mathbb{R}^d$ of degree less than $m$, which is also the null space of $J_{md}$.
Returning to the example where $m = d = 2$: $\phi_1 = 1$, $\phi_2 = x_1$, $\phi_3 = x_2$.
Finally:
$$\eta_{md}(\chi) = \begin{cases} \dfrac{(-1)^{m+1+d/2}}{2^{2m-1} \pi^{d/2} (m-1)! \, (m - d/2)!} \, \chi^{2m-d} \log(\chi) & d \text{ even} \\[2ex] \dfrac{\Gamma(d/2 - m)}{2^{2m} \pi^{d/2} (m-1)!} \, \chi^{2m-d} & d \text{ odd} \end{cases}$$

18 Thin Plate Splines
Now define the matrix $E$ with elements $E_{ij} = \eta_{md}(\| x_i - x_j \|)$.
The fitting problem is now expressible as:
$$\text{minimize } \| y - E\delta - T\alpha \|^2 + \lambda \delta' E \delta \quad \text{subject to } T'\delta = 0, \text{ with respect to } \delta, \alpha.$$
This truncates the space of rough components, those with $\delta$ parameters, while leaving the smooth components untouched.
Primary challenge (besides all the math): computational efficiency. There are as many unknown quantities as data points, and estimation time is proportional to the cube of that number.
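For a small data set the constrained problem can be solved directly. The sketch below covers the case m = d = 2, where the radial function reduces to chi^2 log(chi) / (8 pi), and it uses the stationarity conditions (E + lambda I) delta + T alpha = y together with T' delta = 0. The simulated data, the value of lambda, and the helper names eta22 and tps_solve are assumptions made for illustration.

```r
# Radial basis function for m = 2, d = 2, with the value at zero set to 0.
eta22 <- function(r) ifelse(r > 0, r^2 * log(r) / (8 * pi), 0)

set.seed(4)
n  <- 30
Xd <- cbind(runif(n), runif(n))                 # simulated 2-d covariates (assumption)
y  <- sin(2 * pi * Xd[, 1]) * Xd[, 2] + rnorm(n, sd = 0.1)

E  <- eta22(as.matrix(dist(Xd)))                # E_ij = eta(||x_i - x_j||)
Tm <- cbind(1, Xd)                              # T_ij = phi_j(x_i): 1, x1, x2

# Hypothetical helper: solve the bordered linear system for delta and alpha.
tps_solve <- function(lambda) {
  M   <- ncol(Tm)
  K   <- rbind(cbind(E + lambda * diag(n), Tm),
               cbind(t(Tm), matrix(0, M, M)))
  sol <- solve(K, c(y, rep(0, M)))
  list(delta = sol[1:n], alpha = sol[n + 1:M])
}

lambda <- 0.01
fit    <- tps_solve(lambda)
fhat   <- drop(E %*% fit$delta + Tm %*% fit$alpha)   # fitted values at the data
max(abs(crossprod(Tm, fit$delta)))                   # constraint T' delta = 0, numerically
```

The system has n + M unknowns, so a dense solve like this one scales with the cube of the sample size, which is the efficiency problem noted on the slide.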

19 Thin Plate Splines in R
    library(rgcvpack)
    # DEFINE A THREE-DIMENSIONAL FUNCTION (2 IN, 1 OUT)
    f <- function(x, y) {
      0.75*exp( -((10*x-1)^2 + (10*y-1)^2)/5 ) *
        exp( -((10*x-7)^2 + (10*y-5)^2)/5 ) *
        exp( -((10*x-4)^2 + (10*y-7)^2)/5 )
    }
    # CREATE A FAKE DATASET USING THIS FUNCTION
    set.seed(pi); n <- 15; x2 <- x1 <- seq(0, 1, length = n)
    y <- outer(x1, x2, f); y <- y + rnorm(n^2, 0, 0.05*max(abs(y)))
    # THE FUNCTION NEEDS THESE AS VECTORS
    x1.vec <- rep(x1, n); x2.vec <- rep(x2, rep(n, n))
    y.vec <- as.vector(y)

20 Thin Plate Splines in R
    # RUN THE THIN PLATE SPLINE WITH ALL DATA POINTS AS KNOTS
    # ORDER 3
    thinpl.out <- fitTps(cbind(x1.vec, x2.vec), y.vec, m = 3)
    # GRAPH
    par(mar = c(3, 3, 1, 1), col.axis = "white", col.lab = "white",
        col.sub = "white", col = "white", bg = "slategray")
    persp(x1, x2, matrix(predict(thinpl.out), n, n), theta = 130, phi = 20,
          expand = 0.50, xlab = "x1", ylab = "x2", zlab = "y",
          xlim = c(0, 1), ylim = c(0, 1), zlim = range(y),
          ticktype = "detailed", scale = FALSE, main = "Thin Plate Spline")

21 Thin Plate Splines in R
[Figure: perspective plot titled "Thin Plate Spline", with two x axes and a vertical y axis.]

22 Smoothing Parameter Selection
Recall that as $\lambda$ approaches 0, we get closer to interpolation (for a scalar parameter).
Penalized maximum likelihood methods can only estimate $\beta$ coefficients conditional on the smoothing parameters, $\lambda$.
Two basic scenarios for estimation via minimizing the error quantity $E(M) = E(\| \mu - \hat{\mu} \|^2 / n)$:
1. $\sigma^2$ known or assumed true: estimation uses Mallows' $C_p$ / UBRE (Un-Biased Risk Estimator).
2. $\sigma^2$ unknown: estimation uses generalized cross validation (GCV).

23 Smoothing Parameter Selection, Scale Parameter Known
For regression, the expected mean square error is given by:
$$E(M) = E\left( \| \mu - X\hat{\beta} \|^2 \right) / n = E\left( \| y - Ay \|^2 \right) / n - \sigma^2 + 2 \, \mathrm{tr}(A) \sigma^2 / n$$
where $A = X (X'X + \lambda S)^{-1} X'$.
Which means we minimize
$$\| y - Ay \|^2 / n - \sigma^2 + 2 \, \mathrm{tr}(A) \sigma^2 / n.$$
So the smoothing parameters affect the estimation through $A$.
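As a rough illustration, the UBRE score can be evaluated over a grid of lambda values when the scale is treated as known. The B-spline basis, the second-difference stand-in for S, the simulated data, and the grid are all assumptions of this sketch.

```r
library(splines)

set.seed(5)
n     <- 200
sigma <- 0.3                                   # scale treated as known (assumption)
x <- sort(runif(n))
y <- sin(2 * pi * x) + rnorm(n, sd = sigma)    # simulated data (assumption)

X <- bs(x, df = 20, intercept = TRUE)
S <- crossprod(diff(diag(ncol(X)), differences = 2))   # stand-in penalty matrix

# UBRE score for one lambda: ||y - Ay||^2 / n - sigma^2 + 2 tr(A) sigma^2 / n
ubre <- function(lambda) {
  A <- X %*% solve(crossprod(X) + lambda * S) %*% t(X)
  sum((y - A %*% y)^2) / n - sigma^2 + 2 * sum(diag(A)) * sigma^2 / n
}

lambdas <- 10^seq(-4, 4, length = 41)
scores  <- sapply(lambdas, ubre)
lambdas[which.min(scores)]                     # lambda with the smallest UBRE score
```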

24 Smoothing Parameter Selection, Scale Parameter Unknown
In this case minimize the mean square prediction error $P = \sigma^2 + M$, which is the average squared error in predicting a new observation, $y_{n+1}$, using the fitted model.
$P$ is most easily estimated with cross validation (later generalized cross validation):
1. Jackknife out each case iteratively.
2. At each step, calculate $\hat{\mu}_i^{[-i]}$, which is the prediction of $y_i$ from the model that does not include case $i$.
3. Finally, calculate:
$$\hat{P} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{\mu}_i^{[-i]} \right)^2.$$

25 Smoothing Parameter Selection, Scale Parameter Unknown
The $\hat{\mu}_i^{[-i]}$ term seems to mean that we have to do the full jackknifing loop through all of the data.
Actually this can be done in one step using the complete-data model:
$$\hat{P} = \frac{1}{n} \sum_{i=1}^{n} \frac{(y_i - \hat{\mu}_i)^2}{(1 - A_{ii})^2}.$$
This is analogous to the shorthand method for calculating the jackknifed standard error.
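The equivalence between the jackknife loop and the one-step formula can be verified numerically for a penalized least squares fit with lambda held fixed. The basis, the stand-in penalty matrix, the simulated data, and the helper name coef_pen are assumptions of this sketch.

```r
library(splines)

set.seed(6)
n <- 50
x <- sort(runif(n))
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)       # simulated data (assumption)

X <- bs(x, df = 8, intercept = TRUE)            # basis computed once, on all data
S <- crossprod(diff(diag(ncol(X)), differences = 2))
lambda <- 0.5                                   # held fixed across all refits

# Hypothetical helper: penalized least squares coefficients.
coef_pen <- function(Xm, ym) solve(crossprod(Xm) + lambda * S, crossprod(Xm, ym))

# Brute force: drop case i, refit, and predict y_i.
mu_loo <- sapply(seq_len(n), function(i) {
  b <- coef_pen(X[-i, , drop = FALSE], y[-i])
  sum(X[i, ] * b)
})
P_brute <- mean((y - mu_loo)^2)

# Shortcut: one complete-data fit and the diagonal of the hat matrix A.
A  <- X %*% solve(crossprod(X) + lambda * S) %*% t(X)
mu <- drop(A %*% y)
P_short <- mean((y - mu)^2 / (1 - diag(A))^2)

all.equal(P_brute, P_short)
```

Generalized cross validation goes one step further and replaces each $A_{ii}$ with the average $\mathrm{tr}(A)/n$.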

26 Generalized Additive Models
Big picture: just like a GLM, except we will do component-wise smoothing of some right-hand side variables.
More computationally intensive than GLM estimation, with many more model-fitting choices to make.
Results are often given graphically for smoothed parameters, especially if there are many.
Definitive citations:
Hastie and Tibshirani (1986), Generalized Additive Models (with discussion). Statistical Science 1.
Wood (2006), Generalized Additive Models: An Introduction with R. Chapman & Hall/CRC.
Hastie (1993), in Chambers and Hastie, Statistical Models in S. Chapman & Hall.
Hastie and Tibshirani (1990), Generalized Additive Models. Chapman & Hall.

27 Generalized Additive Models
Structure:
$$Y = \alpha + \sum_{j=1}^{n} f_j(x_j) + \epsilon$$
$$E[\epsilon] = 0, \quad \mathrm{cor}(\epsilon_i, x_j) = 0, \quad \mathrm{Var}(\epsilon) = \sigma^2$$
Solved by an algorithm called backfitting.
Typically we think of the $f_j$s as univariate and smooth, but they don't have to be either: $f(x_{j1}, x_{j2})$, like an interaction or other single-dimension mapping, or categorical specifications.
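Backfitting cycles through the terms, smoothing the partial residuals for one term while holding the others fixed. A minimal sketch with two covariates follows; the choice of smooth.spline with 5 degrees of freedom as the smoother, the fixed 20 iterations, and the simulated data are assumptions, since the slides do not specify them.

```r
set.seed(8)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)
y  <- 1 + sin(2 * pi * x1) + (x2 - 0.5)^2 + rnorm(n, sd = 0.2)   # simulated data (assumption)

alpha <- mean(y)            # intercept estimate
f1 <- f2 <- rep(0, n)       # start both smooth terms at zero

for (iter in 1:20) {
  # Smooth the partial residuals for x1, then center so the term averages zero.
  f1 <- predict(smooth.spline(x1, y - alpha - f2, df = 5), x1)$y
  f1 <- f1 - mean(f1)
  # Same step for x2, using the updated f1.
  f2 <- predict(smooth.spline(x2, y - alpha - f1, df = 5), x2)$y
  f2 <- f2 - mean(f2)
}

cor(f1, sin(2 * pi * x1))   # agreement between the estimated and true first component
```

A production implementation would iterate until the fitted functions stop changing rather than for a fixed number of passes.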

28 Generalized Additive Models
To avoid a plethora of free constants in each of the $f_j()$, it is common to assume $E[f_j(x_j)] = 0$, which can be achieved by centering if necessary.
Big point: unlike a GLM, each term is represented additively and therefore we can use the same marginal interpretation as linear models (but without the linear assumption, obviously).
Two consequences:
1. The variation of the fitted response surface holding all but one explanatory variable constant does not depend on the values of the other explanatory variables.
2. Plots of the separate fits are very useful.
Botanical Example
Let's study our cherry tree data. The simple model of interest is:
$$\log(\text{Volume}_i) = f_1(\text{Height}_i) + f_2(\text{Girth}_i) + \epsilon_i$$
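The cherry tree model can be fit in a few lines, assuming the gam function discussed in these slides is the one from the mgcv package and that the data are the trees data frame that ships with R (Girth, Height, Volume). Both are assumptions of this sketch.

```r
library(mgcv)   # assumed source of gam() and s()

# trees ships with R: Girth, Height, and Volume of 31 black cherry trees.
fit <- gam(log(Volume) ~ s(Height) + s(Girth), data = trees)

summary(fit)                              # edf and approximate significance of each smooth
colMeans(predict(fit, type = "terms"))    # each smooth term is centered at zero

par(mfrow = c(1, 2))
plot(fit, residuals = TRUE)               # one panel per smooth term
```

The per-term plots are the graphical summaries referred to above: each panel shows one estimated $f_j$ on its centered scale.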

29 Details on GAM Model Specification
The R formula for gam is just like glm, except we have new smoother terms: s and te.
The notation s(X1) gives a spline-based smooth for the X1 explanatory variable.
The notation te(X2) gives a tensor-product-based smooth for the X2 explanatory variable.
It is common to mix smoothed and unsmoothed terms in a model:
    Y ~ X1 + s(X2) + te(X3)
There can be nested smoothing specifications:
    Y ~ s(X1) + s(X2) + s(X1,X2)
    Y ~ s(X1,X2) + s(X2,X3)
We can also control the smooth with parameter vectors, for instance:
    Y ~ te(X1,X2, bs=c("tp","tp"), m=c(3,4), k=c(5,6))
which gives a tensor product smooth of X1 and X2 with marginal penalty orders of 3 for X1 and 4 for X2, and bases of dimension 5 for X1 and 6 for X2.
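A short sketch of a formula that mixes a parametric term with a controlled tensor product smooth, again assuming the mgcv package. The variable names and the simulated data are hypothetical.

```r
library(mgcv)

set.seed(7)
n <- 300
d <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))    # simulated data (assumption)
d$y <- 2 * d$x1 + sin(2 * pi * d$x2) * d$x3 + rnorm(n, sd = 0.2)

# Linear term for x1; tensor product smooth of x2 and x3 with
# thin plate margins, penalty orders m and marginal basis dimensions k.
fit <- gam(y ~ x1 + te(x2, x3, bs = c("tp", "tp"), m = c(3, 4), k = c(5, 6)),
           data = d)

length(coef(fit))   # intercept, x1 slope, and the te() coefficients
sum(fit$edf)        # total effective degrees of freedom after penalization
```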

30 Full Syntax for gam
There are many modeling options.
    gam(formula, family=gaussian(), data=list(), weights=NULL, subset=NULL,
        na.action, offset=NULL, method="GCV.Cp", optimizer=c("outer","newton"),
        control=gam.control(), scale=0, select=FALSE, knots=NULL, sp=NULL, min.sp=NULL,
        H=NULL, gamma=1, fit=TRUE, paraPen=NULL, G=NULL, in.out, ...)
with:
formula: a full R modeling formula, including smooth terms
family: a family object specifying the distribution and link function, as in glm
data: an optional data frame, list, or environment
weights: optional regression-style weights for each case
subset: an optional subset of the data to be used
na.action: the regular model treatment of missing data
offset: used to supply a model offset for use in fitting
control: control parameters, see gam.control

31 Full Syntax for gam
method: smoothing parameter estimation method. GCV.Cp uses GCV for an unknown scale parameter and Mallows' Cp/UBRE/AIC for a known scale. GACV.Cp is equivalent, but uses GACV in place of GCV. REML gives REML estimation, including of an unknown scale. P-REML gives REML estimation, but using a Pearson estimate of the scale. ML and P-ML are similar, but use maximum likelihood in place of REML.
optimizer: perf for performance iteration, outer for the more stable direct approach. outer can use several alternative optimizers, specified in the second element of optimizer: newton (default), bfgs, optim, nlm, and nlm.fd (slow).
scale: positive values give the scale parameter, negative values mean it is unknown, and zero means 1 for Poisson and binomial and unknown for other distributions.
select: if TRUE, then the fit can add an extra penalty to each term.
knots: list containing user-specified knot values (must match the k value supplied).
sp: smoothing parameter vector in the order that the smooth terms appear in the model formula; negative elements indicate that the parameter should be estimated.
min.sp: lower bounds for smoothing parameters.
H: user-supplied fixed quadratic penalty on the parameters, often used for ridge penalties.

32 Full Syntax for gam
gamma: multiplier to inflate the model d.f. in the GCV or UBRE/AIC score.
fit: if TRUE, then the model is fit; if FALSE, then the model is set up and an object G containing what would be required to fit is returned.
paraPen: optional list specifying any penalties to be applied to parametric model terms.
G: object returned by a previous call to gam with fit=FALSE.
in.out: optional list for initializing outer iteration.
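A few of these options can be compared side by side on a small example. The sketch assumes the mgcv package and reuses the trees data that ships with R; the particular settings are illustrative, not recommendations from the slides.

```r
library(mgcv)

f <- log(Volume) ~ s(Height) + s(Girth)

fit_gcv  <- gam(f, data = trees, method = "GCV.Cp")   # GCV for unknown scale
fit_reml <- gam(f, data = trees, method = "REML")     # REML smoothing parameter estimation
fit_g14  <- gam(f, data = trees, gamma = 1.4)         # inflate model d.f. in the GCV score
fit_sel  <- gam(f, data = trees, select = TRUE)       # extra penalty on each term

# Total effective degrees of freedom under each choice.
sapply(list(gcv = fit_gcv, reml = fit_reml, gamma = fit_g14, select = fit_sel),
       function(m) sum(m$edf))
```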

33 Terrorism Data Analysis
Source: The International Policy Institute for Counter-Terrorism, Herzlia, Israel.
Provided on an online database with details of attacks in Israel since September,
Subsetted by Mark Harrison to give 103 suicide attacks over a three-year period from November 6, 2000 to November 3, 2003, when there was a steep drop.
Information provided: date and place of the attack, attack type, the type of target and device employed, organizational affiliation of the attacker, and the number of casualties, along with a written description of the attack.
Casualties are given personal attributes such as name, age, sex, nationality, and religion.


Nonparametric Mixed-Effects Models for Longitudinal Data Nonparametric Mixed-Effects Models for Longitudinal Data Zhang Jin-Ting Dept of Stat & Appl Prob National University of Sinagpore University of Seoul, South Korea, 7 p.1/26 OUTLINE The Motivating Data

More information

Convexization in Markov Chain Monte Carlo

Convexization in Markov Chain Monte Carlo in Markov Chain Monte Carlo 1 IBM T. J. Watson Yorktown Heights, NY 2 Department of Aerospace Engineering Technion, Israel August 23, 2011 Problem Statement MCMC processes in general are governed by non

More information

Part A Statistical rationale for application of a Generalized Additive Mixed Model (GAMM)

Part A Statistical rationale for application of a Generalized Additive Mixed Model (GAMM) APPENDIX Part A Statistical rationale for application of a Generalized Additive Mixed Model (GAMM) Part B Layman introduction to GAMM Part C - References Part A. Statistical rationale for application of

More information

More advanced use of mgcv. Simon Wood Mathematical Sciences, University of Bath, U.K.

More advanced use of mgcv. Simon Wood Mathematical Sciences, University of Bath, U.K. More advanced use of mgcv Simon Wood Mathematical Sciences, University of Bath, U.K. Fine control of smoothness: gamma Suppose that we fit a model but a component is too wiggly. For GCV/AIC we can increase

More information

NONPARAMETRIC REGRESSION SPLINES FOR GENERALIZED LINEAR MODELS IN THE PRESENCE OF MEASUREMENT ERROR

NONPARAMETRIC REGRESSION SPLINES FOR GENERALIZED LINEAR MODELS IN THE PRESENCE OF MEASUREMENT ERROR NONPARAMETRIC REGRESSION SPLINES FOR GENERALIZED LINEAR MODELS IN THE PRESENCE OF MEASUREMENT ERROR J. D. Maca July 1, 1997 Abstract The purpose of this manual is to demonstrate the usage of software for

More information

TECHNICAL REPORT NO December 11, 2001

TECHNICAL REPORT NO December 11, 2001 DEPARTMENT OF STATISTICS University of Wisconsin 2 West Dayton St. Madison, WI 5376 TECHNICAL REPORT NO. 48 December, 2 Penalized Log Likelihood Density Estimation, via Smoothing-Spline ANOVA and rangacv

More information

Straightforward intermediate rank tensor product smoothing in mixed models

Straightforward intermediate rank tensor product smoothing in mixed models Straightforward intermediate rank tensor product smoothing in mixed models Simon N. Wood, Fabian Scheipl, Julian J. Faraway January 6, 2012 Abstract Tensor product smooths provide the natural way of representing

More information

Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing

Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Generalized Additive Model and Applications in Direct Marketing Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Abstract Logistic regression 1 has been widely used in direct marketing applications

More information

Edge and local feature detection - 2. Importance of edge detection in computer vision

Edge and local feature detection - 2. Importance of edge detection in computer vision Edge and local feature detection Gradient based edge detection Edge detection by function fitting Second derivative edge detectors Edge linking and the construction of the chain graph Edge and local feature

More information

Goals of the Lecture. SOC6078 Advanced Statistics: 9. Generalized Additive Models. Limitations of the Multiple Nonparametric Models (2)

Goals of the Lecture. SOC6078 Advanced Statistics: 9. Generalized Additive Models. Limitations of the Multiple Nonparametric Models (2) SOC6078 Advanced Statistics: 9. Generalized Additive Models Robert Andersen Department of Sociology University of Toronto Goals of the Lecture Introduce Additive Models Explain how they extend from simple

More information

Generative and discriminative classification techniques

Generative and discriminative classification techniques Generative and discriminative classification techniques Machine Learning and Category Representation 013-014 Jakob Verbeek, December 13+0, 013 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.13.14

More information

Data Analysis 3. Support Vector Machines. Jan Platoš October 30, 2017

Data Analysis 3. Support Vector Machines. Jan Platoš October 30, 2017 Data Analysis 3 Support Vector Machines Jan Platoš October 30, 2017 Department of Computer Science Faculty of Electrical Engineering and Computer Science VŠB - Technical University of Ostrava Table of

More information

Package bgeva. May 19, 2017

Package bgeva. May 19, 2017 Version 0.3-1 Package bgeva May 19, 2017 Author Giampiero Marra, Raffaella Calabrese and Silvia Angela Osmetti Maintainer Giampiero Marra Title Binary Generalized Extreme Value

More information

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2016

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2016 CPSC 340: Machine Learning and Data Mining Principal Component Analysis Fall 2016 A2/Midterm: Admin Grades/solutions will be posted after class. Assignment 4: Posted, due November 14. Extra office hours:

More information

Ludwig Fahrmeir Gerhard Tute. Statistical odelling Based on Generalized Linear Model. íecond Edition. . Springer

Ludwig Fahrmeir Gerhard Tute. Statistical odelling Based on Generalized Linear Model. íecond Edition. . Springer Ludwig Fahrmeir Gerhard Tute Statistical odelling Based on Generalized Linear Model íecond Edition. Springer Preface to the Second Edition Preface to the First Edition List of Examples List of Figures

More information

Machine Learning Techniques for Detecting Hierarchical Interactions in GLM s for Insurance Premiums

Machine Learning Techniques for Detecting Hierarchical Interactions in GLM s for Insurance Premiums Machine Learning Techniques for Detecting Hierarchical Interactions in GLM s for Insurance Premiums José Garrido Department of Mathematics and Statistics Concordia University, Montreal EAJ 2016 Lyon, September

More information

Chapter 5: Basis Expansion and Regularization

Chapter 5: Basis Expansion and Regularization Chapter 5: Basis Expansion and Regularization DD3364 April 1, 2012 Introduction Main idea Moving beyond linearity Augment the vector of inputs X with additional variables. These are transformations of

More information

Nonlinearity and Generalized Additive Models Lecture 2

Nonlinearity and Generalized Additive Models Lecture 2 University of Texas at Dallas, March 2007 Nonlinearity and Generalized Additive Models Lecture 2 Robert Andersen McMaster University http://socserv.mcmaster.ca/andersen Definition of a Smoother A smoother

More information

Quality assessment of data-based metamodels for multi-objective aeronautic design optimisation

Quality assessment of data-based metamodels for multi-objective aeronautic design optimisation Quality assessment of data-based metamodels for multi-objective aeronautic design optimisation Timur Topuz November 7 B W I vrije Universiteit Faculteit der Exacte Wetenschappen Studierichting Bedrijfswiskunde

More information

Curve fitting using linear models

Curve fitting using linear models Curve fitting using linear models Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark September 28, 2012 1 / 12 Outline for today linear models and basis functions polynomial regression

More information

Interpolation by Spline Functions

Interpolation by Spline Functions Interpolation by Spline Functions Com S 477/577 Sep 0 007 High-degree polynomials tend to have large oscillations which are not the characteristics of the original data. To yield smooth interpolating curves

More information

Approximate Smoothing Spline Methods for Large Data Sets in the Binary Case Dong Xiang, SAS Institute Inc. Grace Wahba, University of Wisconsin at Mad

Approximate Smoothing Spline Methods for Large Data Sets in the Binary Case Dong Xiang, SAS Institute Inc. Grace Wahba, University of Wisconsin at Mad DEPARTMENT OF STATISTICS University of Wisconsin 1210 West Dayton St. Madison, WI 53706 TECHNICAL REPORT NO. 982 September 30, 1997 Approximate Smoothing Spline Methods for Large Data Sets in the Binary

More information

Regularization and Markov Random Fields (MRF) CS 664 Spring 2008

Regularization and Markov Random Fields (MRF) CS 664 Spring 2008 Regularization and Markov Random Fields (MRF) CS 664 Spring 2008 Regularization in Low Level Vision Low level vision problems concerned with estimating some quantity at each pixel Visual motion (u(x,y),v(x,y))

More information

Additive hedonic regression models for the Austrian housing market ERES Conference, Edinburgh, June

Additive hedonic regression models for the Austrian housing market ERES Conference, Edinburgh, June for the Austrian housing market, June 14 2012 Ao. Univ. Prof. Dr. Fachbereich Stadt- und Regionalforschung Technische Universität Wien Dr. Strategic Risk Management Bank Austria UniCredit, Wien Inhalt

More information

ME 261: Numerical Analysis Lecture-12: Numerical Interpolation

ME 261: Numerical Analysis Lecture-12: Numerical Interpolation 1 ME 261: Numerical Analysis Lecture-12: Numerical Interpolation Md. Tanver Hossain Department of Mechanical Engineering, BUET http://tantusher.buet.ac.bd 2 Inverse Interpolation Problem : Given a table

More information

Discussion Notes 3 Stepwise Regression and Model Selection

Discussion Notes 3 Stepwise Regression and Model Selection Discussion Notes 3 Stepwise Regression and Model Selection Stepwise Regression There are many different commands for doing stepwise regression. Here we introduce the command step. There are many arguments

More information

GAM: The Predictive Modeling Silver Bullet

GAM: The Predictive Modeling Silver Bullet GAM: The Predictive Modeling Silver Bullet Author: Kim Larsen Introduction Imagine that you step into a room of data scientists; the dress code is casual and the scent of strong coffee is hanging in the

More information

Bayes Estimators & Ridge Regression

Bayes Estimators & Ridge Regression Bayes Estimators & Ridge Regression Readings ISLR 6 STA 521 Duke University Merlise Clyde October 27, 2017 Model Assume that we have centered (as before) and rescaled X o (original X) so that X j = X o

More information

Polynomials tend to oscillate (wiggle) a lot, even when our true function does not.

Polynomials tend to oscillate (wiggle) a lot, even when our true function does not. AMSC/CMSC 460 Computational Methods, Fall 2007 UNIT 2: Spline Approximations Dianne P O Leary c 2001, 2002, 2007 Piecewise polynomial interpolation Piecewise polynomial interpolation Read: Chapter 3 Skip:

More information

Hierarchical generalized additive models: an introduction with mgcv

Hierarchical generalized additive models: an introduction with mgcv Hierarchical generalized additive models: an introduction with mgcv Eric J Pedersen Corresp., 1, 2, David L. Miller 3, 4, Gavin L. Simpson 5, Noam Ross 6 1 Northwest Atlantic Fisheries Center, Fisheries

More information

Smoothing non-stationary noise of the Nigerian Stock Exchange All-Share Index data using variable coefficient functions

Smoothing non-stationary noise of the Nigerian Stock Exchange All-Share Index data using variable coefficient functions Smoothing non-stationary noise of the Nigerian Stock Exchange All-Share Index data using variable coefficient functions 1 Alabi Nurudeen Olawale, 2 Are Stephen Olusegun 1 Department of Mathematics and

More information

APPM/MATH Problem Set 4 Solutions

APPM/MATH Problem Set 4 Solutions APPM/MATH 465 Problem Set 4 Solutions This assignment is due by 4pm on Wednesday, October 16th. You may either turn it in to me in class on Monday or in the box outside my office door (ECOT 35). Minimal

More information

A Method for Comparing Multiple Regression Models

A Method for Comparing Multiple Regression Models CSIS Discussion Paper No. 141 A Method for Comparing Multiple Regression Models Yuki Hiruta Yasushi Asami Department of Urban Engineering, the University of Tokyo e-mail: hiruta@ua.t.u-tokyo.ac.jp asami@csis.u-tokyo.ac.jp

More information

DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R

DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R Lee, Rönnegård & Noh LRN@du.se Lee, Rönnegård & Noh HGLM book 1 / 24 Overview 1 Background to the book 2 Crack growth example 3 Contents

More information

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University DS 4400 Machine Learning and Data Mining I Alina Oprea Associate Professor, CCIS Northeastern University September 20 2018 Review Solution for multiple linear regression can be computed in closed form

More information

Lecture 27, April 24, Reading: See class website. Nonparametric regression and kernel smoothing. Structured sparse additive models (GroupSpAM)

Lecture 27, April 24, Reading: See class website. Nonparametric regression and kernel smoothing. Structured sparse additive models (GroupSpAM) School of Computer Science Probabilistic Graphical Models Structured Sparse Additive Models Junming Yin and Eric Xing Lecture 7, April 4, 013 Reading: See class website 1 Outline Nonparametric regression

More information

February 2017 (1/20) 2 Piecewise Polynomial Interpolation 2.2 (Natural) Cubic Splines. MA378/531 Numerical Analysis II ( NA2 )

February 2017 (1/20) 2 Piecewise Polynomial Interpolation 2.2 (Natural) Cubic Splines. MA378/531 Numerical Analysis II ( NA2 ) f f f f f (/2).9.8.7.6.5.4.3.2. S Knots.7.6.5.4.3.2. 5 5.2.8.6.4.2 S Knots.2 5 5.9.8.7.6.5.4.3.2..9.8.7.6.5.4.3.2. S Knots 5 5 S Knots 5 5 5 5.35.3.25.2.5..5 5 5.6.5.4.3.2. 5 5 4 x 3 3.5 3 2.5 2.5.5 5

More information

Predictive Checking. Readings GH Chapter 6-8. February 8, 2017

Predictive Checking. Readings GH Chapter 6-8. February 8, 2017 Predictive Checking Readings GH Chapter 6-8 February 8, 2017 Model Choice and Model Checking 2 Questions: 1. Is my Model good enough? (no alternative models in mind) 2. Which Model is best? (comparison

More information

Notes on Simulations in SAS Studio

Notes on Simulations in SAS Studio Notes on Simulations in SAS Studio If you are not careful about simulations in SAS Studio, you can run into problems. In particular, SAS Studio has a limited amount of memory that you can use to write

More information

CPSC 340: Machine Learning and Data Mining

CPSC 340: Machine Learning and Data Mining CPSC 340: Machine Learning and Data Mining Feature Selection Original version of these slides by Mark Schmidt, with modifications by Mike Gelbart. Admin Assignment 3: Due Friday Midterm: Feb 14 in class

More information

9.2 User s Guide SAS/STAT. The GAM Procedure. (Book Excerpt) SAS Documentation

9.2 User s Guide SAS/STAT. The GAM Procedure. (Book Excerpt) SAS Documentation SAS/STAT 9.2 User s Guide The GAM Procedure (Book Excerpt) SAS Documentation This document is an individual chapter from SAS/STAT 9.2 User s Guide. The correct bibliographic citation for the complete manual

More information

Machine Learning for Signal Processing Lecture 4: Optimization

Machine Learning for Signal Processing Lecture 4: Optimization Machine Learning for Signal Processing Lecture 4: Optimization 13 Sep 2015 Instructor: Bhiksha Raj (slides largely by Najim Dehak, JHU) 11-755/18-797 1 Index 1. The problem of optimization 2. Direct optimization

More information

Instance-based Learning

Instance-based Learning Instance-based Learning Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University February 19 th, 2007 2005-2007 Carlos Guestrin 1 Why not just use Linear Regression? 2005-2007 Carlos Guestrin

More information

Model selection and validation 1: Cross-validation

Model selection and validation 1: Cross-validation Model selection and validation 1: Cross-validation Ryan Tibshirani Data Mining: 36-462/36-662 March 26 2013 Optional reading: ISL 2.2, 5.1, ESL 7.4, 7.10 1 Reminder: modern regression techniques Over the

More information

Machine Learning / Jan 27, 2010

Machine Learning / Jan 27, 2010 Revisiting Logistic Regression & Naïve Bayes Aarti Singh Machine Learning 10-701/15-781 Jan 27, 2010 Generative and Discriminative Classifiers Training classifiers involves learning a mapping f: X -> Y,

More information

CSE446: Linear Regression. Spring 2017

CSE446: Linear Regression. Spring 2017 CSE446: Linear Regression Spring 2017 Ali Farhadi Slides adapted from Carlos Guestrin and Luke Zettlemoyer Prediction of continuous variables Billionaire says: Wait, that s not what I meant! You say: Chill

More information

Overview of Clustering

Overview of Clustering based on Loïc Cerfs slides (UFMG) April 2017 UCBL LIRIS DM2L Example of applicative problem Student profiles Given the marks received by students for different courses, how to group the students so that

More information

Package mtsdi. January 23, 2018

Package mtsdi. January 23, 2018 Version 0.3.5 Date 2018-01-02 Package mtsdi January 23, 2018 Author Washington Junger and Antonio Ponce de Leon Maintainer Washington Junger

More information

Bernstein-Bezier Splines on the Unit Sphere. Victoria Baramidze. Department of Mathematics. Western Illinois University

Bernstein-Bezier Splines on the Unit Sphere. Victoria Baramidze. Department of Mathematics. Western Illinois University Bernstein-Bezier Splines on the Unit Sphere Victoria Baramidze Department of Mathematics Western Illinois University ABSTRACT I will introduce scattered data fitting problems on the sphere and discuss

More information

Organizing data in R. Fitting Mixed-Effects Models Using the lme4 Package in R. R packages. Accessing documentation. The Dyestuff data set

Organizing data in R. Fitting Mixed-Effects Models Using the lme4 Package in R. R packages. Accessing documentation. The Dyestuff data set Fitting Mixed-Effects Models Using the lme4 Package in R Deepayan Sarkar Fred Hutchinson Cancer Research Center 18 September 2008 Organizing data in R Standard rectangular data sets (columns are variables,

More information

A Random Variable Shape Parameter Strategy for Radial Basis Function Approximation Methods

A Random Variable Shape Parameter Strategy for Radial Basis Function Approximation Methods A Random Variable Shape Parameter Strategy for Radial Basis Function Approximation Methods Scott A. Sarra, Derek Sturgill Marshall University, Department of Mathematics, One John Marshall Drive, Huntington

More information

Machine Learning. Topic 4: Linear Regression Models

Machine Learning. Topic 4: Linear Regression Models Machine Learning Topic 4: Linear Regression Models (contains ideas and a few images from wikipedia and books by Alpaydin, Duda/Hart/ Stork, and Bishop. Updated Fall 205) Regression Learning Task There

More information

Assessing the Quality of the Natural Cubic Spline Approximation

Assessing the Quality of the Natural Cubic Spline Approximation Assessing the Quality of the Natural Cubic Spline Approximation AHMET SEZER ANADOLU UNIVERSITY Department of Statisticss Yunus Emre Kampusu Eskisehir TURKEY ahsst12@yahoo.com Abstract: In large samples,

More information

Instance-Based Learning: Nearest neighbor and kernel regression and classificiation

Instance-Based Learning: Nearest neighbor and kernel regression and classificiation Instance-Based Learning: Nearest neighbor and kernel regression and classificiation Emily Fox University of Washington February 3, 2017 Simplest approach: Nearest neighbor regression 1 Fit locally to each

More information

CS-184: Computer Graphics

CS-184: Computer Graphics CS-184: Computer Graphics Lecture #12: Curves and Surfaces Prof. James O Brien University of California, Berkeley V2007-F-12-1.0 Today General curve and surface representations Splines and other polynomial

More information