Applied Statistics : Practical 9
- Marianna Lyons
This practical explores nonparametric regression and shows how to fit a simple additive model. The first item introduces the necessary R commands for nonparametric regression using a simulated example. The second item uses the cars dataset in R, while data for the third and fourth items are available on the course website.

1. A simple simulated example

We consider first a simple simulated example that we will use to compare different methods of fitting a regression function.

set.seed(1122)                        # so we all get the same 'random' data
x<-seq(0,1,len=100)                   # equally spaced grid of predictor values
true.func<-sin(2*pi*x)+2*x            # defines the true (unknown) function f
y<-true.func+rnorm(length(x),0,0.3)   # add random i.i.d. errors
plot(x,y)                             # plot the observed data
points(x,true.func,type='l',lwd=2)    # superimpose the true function

You can later try to change the amount of noise in the data (the standard deviation in the rnorm function) or the period of the sinusoidal function to see how this impacts the estimation.

Let us now try to estimate the regression function using a nearest neighbors estimator. We need to install and load the R package FNN, which contains a function to find the nearest neighbors in a dataset.
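Before turning to FNN, it may help to see the estimator written out directly: for each point, average the responses of the K closest predictor values. The base-R sketch below does exactly that (the helper name knn.base and the choice K = 5 are illustrative, not part of the practical):

```r
# Sketch of a kNN regression estimate in base R, without the FNN package.
# The helper name knn.base and K = 5 are illustrative choices.
set.seed(1122)
x <- seq(0, 1, len = 100)
true.func <- sin(2*pi*x) + 2*x
y <- true.func + rnorm(length(x), 0, 0.3)

knn.base <- function(x, y, K = 5) {
  sapply(seq_along(x), function(i) {
    d  <- abs(x - x[i])          # distances from x[i] to all points
    nb <- order(d)[2:(K + 1)]    # K nearest neighbors, excluding x[i] itself
    mean(y[nb])                  # average their responses
  })
}

fhat <- knn.base(x, y, K = 5)
```

This mirrors what get.knn from FNN computes below, only more slowly; FNN relies on fast search structures, which matters for large datasets.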
install.packages("FNN")
library(FNN)

and now we can define a function for the estimator:

fkNN<-function(x,y,K=5){                 # this is a function definition
  fx<-rep(NA,length(x))
  index<-get.knn(data=x,k=K)$nn.index    # finds the K nearest neighbors of every x
  for (i in 1:length(x)){
    fx[i]<-mean(y[index[i,]])            # averages the responses of the K nearest neighbors
  }
  fx                                     # returns the estimated f
}

We can use the function above to get the nearest neighbors estimate:

NN_fhat<-fkNN(x,y,K=5)
points(x,NN_fhat,type='l',lwd=2)

Try with different values of K and choose the best one by visual inspection. The visual inspection suggests a number of neighbors between 10 and 15.

Consider now the kernel smoothing estimator, which is available in R with the command ksmooth:

KN_fhat<-ksmooth(x,y,kernel="normal",bandwidth=0.5)
points(KN_fhat$x,KN_fhat$y,type='l',lwd=2)

Try using different values for the bandwidth h: which one provides the best fit? Remember that smaller values of the bandwidth lead to high variation in the fitted curve, while larger values of h give a smoother curve. The visual inspection suggests a bandwidth around 0.15.

You can look at the help of the ksmooth function to learn which other kernel functions (in addition to the Gaussian one) are available. Does the change of kernel impact the curve estimate? The only other available kernel is the box function 1_[x-h/2, x+h/2], which provides a less smooth estimate, whose irregularities can be detected by the human eye.

To apply a local polynomial smoother to the same data:

LP_fhat<-loess(y~x,span=2,degree=2)
points(x,LP_fhat$fitted,type='l',lwd=2)

Here you have two parameters that control the fit: the degree of the local polynomial (set by the option degree) and the bandwidth (option span). Try different values for these two parameters (check the help for admissible values): which ones provide the best fit?
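The span comparison for loess can be made systematic by looping over a small grid of candidate spans and recording the in-sample residual sum of squares (a rough sketch; the span grid is an arbitrary choice, and in-sample RSS always favors the smallest span, so the final choice should still be made visually):

```r
# Sketch: refit loess over a grid of span values and compare in-sample RSS.
# The grid of spans is an illustrative choice, not from the practical.
set.seed(1122)
x <- seq(0, 1, len = 100)
y <- sin(2*pi*x) + 2*x + rnorm(length(x), 0, 0.3)

spans <- c(0.2, 0.35, 0.5, 0.75)
rss <- sapply(spans, function(s) {
  fit <- loess(y ~ x, span = s, degree = 2)
  sum(residuals(fit)^2)     # smaller span -> wigglier fit -> smaller in-sample RSS
})
names(rss) <- spans
print(rss)
```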
Visual inspection suggests a bandwidth of 0.5 for a second-degree polynomial, or a bandwidth between 0.25 and 0.3 for a first-degree polynomial.

The smoothing splines estimator is implemented in R in the function smooth.spline. To control the smoothing, you can either specify the smoothing parameter λ, via the option spar in the function (where λ is a monotone function of spar, see the help for more details)

SS_fhat<-smooth.spline(x,y,spar=0.5)
points(x,SS_fhat$y,type='l',lwd=2)

or you can specify the effective degrees of freedom of the fitted curve.

SS_fhat<-smooth.spline(x,y,df=3)
points(x,SS_fhat$y,type='l',lwd=2)

You can retrieve the parameters λ and spar and the effective degrees of freedom associated with a fitted model with

SS_fhat$lambda
SS_fhat$df
SS_fhat$spar

Choose visually an appropriate fit and report your choice of λ and the corresponding effective degrees of freedom. A reasonable fit is obtained with spar=0.7, corresponding to λ = … and 8.4 effective degrees of freedom.

An automatic choice of the smoothing parameter can be obtained using the option cv=TRUE.

SS_fhat<-smooth.spline(x,y,cv=TRUE)
points(x,SS_fhat$y,type='l',lwd=2,col=3)

What are the values of λ and the effective degrees of freedom selected by cross-validation? The cross-validation method selects λ = … and 6.57 effective degrees of freedom, not too far from what could be expected from visual inspection. If we set cv=FALSE (without selecting the parameter ourselves), the function chooses the smoothing parameter via generalized cross-validation, a modified version of the cross-validation error.

Now superimpose in the same plot the fitted curves obtained from the various nonparametric estimators, with your choice for the best smoothing parameters. Which one performs best in this case?

NN_fhat<-fkNN(x,y,K=13)
points(x,NN_fhat,type='l',lwd=2)
KN_fhat<-ksmooth(x,y,kernel="normal",bandwidth=0.15)
points(KN_fhat$x,KN_fhat$y,type='l',lwd=2,col=2)
LP_fhat<-loess(y~x,span=0.5,degree=2)
points(x,LP_fhat$fitted,type='l',lwd=2,col=3)
SS_fhat<-smooth.spline(x,y,cv=TRUE)
points(x,SS_fhat$y,type='l',lwd=2,col=4)

[Plot: the observed data with the four fitted curves superimposed.]

Local polynomial and smoothing splines perform best.

Regression (cubic) splines can be implemented using the package splines and specifying either the knots or the effective degrees of freedom (in this case equispaced knots are assumed). The function ns builds the matrix G we have seen in the lecture notes; then we can simply use lm to fit the model.

library(splines)
reg_mod<-lm(y~ns(x,df=8))
points(x,reg_mod$fitted.values,type='l',lwd=2)

Imagine now you are confronted with a more complicated function which contains some local feature you want to describe in your model:

set.seed(1122)                        # so we all get the same 'random' data
x<-seq(0,1,len=100)                   # equally spaced grid of predictor values
true.func<-sin(2*pi*x)+2*x            # defines the true (unknown) function f
true.func[71:80]<-3*sin(2*seq(0,pi,len=10))
y<-true.func+rnorm(length(x),0,0.3)   # add random i.i.d. errors
plot(x,y)                             # plot the observed data
points(x,true.func,type='l',lwd=2)    # superimpose the true function
reg_mod<-lm(y~ns(x,df=8))
points(x,reg_mod$fitted.values,type='l',lwd=2)

As you can see, we are completely missing the local feature in the fit. We would therefore like to place additional knots between x = 0.7 and x = 0.8.

knots<-c(0.2,0.4,0.6,0.7,0.72,0.74,0.75,0.76,0.77,0.8)
reg_mod<-lm(y~ns(x,knots=knots))
points(x,reg_mod$fitted.values,type='l',lwd=2)

Try to fit this curve using a smoothing spline. What happens? To get an accurate fit for the local feature, we are forced to choose λ small enough that in the rest of the domain we get a very irregular curve.

2. Cars data

We have seen in the first part of the course the dataset cars, which is provided by R and contains speeds and stopping distances for a set of cars. The aim was to predict the stopping distance (dist) from the speed (speed). However, neither linear models nor generalized linear models provided a completely satisfactory fit for the data. We now try fitting a nonparametric regression.

data(cars)
attach(cars)
plot(speed,dist)

Fit a regression function using a smoothing spline, choosing the smoothing parameter by generalized cross-validation. What is an appropriate choice of λ in this case? What are the effective degrees of freedom of the selected model? λ = … and the effective degrees of freedom are …

cars_fit<-smooth.spline(speed,dist,cv=FALSE)
cars_fit$df
## [1] …
cars_fit$lambda
## [1] …
plot(speed,dist)
points(cars_fit$x,cars_fit$y,type='l')
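The generalized cross-validation criterion that smooth.spline minimizes when cv=FALSE can also be inspected directly: the fitted object stores it in the cv.crit component, so one can trace it over a grid of effective degrees of freedom (a sketch; the df grid is an arbitrary choice, not from the practical):

```r
# Sketch: trace the GCV criterion for the cars data over a grid of
# effective degrees of freedom (the grid itself is an arbitrary choice).
data(cars)
dfs <- 2:10
gcv <- sapply(dfs, function(d)
  smooth.spline(cars$speed, cars$dist, df = d, cv = FALSE)$cv.crit)
cbind(df = dfs, GCV = gcv)   # the df minimizing GCV is close to the automatic fit
```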
[Plot: stopping distance against speed with the fitted smoothing spline.]

Compare the fitted curve with what you would obtain from a linear or quadratic parametric model.

cars_mod<-lm(dist~speed)
cars_mod2<-lm(dist~speed+I(speed^2))
plot(speed,dist)
points(speed,cars_mod$fitted.values,type='l',lwd=2,col=1)
points(speed,cars_mod2$fitted.values,type='l',lwd=2,col=2)
points(cars_fit$x,cars_fit$y,type='l',lwd=2,col=4)
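Note that the fitted smoothing spline is only returned at the observed (unique) speeds. To draw it as a genuinely smooth curve, or to read off predictions at new speeds, predict can evaluate it on a finer grid (a sketch; the 200-point grid is our choice):

```r
# Sketch: evaluate the fitted smoothing spline on a fine grid with predict.
data(cars)
cars_fit <- smooth.spline(cars$speed, cars$dist, cv = FALSE)
grid <- seq(min(cars$speed), max(cars$speed), length.out = 200)
pred <- predict(cars_fit, grid)        # list with components $x and $y
plot(cars$speed, cars$dist)
lines(pred$x, pred$y, lwd = 2, col = 4)
```

The same predict mechanism is what the signature example below relies on to locate the acceleration peak between observed time points.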
[Plot: the linear, quadratic and smoothing-spline fits superimposed on the data.]

As suggested by the number of effective degrees of freedom, the nonparametric model selects an intermediate choice between a linear and a quadratic model.

3. Signature acceleration data

In a neurophysiological study, researchers put an accelerometer on the index finger of participants who were asked to write their signature. The researchers first need to estimate the acceleration as a function of time for each participant. The file signature.txt on the course website contains the data for the first participant of the experiment.

data<-read.table("signature.txt",header=TRUE)
attach(data)
plot(time,acceleration)

Looking at the plot, which method should be preferred among the ones we have considered in this practical? Why? The prominent presence of a localized feature around 0.75 seconds suggests the use of regression splines.

Fit the nonparametric regression curve and find the estimated value and position of the acceleration peak. We fit the curve using regression splines, choosing the knots so that they are dense around the localized feature at time 0.75. We realize that an additional knot is also needed at the beginning to catch the rapid increase of the acceleration. If we use the predict command to evaluate the function on a finer grid, we find that the peak is at … seconds and its estimated value is …
knots<-c(0.05,0.1,0.4,0.6,0.7,0.72,0.74,0.75,0.76,0.77,0.8)
sig_mod<-lm(acceleration~ns(time,knots=knots))
plot(time,acceleration)
points(time,sig_mod$fitted.values,type='l',lwd=2)

[Plot: acceleration against time with the fitted regression spline.]

4. A simple additive model

We see now how to fit a simple additive model. The file ozone_data.txt contains 330 observations of the concentration of ozone, a measure of the pressure gradient and the day of the year on which the measurement was taken. The aim is to fit an additive model for the concentration of ozone. You may need to install the package gam first.

library(gam)
data<-read.table("ozone_data.txt",header=TRUE)
attach(data)
ozone_model<-gam(ozone~0+lo(pressure_grad,span=0.5,degree=2)
                 +lo(day,span=0.5,degree=2))

The lo function specifies that the predictor has to be smoothed with a local regression with the chosen degree and bandwidth. The function gam then fits the additive model using the backfitting algorithm.

What is the algebraic form of the model? Let Y_i be the concentration of ozone for the i-th observation, P_i the corresponding pressure gradient and D_i the day. The algebraic form of the model is

Y_i = f_1(P_i) + f_2(D_i) + ε_i,
where the ε_i are i.i.d. errors with zero mean and variance σ^2, and f_1 and f_2 are unknown regression functions.

If we had used the formula ozone ~ lo(pressure_grad,span=0.5,degree=2)+..., R would have included an intercept in the model. What would its algebraic form have been in this case?

Y_i = β_0 + f_1(P_i) + f_2(D_i) + ε_i.

We can now evaluate the fit by plotting the estimated regression functions and the marginal residuals (the difference between the observations and the fit of the other regression function):

plot(ozone_model,residuals=TRUE)

Is the smoothing appropriate? Try changing the bandwidth in the lo function. The smoothing appears reasonable; it may be possible to slightly reduce the bandwidth for the pressure gradient.

Alternatively, it is possible to use smoothing splines:

ozone_spl<-gam(ozone~s(pressure_grad,df=3)+s(day,df=3))
# or
ozone_sp2<-gam(ozone~s(pressure_grad,spar=0.5)+s(day,spar=0.5))
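The backfitting algorithm used by gam can be sketched by hand: cycle through the predictors, smoothing the partial residuals of each term against its covariate, until the fits stabilize. The sketch below uses smooth.spline as the smoother on simulated data (all variable names and the df = 5 choice are ours; gam's internal details differ):

```r
# Sketch of backfitting for a two-term additive model, using
# smooth.spline as the smoother (simulated data; illustrative only).
set.seed(1)
n <- 200
p <- runif(n)                          # stands in for the pressure gradient
d <- runif(n)                          # stands in for the day
y <- sin(2*pi*p) + d^2 + rnorm(n, 0, 0.2)

alpha <- mean(y)                       # intercept estimate
f1 <- f2 <- rep(0, n)
for (it in 1:20) {
  f1 <- predict(smooth.spline(p, y - alpha - f2, df = 5), p)$y
  f1 <- f1 - mean(f1)                  # center each term for identifiability
  f2 <- predict(smooth.spline(d, y - alpha - f1, df = 5), d)$y
  f2 <- f2 - mean(f2)
}
res <- y - alpha - f1 - f2             # residuals of the additive fit
```

After a few sweeps f1 and f2 stop changing appreciably, which is the convergence criterion backfitting relies on in practice.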
More information1D Regression. i.i.d. with mean 0. Univariate Linear Regression: fit by least squares. Minimize: to get. The set of all possible functions is...
1D Regression i.i.d. with mean 0. Univariate Linear Regression: fit by least squares. Minimize: to get. The set of all possible functions is... 1 Non-linear problems What if the underlying function is
More informationInterpolation - 2D mapping Tutorial 1: triangulation
Tutorial 1: triangulation Measurements (Zk) at irregular points (xk, yk) Ex: CTD stations, mooring, etc... The known Data How to compute some values on the regular spaced grid points (+)? The unknown data
More informationKnowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Basis Functions Tom Kelsey School of Computer Science University of St Andrews http://www.cs.st-andrews.ac.uk/~tom/ tom@cs.st-andrews.ac.uk Tom Kelsey ID5059-02-BF 2015-02-04
More informationPackage SiZer. February 19, 2015
Version 0.1-4 Date 2011-3-21 Title SiZer: Significant Zero Crossings Package SiZer February 19, 2015 Author Derek Sonderegger Maintainer Derek Sonderegger
More informationCPSC 340: Machine Learning and Data Mining. More Regularization Fall 2017
CPSC 340: Machine Learning and Data Mining More Regularization Fall 2017 Assignment 3: Admin Out soon, due Friday of next week. Midterm: You can view your exam during instructor office hours or after class
More informationNonparametric Mixed-Effects Models for Longitudinal Data
Nonparametric Mixed-Effects Models for Longitudinal Data Zhang Jin-Ting Dept of Stat & Appl Prob National University of Sinagpore University of Seoul, South Korea, 7 p.1/26 OUTLINE The Motivating Data
More informationInstance-based Learning
Instance-based Learning Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University October 15 th, 2007 2005-2007 Carlos Guestrin 1 1-Nearest Neighbor Four things make a memory based learner:
More informationME 261: Numerical Analysis Lecture-12: Numerical Interpolation
1 ME 261: Numerical Analysis Lecture-12: Numerical Interpolation Md. Tanver Hossain Department of Mechanical Engineering, BUET http://tantusher.buet.ac.bd 2 Inverse Interpolation Problem : Given a table
More informationWatershed Sciences 4930 & 6920 GEOGRAPHIC INFORMATION SYSTEMS
HOUSEKEEPING Watershed Sciences 4930 & 6920 GEOGRAPHIC INFORMATION SYSTEMS Quizzes Lab 8? WEEK EIGHT Lecture INTERPOLATION & SPATIAL ESTIMATION Joe Wheaton READING FOR TODAY WHAT CAN WE COLLECT AT POINTS?
More informationStatistics & Analysis. A Comparison of PDLREG and GAM Procedures in Measuring Dynamic Effects
A Comparison of PDLREG and GAM Procedures in Measuring Dynamic Effects Patralekha Bhattacharya Thinkalytics The PDLREG procedure in SAS is used to fit a finite distributed lagged model to time series data
More informationCPSC 695. Methods for interpolation and analysis of continuing surfaces in GIS Dr. M. Gavrilova
CPSC 695 Methods for interpolation and analysis of continuing surfaces in GIS Dr. M. Gavrilova Overview Data sampling for continuous surfaces Interpolation methods Global interpolation Local interpolation
More informationCS 450 Numerical Analysis. Chapter 7: Interpolation
Lecture slides based on the textbook Scientific Computing: An Introductory Survey by Michael T. Heath, copyright c 2018 by the Society for Industrial and Applied Mathematics. http://www.siam.org/books/cl80
More information( ) = Y ˆ. Calibration Definition A model is calibrated if its predictions are right on average: ave(response Predicted value) = Predicted value.
Calibration OVERVIEW... 2 INTRODUCTION... 2 CALIBRATION... 3 ANOTHER REASON FOR CALIBRATION... 4 CHECKING THE CALIBRATION OF A REGRESSION... 5 CALIBRATION IN SIMPLE REGRESSION (DISPLAY.JMP)... 5 TESTING
More information1 StatLearn Practical exercise 5
1 StatLearn Practical exercise 5 Exercise 1.1. Download the LA ozone data set from the book homepage. We will be regressing the cube root of the ozone concentration on the other variables. Divide the data
More informationComparison of Linear Regression with K-Nearest Neighbors
Comparison of Linear Regression with K-Nearest Neighbors Rebecca C. Steorts, Duke University STA 325, Chapter 3.5 ISL Agenda Intro to KNN Comparison of KNN and Linear Regression K-Nearest Neighbors vs
More informationNonparametric Regression
1 Nonparametric Regression Given data of the form (x 1, y 1 ), (x 2, y 2 ),..., (x n, y n ), we seek an estimate of the regression function g(x) satisfying the model y = g(x) + ε where the noise term satisfies
More informationSpline Models. Introduction to CS and NCS. Regression splines. Smoothing splines
Spline Models Introduction to CS and NCS Regression splines Smoothing splines 3 Cubic Splines a knots: a< 1 < 2 < < m
More information99 International Journal of Engineering, Science and Mathematics
Journal Homepage: Applications of cubic splines in the numerical solution of polynomials Najmuddin Ahmad 1 and Khan Farah Deeba 2 Department of Mathematics Integral University Lucknow Abstract: In this
More informationWhat is machine learning?
Machine learning, pattern recognition and statistical data modelling Lecture 12. The last lecture Coryn Bailer-Jones 1 What is machine learning? Data description and interpretation finding simpler relationship
More informationStatistics & Analysis. Fitting Generalized Additive Models with the GAM Procedure in SAS 9.2
Fitting Generalized Additive Models with the GAM Procedure in SAS 9.2 Weijie Cai, SAS Institute Inc., Cary NC July 1, 2008 ABSTRACT Generalized additive models are useful in finding predictor-response
More information