Package robustgam. February 20, 2015


Package: robustgam
Type: Package
Title: Robust Estimation for Generalized Additive Models
Version: 0.1.7
Date: 2013-5-7
Author: Raymond K. W. Wong, Fang Yao and Thomas C. M. Lee
Maintainer: Raymond K. W. Wong <raymondkww.dev@gmail.com>
Description: This package provides robust estimation for generalized additive models. It implements a fast and stable algorithm in Wong, Yao and Lee (2013). The implementation also contains three automatic selection methods for smoothing parameter. They are designed to be robust to outliers. For more details, see Wong, Yao and Lee (2013).
License: GPL (>= 2)
Depends: Rcpp (>= 0.9.13), RcppArmadillo (>= 0.3.4.4), mgcv (>= 1.7-20), robustbase (>= 0.9-3)
LinkingTo: Rcpp, RcppArmadillo
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-01-02 19:18:27

R topics documented:

    robustgam-package
    cplot.robustgam
    pred.robustgam
    robustgam
    robustgam.cv
    robustgam.gic
    robustgam.gic.optim
    Index

robustgam-package    Robust Estimation for Generalized Additive Models

Description

This package provides robust estimation for generalized additive models. It implements a fast and stable algorithm in Wong, Yao and Lee (2013). The implementation also contains three automatic selection methods for the smoothing parameter. They are designed to be robust to outliers. For more details, see Wong, Yao and Lee (2013).

Details

Package:  robustgam
Type:     Package
Version:  0.1.5
Date:     2013-1-6
License:  GPL (>= 2)

Author(s)

Raymond K. W. Wong
Maintainer: Raymond K. W. Wong <raymondkww.dev@gmail.com>

References

Raymond K. W. Wong, Fang Yao and Thomas C. M. Lee (2013) Robust Estimation for Generalized Additive Models. Journal of Computational and Graphical Statistics, to appear.

Examples

    # see example in function robustgam


cplot.robustgam    Plot of Component Smooth Functions from robustgam fit

Description

This function plots the component smooth functions.

Usage

    cplot.robustgam(fit, ranges, len=100)

Arguments

fit       the robustgam fit
ranges    matrix with each column containing the range of the predictor in each plot
len       grid size for the plot; it is used for all component smooth functions

Author(s)

Raymond K. W. Wong <raymondkww.dev@gmail.com>

References

Raymond K. W. Wong, Fang Yao and Thomas C. M. Lee (2013) Robust Estimation for Generalized Additive Models. Journal of Computational and Graphical Statistics, to appear.

See Also

robustgam.gic, robustgam.gic.optim, robustgam.cv, pred.robustgam

Examples

    # load library
    library(robustgam)

    # test function
    test.fun <- function(x, ...) {
      return(2*sin(2*pi*(1-x)^2))
    }

    # some setting
    set.seed(1234)
    true.family <- poisson()
    out.prop <- 0.05
    n <- 100

    # generating dataset for poisson case
    x <- runif(n)
    x <- x[order(x)]
    true.eta <- test.fun(x)
    true.mu <- true.family$linkinv(test.fun(x))
    y <- rpois(n, true.mu) # for poisson case

    # create outlier for poisson case
    out.n <- trunc(n*out.prop)
    out.list <- sample(1:n, out.n, replace=FALSE)
    y[out.list] <- round(y[out.list]*runif(out.n,min=3,max=5)^(sample(c(-1,1),out.n,TRUE)))

    # robust GAM fit
    robustfit <- robustgam(x, y, family=true.family, p=3, c=1.6, sp=0.000143514,
                           show.msg=FALSE, smooth.basis="tp")
    ## the smoothing parameter is selected by the RBIC, the command is:
    # robustfit.gic <- robustgam.gic.optim(x, y, family=true.family, p=3, c=1.6, show.msg=FALSE,
    #   count.lim=400, smooth.basis="tp", lsp.initial=log(1e-2), lsp.min=-15, lsp.max=10,
    #   gic.constant=log(n), method="L-BFGS-B"); robustfit <- robustfit.gic$optim.fit

    # ordinary GAM fit
    nonrobustfit <- gam(y~s(x, bs="tp", m=3), family=true.family) # m = p for tp

    # prediction
    x.new <- seq(range(x)[1], range(x)[2], len=1000)
    robustfit.new <- pred.robustgam(robustfit, data.frame(X=x.new))$predict.values
    nonrobustfit.new <- as.vector(predict.gam(nonrobustfit, data.frame(x=x.new), type="response"))

    # plot
    plot(x, y)
    lines(x.new, true.family$linkinv(test.fun(x.new)), col="blue")
    lines(x.new, robustfit.new, col="red")
    lines(x.new, nonrobustfit.new, col="green")
    legend(0.6, 23, c("true mu", "robust fit", "nonrobust fit"),
           col=c("blue","red","green"), lty=c(1,1,1))

    # plot component smooth function
    cplot.robustgam(robustfit, ranges=range(x))


pred.robustgam    Prediction method for robustgam

Description

A prediction function for the output of robustgam.

Usage

    pred.robustgam(fit, data, type="response")

Arguments

fit     fit object of robustgam
data    a data.frame object. Call the variable X if there is only one covariate. Otherwise, call the first variable X1, the second one X2, and so on.
type    type of output

Value

predict.comp      a matrix containing the individual additive components for each covariate.
predict.values    the type of output required by type. For example, if type="response", the predicted values for the data are returned.
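The column-naming rule for the data argument (X for a single covariate, X1, X2, ... for several) is easy to get wrong when new covariates arrive as a plain vector or matrix. The following is a minimal base-R sketch of a helper that applies that rule; make_newdata is a hypothetical name introduced here for illustration and is not part of robustgam.

```r
# Hypothetical helper (not part of robustgam): name the columns the way the
# 'data' argument of pred.robustgam is documented to expect them.
make_newdata <- function(x) {
  x <- as.matrix(x)                     # a vector becomes a one-column matrix
  d <- ncol(x)
  nd <- as.data.frame(x)
  names(nd) <- if (d == 1) "X" else paste0("X", seq_len(d))
  nd
}

one <- make_newdata(seq(0, 1, length.out = 5))       # single covariate
two <- make_newdata(cbind(runif(5), runif(5)))       # two covariates
names(one)  # "X"
names(two)  # "X1" "X2"
```

The resulting data frame would then be passed as the second argument of pred.robustgam.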

Author(s)

Raymond K. W. Wong <raymondkww.dev@gmail.com>

References

Raymond K. W. Wong, Fang Yao and Thomas C. M. Lee (2013) Robust Estimation for Generalized Additive Models. Journal of Computational and Graphical Statistics, to appear.

See Also

robustgam

Examples

    # load library
    library(robustgam)

    # test function
    test.fun <- function(x, ...) {
      return(2*sin(2*pi*(1-x)^2))
    }

    # some setting
    set.seed(1234)
    true.family <- poisson()
    out.prop <- 0.05
    n <- 100

    # generating dataset for poisson case
    x <- runif(n)
    x <- x[order(x)]
    true.eta <- test.fun(x)
    true.mu <- true.family$linkinv(test.fun(x))
    y <- rpois(n, true.mu) # for poisson case

    # create outlier for poisson case
    out.n <- trunc(n*out.prop)
    out.list <- sample(1:n, out.n, replace=FALSE)
    y[out.list] <- round(y[out.list]*runif(out.n,min=3,max=5)^(sample(c(-1,1),out.n,TRUE)))

    # robust GAM fit
    robustfit <- robustgam(x, y, family=true.family, p=3, c=1.6, sp=0.000143514,
                           show.msg=FALSE, smooth.basis="tp")

    # ordinary GAM fit
    nonrobustfit <- gam(y~s(x, bs="tp", m=3), family=true.family) # m = p for tp

    # prediction
    x.new <- seq(range(x)[1], range(x)[2], len=1000)
    robustfit.new <- pred.robustgam(robustfit, data.frame(X=x.new))$predict.values
    nonrobustfit.new <- as.vector(predict.gam(nonrobustfit, data.frame(x=x.new), type="response"))

    # plot
    plot(x, y)
    lines(x.new, true.family$linkinv(test.fun(x.new)), col="blue")
    lines(x.new, robustfit.new, col="red")
    lines(x.new, nonrobustfit.new, col="green")
    legend(0.6, 23, c("true mu", "robust fit", "nonrobust fit"),
           col=c("blue","red","green"), lty=c(1,1,1))


robustgam    Pseudo Data Algorithm for Robust Estimation of GAMs

Description

This function implements a fast and stable algorithm developed in Wong, Yao and Lee (2013) for robust estimation of generalized additive models. Currently, this implementation only covers binomial and poisson distributions. This function does not choose the smoothing parameter by itself; the user has to specify it manually. For implementations with automatic smoothing parameter selection, see robustgam.gic, robustgam.gic.optim, robustgam.cv. For prediction, see pred.robustgam.

Usage

    robustgam(X, y, family, p=3, K=30, c=1.345, sp=-1, show.msg=FALSE,
              count.lim=200, w.count.lim=50, smooth.basis="tp", wx=FALSE)

Arguments

X               a vector or a matrix (each covariate forms a column) of covariates
y               a vector of responses
family          a family object specifying the distribution and the link function. See glm and family.
p               order of the basis. It depends on the option of smooth.basis.
K               number of knots of the basis; dependent on the option of smooth.basis.
c               tuning parameter for the Huber function; a smaller value of c corresponds to a more robust fit. The recommended values are 1.2 for the binomial distribution and 1.6 for the poisson distribution.
sp              a vector of smoothing parameters. If only one value is specified, it will be used for all smoothing parameters.
show.msg        If show.msg=TRUE, progress is displayed.
count.lim       maximum number of iterations of the whole algorithm
w.count.lim     maximum number of updates on the weight. It corresponds to zeta in Wong, Yao and Lee (2013)
smooth.basis    the specification of the basis. Four choices are available: "tp" = thin plate regression spline, "cr" = cubic regression spline, "ps" = P-splines, "tr" = truncated power spline. For more details, see smooth.construct.
wx              If wx=TRUE, robust weights on the covariates are applied. For details, see Real Data Example in Wong, Yao and Lee (2013)
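To make the role of the tuning parameter c concrete, here is a short base-R sketch of the standard Huber psi function, which is the identity on [-c, c] and is clipped outside that interval. The names huber_psi and huber_weight and the residual values are illustrative assumptions; how robustgam applies the Huber function internally is described in Wong, Yao and Lee (2013) and is not reproduced here.

```r
# Standard Huber psi function: identity on [-c, c], clipped to +/- c outside.
huber_psi <- function(r, c = 1.345) pmax(-c, pmin(c, r))

r <- c(-4, -1, 0, 1.5, 4)     # illustrative standardized residuals (made up)
psi_pois  <- huber_psi(r, c = 1.6)   # constant recommended above for poisson
psi_binom <- huber_psi(r, c = 1.2)   # constant recommended above for binomial

# Implied weights psi(r)/r: equal to 1 inside [-c, c], shrinking outside.
huber_weight <- function(r, c = 1.345) ifelse(r == 0, 1, huber_psi(r, c) / r)
w <- huber_weight(r, c = 1.6)
```

With c = 1.6 the residual 1.5 passes through unchanged, while with c = 1.2 it is clipped to 1.2, which illustrates why a smaller c gives a more robust (more heavily down-weighted) fit.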

Value

fitted.values     fitted values
initial.fitted    the starting values of the algorithm
beta              estimated coefficients (corresponding to the basis)
B                 the basis: fitted linear estimator = B%*%beta
sd                for internal use
basis             the smooth construct object. For more details, see smooth.construct
converge          If converge=TRUE, the algorithm converged.
w                 for internal use
family            the family object
wx                indicates whether robust weights on the covariates are applied
beta.fit          for internal use

Author(s)

Raymond K. W. Wong <raymondkww.dev@gmail.com>

References

Raymond K. W. Wong, Fang Yao and Thomas C. M. Lee (2013) Robust Estimation for Generalized Additive Models. Journal of Computational and Graphical Statistics, to appear.

See Also

robustgam.gic, robustgam.gic.optim, robustgam.cv, pred.robustgam

Examples

    # load library
    library(robustgam)

    # test function
    test.fun <- function(x, ...) {
      return(2*sin(2*pi*(1-x)^2))
    }

    # some setting
    set.seed(1234)
    true.family <- poisson()
    out.prop <- 0.05
    n <- 100

    # generating dataset for poisson case
    x <- runif(n)
    x <- x[order(x)]
    true.eta <- test.fun(x)
    true.mu <- true.family$linkinv(test.fun(x))
    y <- rpois(n, true.mu) # for poisson case

    # create outlier for poisson case
    out.n <- trunc(n*out.prop)
    out.list <- sample(1:n, out.n, replace=FALSE)
    y[out.list] <- round(y[out.list]*runif(out.n,min=3,max=5)^(sample(c(-1,1),out.n,TRUE)))

    # robust GAM fit
    robustfit <- robustgam(x, y, family=true.family, p=3, c=1.6, sp=0.000143514,
                           show.msg=FALSE, smooth.basis="tp")
    ## the smoothing parameter is selected by the RBIC, the command is:
    # robustfit.gic <- robustgam.gic.optim(x, y, family=true.family, p=3, c=1.6, show.msg=FALSE,
    #   count.lim=400, smooth.basis="tp", lsp.initial=log(1e-2), lsp.min=-15, lsp.max=10,
    #   gic.constant=log(n), method="L-BFGS-B"); robustfit <- robustfit.gic$optim.fit

    # ordinary GAM fit
    nonrobustfit <- gam(y~s(x, bs="tp", m=3), family=true.family) # m = p for tp

    # prediction
    x.new <- seq(range(x)[1], range(x)[2], len=1000)
    robustfit.new <- pred.robustgam(robustfit, data.frame(X=x.new))$predict.values
    nonrobustfit.new <- as.vector(predict.gam(nonrobustfit, data.frame(x=x.new), type="response"))

    # plot
    plot(x, y)
    lines(x.new, true.family$linkinv(test.fun(x.new)), col="blue")
    lines(x.new, robustfit.new, col="red")
    lines(x.new, nonrobustfit.new, col="green")
    legend(0.6, 23, c("true mu", "robust fit", "nonrobust fit"),
           col=c("blue","red","green"), lty=c(1,1,1))


robustgam.cv    Smoothing parameter selection by robust cross validation

Description

This function combines robustgam with automatic smoothing parameter selection. The smoothing parameter is selected through the robust cross validation criterion described in Wong, Yao and Lee (2013). The criterion is designed to be robust to outliers. This function uses grid search to find the smoothing parameter that minimizes the criterion.

Usage

    robustgam.cv(X, y, family, p=3, K=30, c=1.345, show.msg=FALSE, count.lim=200,
                 w.count.lim=50, smooth.basis="tp", wx=FALSE, sp.min=1e-7, sp.max=1e-3,
                 len=50, show.msg.2=TRUE, ngroup=length(y), seed=12345)

Arguments

X               a vector or a matrix (each covariate forms a column) of covariates
y               a vector of responses
family          a family object specifying the distribution and the link function. See glm and family.
p               order of the basis. It depends on the option of smooth.basis.
K               number of knots of the basis; dependent on the option of smooth.basis.
c               tuning parameter for the Huber function; a smaller value of c corresponds to a more robust fit. The recommended values are 1.2 for the binomial distribution and 1.6 for the poisson distribution.
show.msg        If show.msg=TRUE, progress of robustgam is displayed.
count.lim       maximum number of iterations of the whole algorithm
w.count.lim     maximum number of updates on the weight. It corresponds to zeta in Wong, Yao and Lee (2013)
smooth.basis    the specification of the basis. Four choices are available: "tp" = thin plate regression spline, "cr" = cubic regression spline, "ps" = P-splines, "tr" = truncated power spline. For more details, see smooth.construct.
wx              If wx=TRUE, robust weights on the covariates are applied. For details, see Real Data Example in Wong, Yao and Lee (2013)
sp.min          a vector of minimum values of the search range for the smoothing parameters. If only one value is specified, it will be used for all smoothing parameters.
sp.max          a vector of maximum values of the search range for the smoothing parameters. If only one value is specified, it will be used for all smoothing parameters.
len             a vector of grid sizes. If only one value is specified, it will be used for all smoothing parameters.
show.msg.2      If show.msg.2=TRUE, progress of the grid search is displayed.
ngroup          number of groups used in the cross validation. If ngroup=length(y), full (leave-one-out) cross validation is implemented. If ngroup=2, two-fold cross validation is implemented.
seed            the seed for the random generator used in generating partitions.
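The ngroup and seed arguments control how observations are split into cross-validation groups. The base-R sketch below shows one common way to build such a partition; make_folds is a hypothetical name, and the partitioning scheme is an assumption for illustration only, since the package's internal scheme is not documented here and may differ.

```r
# Illustrative partition of n observations into ngroup folds.
# Assumption: balanced random folds; robustgam.cv may partition differently.
make_folds <- function(n, ngroup = n, seed = 12345) {
  set.seed(seed)                                 # reproducible partition
  sample(rep(seq_len(ngroup), length.out = n))   # random fold label per observation
}

folds_loo <- make_folds(10)                 # ngroup = n: leave-one-out
folds_5   <- make_folds(100, ngroup = 5)    # five folds of 20 observations each
table(folds_5)
```

Fixing the seed makes the partition, and therefore the selected smoothing parameter, reproducible across runs.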
Value

fitted.values      fitted values (of the optimum fit)
initial.fitted     the starting values of the algorithm (of the optimum fit)
beta               estimated coefficients (corresponding to the basis) (of the optimum fit)
optim.index        the index of the optimum fit
optim.index2       the index of the optimum fit in another representation: optim.index2=arrayInd(optim.index,len)
optim.criterion    the optimum value of the robust cross validation criterion
optim.sp           the optimum value of the smoothing parameter
criteria           the values of the criterion for all fits during the grid search
sp                 the grid of smoothing parameters
optim.fit          the robustgam fit object of the optimum fit. It is handy for applying the prediction method.

Author(s)

Raymond K. W. Wong <raymondkww.dev@gmail.com>

References

Raymond K. W. Wong, Fang Yao and Thomas C. M. Lee (2013) Robust Estimation for Generalized Additive Models. Journal of Computational and Graphical Statistics, to appear.

See Also

robustgam.gic, robustgam.gic.optim, robustgam.cv, pred.robustgam

Examples

    # load library
    library(robustgam)

    # test function
    test.fun <- function(x, ...) {
      return(2*sin(2*pi*(1-x)^2))
    }

    # some setting
    set.seed(1234)
    true.family <- poisson()
    out.prop <- 0.05
    n <- 100

    # generating dataset for poisson case
    x <- runif(n)
    x <- x[order(x)]
    true.eta <- test.fun(x)
    true.mu <- true.family$linkinv(test.fun(x))
    y <- rpois(n, true.mu) # for poisson case

    # create outlier for poisson case
    out.n <- trunc(n*out.prop)
    out.list <- sample(1:n, out.n, replace=FALSE)
    y[out.list] <- round(y[out.list]*runif(out.n,min=3,max=5)^(sample(c(-1,1),out.n,TRUE)))

    ## Not run:
    # robust GAM fit
    robustfit.gic <- robustgam.cv(x, y, family=true.family, p=3, c=1.6, show.msg=FALSE,
                                  count.lim=400, smooth.basis="tp", ngroup=5)
    robustfit <- robustfit.gic$optim.fit

    # ordinary GAM fit
    nonrobustfit <- gam(y~s(x, bs="tp", m=3), family=true.family) # m = p for tp

    # prediction
    x.new <- seq(range(x)[1], range(x)[2], len=1000)
    robustfit.new <- pred.robustgam(robustfit, data.frame(X=x.new))$predict.values
    nonrobustfit.new <- as.vector(predict.gam(nonrobustfit, data.frame(x=x.new), type="response"))

    # plot
    plot(x, y)
    lines(x.new, true.family$linkinv(test.fun(x.new)), col="blue")
    lines(x.new, robustfit.new, col="red")
    lines(x.new, nonrobustfit.new, col="green")
    legend(0.6, 23, c("true mu", "robust fit", "nonrobust fit"),
           col=c("blue","red","green"), lty=c(1,1,1))
    ## End(Not run)


robustgam.gic    Smoothing parameter selection by GIC (grid search)

Description

This function combines robustgam with automatic smoothing parameter selection. The smoothing parameter is selected through the generalized information criterion (GIC) described in Wong, Yao and Lee (2013). The GIC is designed to be robust to outliers. There are two criteria for the user to choose from: robust AIC and robust BIC. The robust BIC usually gives a smoother surface than the robust AIC. This function uses grid search to find the smoothing parameter that minimizes the criterion.

Usage

    robustgam.gic(X, y, family, p=3, K=30, c=1.345, show.msg=FALSE, count.lim=200,
                  w.count.lim=50, smooth.basis="tp", wx=FALSE, sp.min=1e-7, sp.max=1e-3,
                  len=50, show.msg.2=TRUE, gic.constant=log(length(y)))

Arguments

X               a vector or a matrix (each covariate forms a column) of covariates
y               a vector of responses
family          a family object specifying the distribution and the link function. See glm and family.
p               order of the basis. It depends on the option of smooth.basis.
K               number of knots of the basis; dependent on the option of smooth.basis.
c               tuning parameter for the Huber function; a smaller value of c corresponds to a more robust fit. The recommended values are 1.2 for the binomial distribution and 1.6 for the poisson distribution.
show.msg        If show.msg=TRUE, progress of robustgam is displayed.
count.lim       maximum number of iterations of the whole algorithm
w.count.lim     maximum number of updates on the weight. It corresponds to zeta in Wong, Yao and Lee (2013)
smooth.basis    the specification of the basis. Four choices are available: "tp" = thin plate regression spline, "cr" = cubic regression spline, "ps" = P-splines, "tr" = truncated power spline. For more details, see smooth.construct.
wx              If wx=TRUE, robust weights on the covariates are applied. For details, see Real Data Example in Wong, Yao and Lee (2013)
sp.min          a vector of minimum values of the search range for the smoothing parameters. If only one value is specified, it will be used for all smoothing parameters.
sp.max          a vector of maximum values of the search range for the smoothing parameters. If only one value is specified, it will be used for all smoothing parameters.
len             a vector of grid sizes. If only one value is specified, it will be used for all smoothing parameters.
show.msg.2      If show.msg.2=TRUE, progress of the grid search is displayed.
gic.constant    If gic.constant=log(length(y)), robust BIC is used. If gic.constant=2, robust AIC is used.
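The arguments sp.min, sp.max and len define the search grid, and gic.constant selects the penalty. The base-R sketch below builds such a grid and compares the two penalty constants. The log spacing of the grid is an assumption made for illustration (the spacing used inside the package is not stated above), and the variable names sp.grid, grid2, gic.aic and gic.bic are hypothetical.

```r
n <- 100
sp.min <- 1e-7; sp.max <- 1e-3; len <- 50

# Assumption: candidates are spaced evenly on the log scale.
sp.grid <- exp(seq(log(sp.min), log(sp.max), length.out = len))
range(sp.grid)

# Several smooths: one grid per smoothing parameter, then all combinations.
len2 <- c(4, 3)
grid2 <- expand.grid(
  sp1 = exp(seq(log(sp.min), log(sp.max), length.out = len2[1])),
  sp2 = exp(seq(log(sp.min), log(sp.max), length.out = len2[2]))
)
nrow(grid2)   # 12 candidate pairs

# Penalty constants: 2 gives robust AIC, log(n) gives robust BIC.
gic.aic <- 2
gic.bic <- log(n)
gic.bic > gic.aic   # TRUE whenever n > exp(2), i.e. n >= 8
```

Because log(n) exceeds 2 for any sample of at least 8 observations, the BIC-type constant penalizes model complexity more heavily, which is consistent with the statement that robust BIC usually gives a smoother surface than robust AIC.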
Value

fitted.values     fitted values (of the optimum fit)
initial.fitted    the starting values of the algorithm (of the optimum fit)
beta              estimated coefficients (corresponding to the basis) (of the optimum fit)
optim.index       the index of the optimum fit
optim.index2      the index of the optimum fit in another representation: optim.index2=arrayInd(optim.index,len)
optim.gic         the optimum value of robust AIC or robust BIC
optim.sp          the optimum value of the smoothing parameter
fit.list          a list object containing all fits during the grid search
gic               the values of GIC for all fits during the grid search
gic.comp1         for internal use
gic.comp2         for internal use
sp                the grid of smoothing parameters
gic.constant      the gic.constant specified in the input
optim.fit         the robustgam fit object of the optimum fit. It is handy for applying the prediction method.
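The relation between optim.index and optim.index2 is the usual conversion from a linear index to array subscripts, which base R provides as arrayInd. A small sketch with made-up values (the grid sizes and the index below are illustrative, not output of the package):

```r
len <- c(4, 3)          # illustrative grid sizes for two smoothing parameters
optim.index <- 7        # hypothetical linear index of the best fit

# Convert the linear index into one subscript per smoothing parameter.
optim.index2 <- arrayInd(optim.index, len)
optim.index2            # row 3, column 2 (column-major order)
```

Here the seventh grid point corresponds to the third candidate value of the first smoothing parameter and the second candidate value of the second one.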

Author(s)

Raymond K. W. Wong <raymondkww.dev@gmail.com>

References

Raymond K. W. Wong, Fang Yao and Thomas C. M. Lee (2013) Robust Estimation for Generalized Additive Models. Journal of Computational and Graphical Statistics, to appear.

See Also

robustgam.gic, robustgam.gic.optim, robustgam.cv, pred.robustgam

Examples

    # load library
    library(robustgam)

    # test function
    test.fun <- function(x, ...) {
      return(2*sin(2*pi*(1-x)^2))
    }

    # some setting
    set.seed(1234)
    true.family <- poisson()
    out.prop <- 0.05
    n <- 100

    # generating dataset for poisson case
    x <- runif(n)
    x <- x[order(x)]
    true.eta <- test.fun(x)
    true.mu <- true.family$linkinv(test.fun(x))
    y <- rpois(n, true.mu) # for poisson case

    # create outlier for poisson case
    out.n <- trunc(n*out.prop)
    out.list <- sample(1:n, out.n, replace=FALSE)
    y[out.list] <- round(y[out.list]*runif(out.n,min=3,max=5)^(sample(c(-1,1),out.n,TRUE)))

    ## Not run:
    # robust GAM fit
    robustfit.gic <- robustgam.gic(x, y, family=true.family, p=3, c=1.6, show.msg=FALSE,
                                   count.lim=400, smooth.basis="tp")
    robustfit <- robustfit.gic$optim.fit

    # ordinary GAM fit
    nonrobustfit <- gam(y~s(x, bs="tp", m=3), family=true.family) # m = p for tp

    # prediction
    x.new <- seq(range(x)[1], range(x)[2], len=1000)
    robustfit.new <- pred.robustgam(robustfit, data.frame(X=x.new))$predict.values
    nonrobustfit.new <- as.vector(predict.gam(nonrobustfit, data.frame(x=x.new), type="response"))

    # plot
    plot(x, y)
    lines(x.new, true.family$linkinv(test.fun(x.new)), col="blue")
    lines(x.new, robustfit.new, col="red")
    lines(x.new, nonrobustfit.new, col="green")
    legend(0.6, 23, c("true mu", "robust fit", "nonrobust fit"),
           col=c("blue","red","green"), lty=c(1,1,1))
    ## End(Not run)


robustgam.gic.optim    Smoothing parameter selection by GIC (by optim)

Description

This function is the same as robustgam.gic, except that the R internal optimization function optim is used to find the smoothing parameter that minimizes the RAIC or RBIC criterion.

Usage

    robustgam.gic.optim(X, y, family, p=3, K=30, c=1.345, show.msg=FALSE, count.lim=200,
                        w.count.lim=50, smooth.basis="tp", wx=FALSE, lsp.initial=log(1e-4),
                        lsp.min=-20, lsp.max=10, gic.constant=log(length(y)),
                        method="L-BFGS-B", optim.control=list(trace=1))

Arguments

X                a vector or a matrix (each covariate forms a column) of covariates
y                a vector of responses
family           a family object specifying the distribution and the link function. See glm and family.
p                order of the basis. It depends on the option of smooth.basis.
K                number of knots of the basis; dependent on the option of smooth.basis.
c                tuning parameter for the Huber function; a smaller value of c corresponds to a more robust fit. The recommended values are 1.2 for the binomial distribution and 1.6 for the poisson distribution.
show.msg         If show.msg=TRUE, progress of robustgam is displayed.
count.lim        maximum number of iterations of the whole algorithm
w.count.lim      maximum number of updates on the weight. It corresponds to zeta in Wong, Yao and Lee (2013)
smooth.basis     the specification of the basis. Four choices are available: "tp" = thin plate regression spline, "cr" = cubic regression spline, "ps" = P-splines, "tr" = truncated power spline. For more details, see smooth.construct.
wx               If wx=TRUE, robust weights on the covariates are applied. For details, see Real Data Example in Wong, Yao and Lee (2013)
lsp.initial      a vector of initial values of the log of the smoothing parameters, used to start the optimization algorithm.
lsp.min          a vector of minimum values of the search range for the log of the smoothing parameters. If only one value is specified, it will be used for all smoothing parameters.
lsp.max          a vector of maximum values of the search range for the log of the smoothing parameters. If only one value is specified, it will be used for all smoothing parameters.
gic.constant     If gic.constant=log(length(y)), robust BIC is used. If gic.constant=2, robust AIC is used.
method           method of optimization. For more details, see optim
optim.control    setting for optim. For more details, see optim

Value

fitted.values    fitted values (of the optimum fit)
beta             estimated coefficients (corresponding to the basis) (of the optimum fit)
beta.fit         for internal use
gic              the optimum value of robust AIC or robust BIC
sp               the optimum value of the smoothing parameter
gic.optim        the output of optim
w                for internal use
gic.constant     the gic.constant specified in the input
optim.fit        the robustgam fit object of the optimum fit. It is handy for applying the prediction method.

Author(s)

Raymond K. W. Wong <raymondkww.dev@gmail.com>

References

Raymond K. W. Wong, Fang Yao and Thomas C. M. Lee (2013) Robust Estimation for Generalized Additive Models. Journal of Computational and Graphical Statistics, to appear.

See Also

robustgam.gic, robustgam.gic.optim, robustgam.cv, pred.robustgam

Examples

    # load library
    library(robustgam)

    # test function
    test.fun <- function(x, ...) {
      return(2*sin(2*pi*(1-x)^2))
    }

    # some setting
    set.seed(1234)
    true.family <- poisson()
    out.prop <- 0.05
    n <- 100

    # generating dataset for poisson case
    x <- runif(n)
    x <- x[order(x)]
    true.eta <- test.fun(x)
    true.mu <- true.family$linkinv(test.fun(x))
    y <- rpois(n, true.mu) # for poisson case

    # create outlier for poisson case
    out.n <- trunc(n*out.prop)
    out.list <- sample(1:n, out.n, replace=FALSE)
    y[out.list] <- round(y[out.list]*runif(out.n,min=3,max=5)^(sample(c(-1,1),out.n,TRUE)))

    ## Not run:
    # robust GAM fit
    robustfit.gic <- robustgam.gic.optim(x, y, family=true.family, p=3, c=1.6, show.msg=FALSE,
                                         count.lim=400, smooth.basis="tp", lsp.initial=log(1e-2),
                                         lsp.min=-15, lsp.max=10, gic.constant=log(n),
                                         method="L-BFGS-B")
    robustfit <- robustfit.gic$optim.fit

    # ordinary GAM fit
    nonrobustfit <- gam(y~s(x, bs="tp", m=3), family=true.family) # m = p for tp

    # prediction
    x.new <- seq(range(x)[1], range(x)[2], len=1000)
    robustfit.new <- pred.robustgam(robustfit, data.frame(X=x.new))$predict.values
    nonrobustfit.new <- as.vector(predict.gam(nonrobustfit, data.frame(x=x.new), type="response"))

    # plot
    plot(x, y)
    lines(x.new, true.family$linkinv(test.fun(x.new)), col="blue")
    lines(x.new, robustfit.new, col="red")
    lines(x.new, nonrobustfit.new, col="green")
    legend(0.6, 23, c("true mu", "robust fit", "nonrobust fit"),
           col=c("blue","red","green"), lty=c(1,1,1))
    ## End(Not run)
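The lsp.initial, lsp.min and lsp.max arguments describe a box-constrained search over the log smoothing parameter. The base-R sketch below shows that pattern with optim and method "L-BFGS-B" on a toy objective. The function toy_criterion is a hypothetical stand-in with a known minimum at log(1e-4); it is not the RAIC or RBIC criterion computed by the package.

```r
# Toy criterion in lsp = log(sp); a stand-in for RAIC/RBIC, NOT the package's criterion.
toy_criterion <- function(lsp) (lsp - log(1e-4))^2 + 1

# Box-constrained search on the log scale, mirroring lsp.initial, lsp.min, lsp.max.
res <- optim(par = log(1e-2), fn = toy_criterion, method = "L-BFGS-B",
             lower = -15, upper = 10, control = list(trace = 0))

res$convergence   # 0 indicates successful convergence
exp(res$par)      # back-transform to the smoothing-parameter scale
```

Searching on the log scale keeps the smoothing parameter positive after back-transformation and lets one set of bounds cover several orders of magnitude.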

Index

Topics (under robustgam-package): Bounded score function, Generalized information criterion, Generalized linear model, Robust estimating equation, Robust quasi-likelihood, Smoothing parameter selection

Entries: cplot.robustgam, family, glm, optim, pred.robustgam, robustgam, robustgam-package, robustgam.cv, robustgam.gic, robustgam.gic.optim, smooth.construct