The simpleboot Package

Similar documents
Package uclaboot. June 18, 2003

R Programming Basics - Useful Builtin Functions for Statistics

Package complmrob. October 18, 2015

Package GLDreg. February 28, 2017

Applied Regression Modeling: A Business Approach

Fathom Dynamic Data TM Version 2 Specifications

Package mlegp. April 15, 2018

Package lgarch. September 15, 2015

Package hgam. February 20, 2015

The supclust Package

Package robets. March 6, Type Package

Package rknn. June 9, 2015

Package simtool. August 29, 2016

Package logspline. February 3, 2016

Applied Regression Modeling: A Business Approach

Package sparsereg. R topics documented: March 10, Type Package

R practice. Eric Gilleland. 20th May 2015

Package smicd. September 10, 2018

Package fbroc. August 29, 2016

Package abe. October 30, 2017

Package logitnorm. R topics documented: February 20, 2015

Package ICEbox. July 13, 2017

The relaimpo Package

Regression on the trees data with R

Package edrgraphicaltools

Common R commands used in Data Analysis and Statistical Inference

Package sfa. R topics documented: February 20, 2015

Package macorrplot. R topics documented: October 2, Title Visualize artificial correlation in microarray data. Version 1.50.

Package intccr. September 12, 2017

MS&E 226: Small Data

Package CausalImpact

Package ridge. R topics documented: February 15, Title Ridge Regression with automatic selection of the penalty parameter. Version 2.

STATS PAD USER MANUAL

Package mvprobit. November 2, 2015

The mblm Package. August 12, License GPL version 2 or newer. confint.mblm. mblm... 3 summary.mblm...

Package flam. April 6, 2018

Introduction to the Bootstrap and Robust Statistics

Exercise 2.23 Villanova MAT 8406 September 7, 2015

Package twilight. August 3, 2013

Lab #13 - Resampling Methods Econ 224 October 23rd, 2018

The nor1mix Package. August 3, 2006

IPS9 in R: Bootstrap Methods and Permutation Tests (Chapter 16)

Linear Methods for Regression and Shrinkage Methods

Advanced Econometric Methods EMET3011/8014

Getting started with simulating data in R: some helpful functions and how to use them Ariel Muldoon August 28, 2018

STA258H5. Al Nosedal and Alison Weir. Winter Al Nosedal and Alison Weir STA258H5 Winter / 27

Package Rramas. November 25, 2017

Package DTRreg. August 30, 2017

Regression Analysis and Linear Regression Models

Quantitative - One Population

Package ccapp. March 7, 2016

Package glmmml. R topics documented: March 25, Encoding UTF-8 Version Date Title Generalized Linear Models with Clustering

The nor1mix Package. June 12, 2007

1 Lab 1. Graphics and Checking Residuals

Regression Lab 1. The data set cholesterol.txt available on your thumb drive contains the following variables:

Package leaps. R topics documented: May 5, Title regression subset selection. Version 2.9

Package mlvar. August 26, 2018

The RcmdrPlugin.HH Package

Package SiZer. February 19, 2015

Package sure. September 19, 2017

The BHH2 Package. July 27, 2006

Outline. Topic 16 - Other Remedies. Ridge Regression. Ridge Regression. Ridge Regression. Robust Regression. Regression Trees. Piecewise Linear Model

Package listdtr. November 29, 2016

Package bayesdp. July 10, 2018

36-402/608 HW #1 Solutions 1/21/2010

Package citools. October 20, 2018

Minitab 17 commands Prepared by Jeffrey S. Simonoff

Package enpls. May 14, 2018

Package survivalmpl. December 11, 2017

Package reghelper. April 8, 2017

Bivariate (Simple) Regression Analysis

Package swissmrp. May 30, 2018

Package bootlr. July 13, 2015

Package OptimaRegion

Package caretensemble

Package OLScurve. August 29, 2016

Introduction to Data Science

The quantchem Package

Package glinternet. June 15, 2018

Package fitplc. March 2, 2017

Package pvclust. October 23, 2015

Package svmpath. R topics documented: August 30, Title The SVM Path Algorithm Date Version Author Trevor Hastie

Applied Regression Modeling: A Business Approach

Package lcc. November 16, 2018

More Summer Program t-shirts

Package fso. February 19, 2015

Package areaplot. October 18, 2017

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski

Package ArCo. November 5, 2017

Package MixSim. April 29, 2017

MS in Applied Statistics: Study Guide for the Data Science concentration Comprehensive Examination. 1. MAT 456 Applied Regression Analysis

Package sspse. August 26, 2018

Chapter 3. Bootstrap. 3.1 Introduction. 3.2 The general idea

R is a programming language of a higher-level Constantly increasing amount of packages (new research) Free of charge Website:

7 Control Structures, Logical Statements

Package DTRlearn. April 6, 2018

Learner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display

Package pcapa. September 16, 2016

Package oaxaca. January 3, Index 11

Package biglars. February 19, 2015

Transcription:

The simpleboot Package April 1, 2005 Version 1.1-1 Date 2005-03-31 LazyLoad yes Depends R (>= 2.0.0), boot Title Simple Bootstrap Routines Author <rpeng@jhsph.edu> Maintainer <rpeng@jhsph.edu> Simple bootstrap routines License GNU GPL (version 2 or later) URL http://sandybox.typepad.com/software/ R topics documented: hist.simpleboot....................................... 2 lm.boot........................................... 3 lm.simpleboot.methods................................... 4 loess.boot.......................................... 5 loess.simpleboot.methods.................................. 7 one.boot........................................... 7 pairs.boot.......................................... 9 perc............................................. 10 plot.lm.simpleboot..................................... 11 plot.loess.simpleboot.................................... 11 samples........................................... 12 simpleboot-internal..................................... 13 two.boot........................................... 13 Index 16 1

2 hist.simpleboot hist.simpleboot Histograms for bootstrap sampling distributions. Construct a histogram of the bootstrap distribution of univariate statistic. ## S3 method for class 'simpleboot': hist(x, do.rug = FALSE, xlab = "Bootstrap samples", main = "",...) x do.rug xlab main An object of class "simpleboot" returned from either one.boot, two.boot, or pairs.boot. Should a rug of the bootstrap distribution be plotted under the histogram? The label for the x-axis. The title for the histogram.... Other arguments passed to hist. hist constructs a histogram for the bootstrap distribution of a univariate statistic. It cannot be used with linear model or loess bootstraps. In the histogram a red dotted line is plotted denoting the observed value of the statistic. Nothing is returned. x <- rnorm(100) ## Bootstrap the 75th percentile b <- one.boot(x, quantile, R = 1000, probs = 0.75) hist(b)

lm.boot 3 lm.boot Linear model bootstrap. Bootstrapping of linear model fits (using lm). Bootstrapping can be done by either resampling rows of the original data frame or resampling residuals from the original model fit. lm.boot(lm.object, R, rows = TRUE, new.xpts = NULL, ngrid = 100, weights = NULL) lm.object R rows new.xpts ngrid weights A linear model fit, produced by lm. The number of bootstrap replicates to use. Should we resample rows? Setting rows to FALSE indicates resampling of residuals. s at which you wish to make new predictions. If specified, fitted values from each bootstrap sample will be stored. If new.xpts is NULL and the regression is 2 dimensional, then predictions are made on an evenly spaced grid (containing ngrid points) spanning the range of the predictor values. Reseampling weights; a vector of length equal to the number of observations. Currently, "lm.simpleboot" objects have a simple print method (which shows the original fit), a summary method and a plot method. An object of class "lm.simpleboot" (which is a list) containing the elements: method boot.list orig.lm new.xpts weights Which method of bootstrapping was used (rows or residuals). A list containing values from each of the bootstrap samples. Currently, bootstrapped values are model coefficients, residual sum of squares, R-square, and fitted values for predictions. The original model fit. The locations where predictions were made. The resampling weights. If none were used, this component is NULL See Also The plot.lm.simpleboot method.

4 lm.simpleboot.methods data(airquality) attach(airquality) set.seed(30) lmodel <- lm(ozone ~ Wind) lboot <- lm.boot(lmodel, R = 1000) summary(lboot) ## With weighting w <- runif(nrow(model.frame(lmodel))) lbootw <- lm.boot(lmodel, R = 1000, weights = w) summary(lbootw) ## Resample residuals lboot2 <- lm.boot(lmodel, R = 1000, rows = FALSE) summary(lboot2) lm.simpleboot.methods Methods for linear model bootstrap. Methods for "lm.simpleboot" class objects. ## S3 method for class 'lm.simpleboot': summary(object,...) ## S3 method for class 'summary.lm.simpleboot': print(x,...) ## S3 method for class 'lm.simpleboot': fitted(object,...) object An object of class "lm.simpleboot", as returned by lm.boot. x An object of class "summary.lm.simpleboot", a result of a call to summary.lm.simplebo... Other arguments passed to and from other methods. print is essentially the same as the usual printing of a linear model fit, except the bootstrap standard errors are printed for each model coefficient. fitted returns the fitted values from each bootstrap sample for the predictor values specified by the new.xpts in lm.boot (or from the grid if new.xpts was not specified). This is a p x R matrix where p is the number of points where prediction was desired and R is the number of bootstrap samples specified. Using fitted is the equivalent of using samples(object, name = "fitted").

loess.boot 5 summary returns a list containing the original estimated coefficients and their bootstrap standard errors. See Also lm.boot. data(airquality) attach(airquality) lmodel <- lm(ozone ~ Wind + Solar.R) lboot <- lm.boot(lmodel, R = 300) summary(lboot) loess.boot 2-D Loess bootstrap. Bootstrapping of loess fits produced by the loess function in the modreg package. Bootstrapping can be done by resampling rows from the original data frame or resampling residuals from the original model fit. loess.boot(lo.object, R, rows = TRUE, new.xpts = NULL, ngrid = 100, weights = NULL) lo.object R rows new.xpts ngrid weights A loess fit, produced by loess. The number of bootstrap replicates. Should we resample rows? Setting rows to FALSE indicates resampling of residuals. Locations where new predictions are to be made. If new.xpts is NULL, then an evenly spaced grid spanning the range of X (containing ngrid points) is used. In either case Number of grid points to use if new.xpts is NULL. Resampling weights; a vector with length equal to the number of observations.

6 loess.boot The user can specify locations for new predictions through new.xpts or an evenly spaced grid will be used. In either case, fitted values at each new location will be stored from each bootstrap sample. These fitted values can be retrieved using either the fitted method or the samples function. Note that the loess function has many parameters for the user to set that can be difficult to reproduce in the bootstrap setting. Right now, the user can only specify the span argument to loess in the original fit. An object of class "loess.simpleboot" (which is a list) containing the elements: method boot.list orig.loess new.xpts Which method of bootstrapping was used (rows or residuals). A list containing values from each of the bootstrap samples. Currently, only residual sum of squares and fitted values are stored. The original loess fit. The locations where predictions were made (specified in the original call to loess.boot). set.seed(1234) x <- runif(100) ## Simple sine function simulation y <- sin(2*pi*x) +.2 * rnorm(100) plot(x, y) ## Sine function with noise lo <- loess(y ~ x, span =.4) ## Bootstrap with resampling of rows lo.b <- loess.boot(lo, R = 500) ## Plot original fit with +/- 2 std. errors plot(lo.b) ## Plot all loess bootstrap fits plot(lo.b, all.lines = TRUE) ## Bootstrap with resampling residuals lo.b2 <- loess.boot(lo, R = 500, rows = FALSE) plot(lo.b2)

loess.simpleboot.methods 7 loess.simpleboot.methods Methods for loess bootstrap. Methods for "loess.simpleboot" class objects. ## S3 method for class 'loess.simpleboot': fitted(object,...) object An object of class "loess.simpleboot" as returned by the function loess.boot.... Other arguments passed to and from other methods. fitted returns a n x R matrix of fitted values where n is the number of new locations at which predictions were made and R is the number of bootstrap replications used in the original loess bootstrap. This is the equivalent of calling samples(object, "fitted"). Nothing is returned. one.boot One sample bootstrap of a univariate statistic. one.boot is used for bootstrapping a univariate statistic for one sample problems. include the mean, median, etc. one.boot(data, FUN, R, student = FALSE, M, weights = NULL,...)

8 one.boot data FUN R student M weights The data. This should be a vector of numbers. The statistic to be bootstrapped. This can be either a quoted string containing the name of a function or simply the function name. The number of bootstrap replicates to use. Should we do a studentized bootstrap? This requires a double bootstrap so it might take longer. If student is set to TRUE, then M is the number of internal bootstrap replications to do. Resampling weights; a vector of length equal to the number of observations.... Other (named) arguments that should be passed to FUN. An object of class "simpleboot", which is almost identical to the regular "boot" object. For example, the boot.ci function can be used on this object. set.seed(20) x <- rgamma(100, 1) b.mean <- one.boot(x, mean, 1000) print(b.mean) boot.ci(b.mean) ## No studentized interval here hist(b.mean) ## This next line could take some time on a slow computer b.median <- one.boot(x, median, R = 500, student = TRUE, M = 50) boot.ci(b.median) hist(b.median) ## Bootstrap with weights set.seed(10) w <- runif(100) bw <- one.boot(x, median, 1000, weights = w) print(bw) ## Studentized bw.stud <- one.boot(x, median, R = 500, student = TRUE, M = 50, weights = w) boot.ci(bw.stud, type = "stud")

pairs.boot 9 pairs.boot Two sample bootstrap. pairs.boot is used to bootstrap a statistic which operates on two samples and returns a single value. An example of such a statistic is the correlation coefficient (i.e. cor). Resampling is done pairwise, so x and y must have the same length (and be ordered correctly). One can alternatively pass a two-column matrix to x. pairs.boot(x, y = NULL, FUN, R, student = FALSE, M, weights = NULL,...) x y Either a vector of numbers representing the first sample or a two column matrix containing both samples. If NULL it is assumed that x is a two-column matrix. Otherwise, y is the second sample. FUN The statistic to bootstrap. If x and y are separate vectors then FUN should operate on separate vectors. Similarly, if x is a matrix, then FUN should operate on two-column matrices. FUN can be either a quoted string or a function name. R student M weights The number of bootstrap replicates. Should we do a studentized bootstrap? This requires a double bootstrap so it might take longer. If student is set to TRUE, then M is the number of internal bootstrap replications to do. Resampling weights.... Other (named) arguments that should be passed to FUN. An object of class "simpleboot", which is almost identical to the regular "boot" object. For example, the boot.ci function can be used on this object. set.seed(1) x <- rnorm(100) y <- 2 * x + rnorm(100) boot.cor <- pairs.boot(x, y, FUN = cor, R = 1000) boot.ci(boot.cor) ## With weighting set.seed(20)

10 perc w <- (100:1)^2 bw <- pairs.boot(x, y, FUN = cor, R = 5000, weights = w) boot.ci(bw, type = c("norm", "basic", "perc")) perc Extract percentiles from a bootstrap sampling distribution. perc can be used to extract percentiles from the sampling distribution of a statistic. perc(boot.out, p = c(0.025, 0.975)) perc.lm(lm.boot.obj, p) boot.out Output from either one.boot, two.boot, or pairs.boot. p numeric vector with values in [0, 1]. lm.boot.obj An object of class "lm.simpleboot", returned from lm.boot. perc automatically calls perc.lm if boot.out is of the class "lm.simpleboot" so there is no need to use perc.lm separately. For bootstraps which are not linear model bootstraps, perc returns a vector of percentiles of length length(p). Linear interpolation of percentiles is done if necessary. perc.lm returns a matrix of percentiles of each of the model coefficients. For example, if there are k model coefficients, the perc.lm returns a length(p) by k matrix. x <- rnorm(100) b <- one.boot(x, median, R = 1000) perc(b, c(.90,.95,.99))

plot.lm.simpleboot 11 plot.lm.simpleboot Plot method for linear model bootstraps. Plot regression lines with bootstrap standard errors. This method only works for 2-D regression fits. ## S3 method for class 'lm.simpleboot': plot(x, add = FALSE,...) x add An object of class "lm.simpleboot" returned by lm.boot. Switch indicating whether the regression line should be added to the current plot.... Additional arguments passed down to plot. Ignored if add = TRUE. This function plots the data and the original regression line fit along with +/- 2 bootstrap standard errors at locations specified by the new.xpts argument to lm.boot (or on an evenly spaced grid). Nothing is returned. ## None right now plot.loess.simpleboot Plot method for loess bootstraps. Plot loess lines with bootstrap standard errors. ## S3 method for class 'loess.simpleboot': plot(x, add = FALSE, all.lines = FALSE,...)

12 samples x add all.lines An object of class "loess.simpleboot" as returned by the function loess.boot. Should the loess line be plotted over the current plot? Should we plot each of the individual loess lines from the bootstrap samples?... Other arguments passed to plot. plot constructs (and plots) the original loess fit and +/- 2 bootstrap standard errors at locations specified in the new.xpts in loess.boot (or on an evenly spaced grid). Nothing is returned. ## See the help page for `loess.boot' for an example. samples Extract sampling distributions from bootstrapped linear/loess models. Extract sampling distributions of various entities from either a linear model or a loess bootstrap. Entities for linear models are currently, model coefficients, residual sum of squares, R-square, and fitted values (given a set of X values in the original bootstrap). For loess, one can extract residual sum of squares and fitted values. samples(object, name = c("fitted", "coef", "rsquare", "rss")) object name The output from either lm.boot or loess.boot. The name of the entity to extract. The default is fitted values. Either a vector or matrix depending on the entity extracted. For example, when extracting the sampling distributions for linear model coefficents, the return value is p x R matrix where p is the number of coefficients and R is the number of bootstrap replicates.

simpleboot-internal 13 data(airquality) attach(airquality) lmodel <- lm(ozone ~ Solar.R + Wind) lboot <- lm.boot(lmodel, R = 500) ## Get sampling distributions for coefficients s <- samples(lboot, "coef") ## Histogram for the intercept hist(s[1,]) simpleboot-internal Internal simpleboot functions Internal simpleboot functions. lm.boot.resample(lm.object, R, rows, new.xpts, weights) lo.boot.resample(lo.object, R, rows, new.xpts, weights) model.frame.lm.simpleboot(formula,...) This functions are not to be called by the user. two.boot Two sample bootstrap of differences between univariate statistics. two.boot is used to bootstrap the difference between various univariate statistics. An example is the difference of means. Bootstrapping is done by independently resampling from sample1 and sample2. two.boot(sample1, sample2, FUN, R, student = FALSE, M, weights = NULL,...)

14 two.boot sample1 sample2 FUN R student M weights First sample; a vector of numbers. Second sample; a vector of numbers. The statistic which is applied to each sample. This can be a quoted string or a function name. Number of bootstrap replicates. Should we do a studentized bootstrap? This requires a double bootstrap so it might take longer. If student is set to TRUE, then M is the number of internal bootstrap replications to do. Resampling weights; a list with two components. The first component of the list is a vector of weights for sample1 and the second component of the list is a vector of weights for sample2.... Other (named) arguments that should be passed to FUN. The differences are always taken as FUN(sample1) - FUN(sample2). If you want the difference to be reversed you need to reverse the order of the arguments sample1 and sample2. An object of class "simpleboot", which is almost identical to the regular "boot" object. For example, the boot.ci function can be used on this object. set.seed(50) x <- rnorm(100, 1) ## Mean 1 normals y <- rnorm(100, 0) ## Mean 0 normals b <- two.boot(x, y, median, R = 1000) boot.ci(b) ## No studentized confidence intervals hist(b) ## Histogram of the bootstrap replicates b <- two.boot(x, y, quantile, R = 1000, probs =.75) ## With weighting ## Here all members of the first group has equal weighting ## but members of the the second have unequal weighting w <- list(rep(1, 100), 100:1) bw <- two.boot(x, y, median, R = 1000, weights = w) boot.ci(b) ## Studentized bstud <- two.boot(x, y, median, R = 500, student = TRUE, M = 50) boot.ci(bstud, type = "stud")

two.boot 15 ## Studentized with weights bwstud <- two.boot(x, y, median, R = 500, student = TRUE, M = 50, weights = w) boot.ci(bstud, type = "stud")

Index Topic hplot hist.simpleboot, 1 Topic internal simpleboot-internal, 12 Topic loess loess.boot, 4 loess.simpleboot.methods, 6 plot.loess.simpleboot, 10 Topic regression lm.boot, 2 lm.simpleboot.methods, 3 plot.lm.simpleboot, 10 Topic univar one.boot, 7 pairs.boot, 8 perc, 9 two.boot, 12 Topic utilities samples, 11 plot.loess.simpleboot, 10 print.lm.simpleboot (lm.boot), 2 print.loess.simpleboot (loess.boot), 4 print.summary.lm.simpleboot (lm.simpleboot.methods), 3 samples, 11 simpleboot-internal, 12 summary.lm.simpleboot (lm.simpleboot.methods), 3 two.boot, 12 fitted.lm.simpleboot (lm.simpleboot.methods), 3 fitted.loess.simpleboot (loess.simpleboot.methods), 6 hist.simpleboot, 1 lm.boot, 2 lm.boot.resample (simpleboot-internal), 12 lm.simpleboot.methods, 3 lo.boot.resample (simpleboot-internal), 12 loess.boot, 4 loess.simpleboot.methods, 6 model.frame.lm.simpleboot (simpleboot-internal), 12 one.boot, 7 pairs.boot, 8 perc, 9 plot.lm.simpleboot, 10 16