Package sparsereg. R topics documented: March 10, Type Package

Similar documents
Package SparseFactorAnalysis

Package bayesdp. July 10, 2018

Package CausalImpact

Package diagis. January 25, 2018

Package rpst. June 6, 2017

Package texteffect. November 27, 2017

Package epitab. July 4, 2018

Package mfbvar. December 28, Type Package Title Mixed-Frequency Bayesian VAR Models Version Date

Package smicd. September 10, 2018

Package cosinor. February 19, 2015

CHAPTER 1 INTRODUCTION

Package Bergm. R topics documented: September 25, Type Package

Package BayesCR. September 11, 2017

Package sure. September 19, 2017

Package PTE. October 10, 2017

Package MTLR. March 9, 2019

Package quickreg. R topics documented:

Package mfa. R topics documented: July 11, 2018

Package Kernelheaping

Package intccr. September 12, 2017

Package bsam. July 1, 2017

Package mcemglm. November 29, 2015

Package SAMUR. March 27, 2015

Package robustreg. R topics documented: April 27, Version Date Title Robust Regression Functions

Package survivalmpl. December 11, 2017

Package milr. June 8, 2017

Package ICEbox. July 13, 2017

Package DSBayes. February 19, 2015

Package beast. March 16, 2018

Package flam. April 6, 2018

Package manet. September 19, 2017

Package acebayes. R topics documented: November 21, Type Package

Package MIICD. May 27, 2017

Package tuts. June 12, 2018

Package sspse. August 26, 2018

Package binomlogit. February 19, 2015

Package endogenous. October 29, 2016

Package glmnetutils. August 1, 2017

Package CausalGAM. October 19, 2017

Package gsscopu. R topics documented: July 2, Version Date

Package MatchIt. April 18, 2017

Package MfUSampler. June 13, 2017

Package cwm. R topics documented: February 19, 2015

Package OsteoBioR. November 15, 2018

Package MCMC4Extremes

Package DTRreg. August 30, 2017

Package ECctmc. May 1, 2018

Package polywog. April 20, 2018

Package HDInterval. June 9, 2018

Package logitnorm. R topics documented: February 20, 2015

Package naivebayes. R topics documented: January 3, Type Package. Title High Performance Implementation of the Naive Bayes Algorithm

Package mlvar. August 26, 2018

Package treedater. R topics documented: May 4, Type Package

Package HMMCont. February 19, 2015

Package TVsMiss. April 5, 2018

Package StVAR. February 11, 2017

Package SmoothHazard

Package mvprobit. November 2, 2015

Package RCA. R topics documented: February 29, 2016

Package GLDreg. February 28, 2017

Using the package glmbfp: a binary regression example.

Package manet. August 23, 2018

Package BigVAR. R topics documented: April 3, Type Package

Package abe. October 30, 2017

Package clusternomics

Package BKPC. March 13, 2018

Package PedCNV. February 19, 2015

Package optimus. March 24, 2017

Package QCAtools. January 3, 2017

Package rereg. May 30, 2018

Package BDMMAcorrect

Package robustgam. February 20, 2015

Package SSLASSO. August 28, 2018

Package svalues. July 15, 2018

Package CGP. June 12, 2018

Package rdd. March 14, 2016

Package mwa. R topics documented: February 24, Type Package

Package caretensemble

Package listdtr. November 29, 2016

Also, for all analyses, two other files are produced upon program completion.

Linear Modeling with Bayesian Statistics

Package strat. November 23, 2016

Package dalmatian. January 29, 2018

Package fcr. March 13, 2018

Package ICsurv. February 19, 2015

Package Kernelheaping

Package dgo. July 17, 2018

CHAPTER 11 EXAMPLES: MISSING DATA MODELING AND BAYESIAN ANALYSIS

Package reghelper. April 8, 2017

Package ridge. R topics documented: February 15, Title Ridge Regression with automatic selection of the penalty parameter. Version 2.

Package Rprofet. R topics documented: April 14, Type Package Title WOE Transformation and Scorecard Builder Version 2.2.0

Package misclassglm. September 3, 2016

Bayesian Modelling with JAGS and R

Package FisherEM. February 19, 2015

Package CatEncoders. March 8, 2017

Package robustgam. January 7, 2013

Package bmeta. R topics documented: January 8, Type Package

Package Numero. November 24, 2018

Package uclaboot. June 18, 2003

Package gibbs.met. February 19, 2015

Transcription:

Type Package Package sparsereg March 10, 2016 Title Sparse Bayesian Models for Regression, Subgroup Analysis, and Panel Data Version 1.2 Date 2016-03-01 Author Marc Ratkovic and Dustin Tingley Maintainer Marc Ratkovic <ratkovic@princeton.edu> Sparse modeling provides a mean selecting a small number of non-zero effects from a large possible number of candidate effects. This package includes a suite of methods for sparse modeling: estimation via EM or MCMC, approximate confidence intervals with nominal coverage, and diagnostic and summary plots. The method can implement sparse linear regression and sparse probit regression. Beyond regression analyses, applications include subgroup analysis, particularly for conjoint experiments, and panel data. Future versions will include extensions to models with truncated outcomes, propensity score, and instrumental variable analysis. License GPL (>= 2) Depends R (>= 3.0.2), MASS, ggplot2 Imports Rcpp (>= 0.11.0), msm, VGAM, MCMCpack, coda, glmnet, gridextra, grid, GIGrvg LinkingTo Rcpp, RcppArmadillo NeedsCompilation yes Repository CRAN Date/Publication 2016-03-10 23:32:18 R topics documented: sparsereg-package...................................... 2 difference.......................................... 2 plot.sparsereg........................................ 4 print.sparsereg........................................ 5 sparsereg.......................................... 6 summary.sparsereg..................................... 9 violinplot.......................................... 10 1

2 difference Index 12 sparsereg-package Sparse regression for experimental and observational data. Sparse modeling provides a mean selecting a small number of non-zero effects from a large possible number of candidate effects. This package includes a suite of methods for sparse modeling: estimation via EM or MCMC, approximate confidence intervals with nominal coverage, and diagnostic and summary plots. Beyond regression analyses, applications include subgroup analysis, particularly for conjoint experiments, and panel data. Future versions will include extensions to limited dependent variables, models with truncated outcomes, and propensity score and instrumental variable analysis. Package: sparsereg Type: Package Version: 1.0 Date: 2015-03-20 License: GPL (>= 2) Author(s) Marc Ratkovic and Dustin Tingley Maintainer: Marc Ratkovic (ratkovic@princeton.edu) FindIt, glmnet difference Plotting difference in posterior estimates from a sparse regression. Function for plotting differences in posterior density estimates for separate parameters from sparse regression analysis.

difference 3 Usage difference(x,type="mode",var1=null,var2=null,plot.it=true, main="difference",xlabel="effect", ylabel="density") Arguments x type var1, var2 Object of class sparsereg. Whether to difference the posterior mode or posterior mean. Options are "mode" and "mean". Variables names for the effects to difference. plot.it Whether to plot the density of the difference. main, xlabel, ylabel Main title, x-axis label, and y-axis label. Generates a density of the estimated posterior of the difference between the effects of two variables. sparsereg, plot.sparsereg, summary.sparsereg, violinplot, print.sparsereg Examples ## Not run: set.seed(1) n<-500 k<-100 Sigma<-diag(k) Sigma[Sigma==0]<-.5 X<-mvrnorm(n,mu=rep(0,k),Sigma=Sigma) y.true<-3+x[,2]*2+x[,3]*(-3) y<-y.true+rnorm(n) ##Fit a linear model with five covariates. s1<-sparsereg(y,x[,1:5]) difference(s1,var1=1,var2=2) ## End(Not run)

4 plot.sparsereg plot.sparsereg Plotting output from a sparse regression. Usage Function for plotting coefficients from sparsereg analysis. ## S3 method for class sparsereg plot(x,...) Arguments x Object from output of class sparsereg.... Additional items to pass to plot. Options below. The function returns up to three plots in one figure. Each plot corresponds with main effects, interaction effects, and two-way interactions. Additional options to pass below. main1, main2, main3 Main titles for plots of main effects, interactive effects, and two-way interactions. xlabel Label for x-axis. plot.one Takes on the value of FALSE or 1, 2, or 3, denoting whether to return a single plot for main effects (1), interactive effects (2), or two-way interactions (3). sparsereg, plot.sparsereg, summary.sparsereg, violinplot, difference Examples ## Not run: set.seed(1) n<-500 k<-100 Sigma<-diag(k) Sigma[Sigma==0]<-.5 X<-mvrnorm(n,mu=rep(0,k),Sigma=Sigma) y.true<-3+x[,2]*2+x[,3]*(-3) y<-y.true+rnorm(n)

print.sparsereg 5 ##Fit a linear model with five covariates. s1<-sparsereg(y,x[,1:5]) plot(s1) ## End(Not run) print.sparsereg A summary of the estimated posterior mode of each parameter. Usage The funciton prints a summary of the estimated posterior mode of each parameter. ## S3 method for class sparsereg print(x,... ) Arguments x Object of class sparsereg.... Additional arguments to pass to print. None supported in this version. Uses the summary function from the package coda to return a summary of the posterior mode of a sparsereg object. sparsereg, plot.sparsereg, summary.sparsereg, violinplot, difference Examples ## Not run: set.seed(1) n<-500 k<-100 Sigma<-diag(k) Sigma[Sigma==0]<-.5

6 sparsereg X<-mvrnorm(n,mu=rep(0,k),Sigma=Sigma) y.true<-3+x[,2]*2+x[,3]*(-3) y<-y.true+rnorm(n) ##Fit a linear model with five covariates. s1<-sparsereg(y,x[,1:5]) print(s1) ## End(Not run) sparsereg Sparse regression for experimental and observational data. Usage Function for fitting a Bayesian LASSOplus model for sparse models with uncertainty, facilitating the discovery of various types of interactions. Function takes a dependent variable, an optional matrix of (pre-treatment) covariates, and a (optional) matrix of categorical treatment variables. Includes correct calculation of uncertainty estimates, including for data with repeated observations. sparsereg(y, X, treat=null, EM=FALSE, gibbs=200, burnin=200, thin=10, type="linear", scale.type="none", baseline.vec=null, id=null, id2=null, id3=null, save.temp=false, conservative=true) Arguments y X treat EM gibbs burnin thin type Dependent variable. Covariates. Typical vocabulary would refer to these as "pre-treatment" covariates. Matrix of categorical treatment variables. May be a matrix with one column in the case of there being only one treatment variable. Whether to fit model via EM or MCMC. EM is much quicker, but only returns point estimates. MCMC is slower, but returns posterior intervals and approximate confidence intervals. Number of posterior samples to save. Between each saved sample, thin samples are drawn. Number of burnin samples. Between each burnin sample, thin samples are drawn. These iterations will not be included in the resulting analysis. Extent of thinning of the MCMC chain. Between each posterior sample, whether burnin or saved, thin draws are made. Type of regression model to fit. Allowed types are linear or probit.

sparsereg 7 baseline.vec id, id2, id3 scale.type save.temp conservative Optional vector with one entry for each column of the treatment matrix. Each entry gives the baseline condition for that treatment, which then during preprocessing is omitted for estimation so it serves as an excluded category. Vectors the same lenght of the sample denoting clustering in the data. In a conjoint experiment with repeated observations, these correspond with respondent IDs. Up to three different sets of random effects are allowed. Indicates the types of interactions that will be created and used in estimation. scale.type="none" generates no interactions and corresponds to simply running LASSOplus with no interactions between variables. scale.type="tx" creates interactions between each X variable and each level of the treatment variables. scale.type="tt" creates interactions between each level of separate treatment variables. scale.type="ttx" interacts each X variable with all values generated by scale.type="tt". Note that users can create their own interactions of interest, select scale.type="none", to return the sparse version of the user specified model. Whether to save intermediate output in a file named temp_sparsereg. Useful for very long runs. Experimental. If set to FALSE, the estimate is less conservative in selecting a variable. Value The function sparsereg allows for estimation of a broad range of sparse regressions. The method allows for continuous, binary, and censored outcomes. In experimental data, it can be used for subgroup analysis. It pre-processes lower-order terms to generate higher-order interactions terms that are uncorrelated with their lower order component, with pre-processing generated through scale.type. In observational data, it can be used in place of a standard regression, especially in the presence of a large number of variables. The method also adjusts uncertainty estimates when there are repeated observations through using random effects. For example, a conjoint design may have the same people make several comparisons, or a panel data regression may have multiple observations on the same unit. The object contains the estimated posterior for all of the modeled effects, and analyzing the object is facilitated by the functions plot, summary, violinplot, and difference. beta.mode beta.mean beta.ci sigma.sq X varmat Matrix of sparse (mode) estimates with rows equal to number of effects and columns for posterior samples. Matrix of mean estimates with rows equal to number of effects and columns for posterior samples. These estimates are not sparse, but they do predict better than the mode. Matrix of effects used to calculate approximate confidence intervals. Vector of posterior estimate of error variance. Matrix of covariates fit. Includes interaction terms, depending on scale.type. Matrix of showing which lower-order terms correspond with which effects. Used in producing figures.

8 sparsereg baseline modeltype Vector of baseline categories for treatments. Type of sparsereg model fit. In this case, onestage. Used by summary functions. Egami, Naoki and Imai, Kosuke. 2015. "Causal Interaction in High-Dimension." Working paper. plot.sparsereg, summary.sparsereg, violinplot, difference, print.sparsereg Examples ## Not run: set.seed(1) n<-500 k<-5 treat<-sample(c("a","b","c"),n,replace=true,pr=c(.5,.25,.25)) treat2<-sample(c("a","b","c","d"),n,replace=true,pr=c(.25,.25,.25,.25)) Sigma<-diag(k) Sigma[Sigma==0]<-.5 X<-mvrnorm(n,m=rep(0,k),S=Sigma) y.true<-3+x[,2]*2+(treat=="a")*2 +(treat=="b")*(-2)+x[,2]*(treat=="b")*(-2)+ X[,2]*(treat2=="c")*2 y<-y.true+rnorm(n,sd=2) ##Fit a linear model. s1<-sparsereg(y, X, cbind(treat,treat2), scale.type="tx") s1.em<-sparsereg(y, X, cbind(treat,treat2), EM=TRUE, scale.type="tx") ##Summarize results from MCMC fit summary(s1) plot(s1) violinplot(s1) ##Summarize results from MCMC fit summary(s1.em) plot(s1.em) ##Extension using a baseline category s1.base<-sparsereg(y, X, treat, scale.type="tx", baseline.vec="a") summary(s1.base) plot(s1.base) violinplot(s1.base) ## End(Not run)

summary.sparsereg 9 summary.sparsereg Summaries for a sparse regression. The function prints and returns a summary table for a sparsereg object. Usage ## S3 method for class sparsereg summary(object,... ) Arguments object Object of type sparsereg.... Additional items to pass to summary. Options below. Generates a table for an object of class sparsereg. Additional arguments to pass summary below. interval Length of posterior interval to return. Must be between 0 and 1, default is.9. The symmetric interval is returned. ci Type of interval to return. Options are "quantile" (default) for quantiles and "HPD" for the highest posterior density interval. order How to order returned coefficients. Options are "magnitude", sorted by magnitude and omitting zero effects, "sort", sorted by size from highest to lowest and omitting zero effects, and "none" which returns all effects normal Whether to return the normal approximate confidence interval (default of TRUE) or posterior interval (FALSE). select Either "mode" or a number between 0 and 1. Whether to select variables for printing off the median of the mode (default) or off the probability of being non-zero. printit Whether to print a summary table. stage Currently this argument is ignored. sparsereg, plot.sparsereg, violinplot, difference, print.sparsereg

10 violinplot Examples ## Not run: set.seed(1) n<-500 k<-100 Sigma<-diag(k) Sigma[Sigma==0]<-.5 X<-mvrnorm(n,mu=rep(0,k),Sigma=Sigma) y.true<-3+x[,2]*2+x[,3]*(-3) y<-y.true+rnorm(n) ##Fit a linear model with five covariates. s1<-sparsereg(y,x[,1:5]) summary(s1) ## End(Not run) violinplot Function for plotting posterior distribution of effects of interest. The function produces a violin plot for specified effects. examining particular marginal effects of interest. This can be useful for presenting or Usage violinplot(x, columns=null, newlabels=null, type="mode", stage=null) Arguments x columns newlabels type stage Object of class sparsereg. A vector of numbers (or strings) corresponding to columns (or column names) to produce plots for. New labels for columns rather than variable names in object. If empty, variable names are used. Options are "mode" and "mean". Whether to plot the posterior mode or mean. Currently, this argument is ignored. Generates a violin plot for coefficients from object from class sparsereg. The desired coefficients can be requested using the columns argument and they can be assigned new names through newlabels.

violinplot 11 sparsereg, plot.sparsereg, summary.sparsereg, difference, print.sparsereg Examples ## Not run: set.seed(1) n<-500 k<-100 Sigma<-diag(k) Sigma[Sigma==0]<-.5 X<-mvrnorm(n,mu=rep(0,k),Sigma=Sigma) y.true<-3+x[,2]*2+x[,3]*(-3) y<-y.true+rnorm(n) ##Fit a linear model with five covariates. s1<-sparsereg(y,x[,1:5]) violinplot(s1,1:3) ## End(Not run)

Index Topic package sparsereg-package, 2 difference, 2, 4, 5, 8, 9, 11 plot.sparsereg, 3, 4, 4, 5, 8, 9, 11 print.sparsereg, 3, 5, 8, 9, 11 sparsereg, 3 5, 6, 9, 11 sparsereg-package, 2 summary.sparsereg, 3 5, 8, 9, 11 violinplot, 3 5, 8, 9, 10 12