Package StagedChoiceSplineMix

Similar documents
Package sensory. R topics documented: February 23, Type Package

Package RAMP. May 25, 2017

Package milr. June 8, 2017

Package glmmml. R topics documented: March 25, Encoding UTF-8 Version Date Title Generalized Linear Models with Clustering

Package semisup. March 10, Version Title Semi-Supervised Mixture Model

Package palmtree. January 16, 2018

Package mrm. December 27, 2016

Package SSLASSO. August 28, 2018

Package StVAR. February 11, 2017

Package hiernet. March 18, 2018

Package mcemglm. November 29, 2015

Package misclassglm. September 3, 2016

Package intccr. September 12, 2017

Package ECctmc. May 1, 2018

Package glinternet. June 15, 2018

Package DPBBM. September 29, 2016

Package robustreg. R topics documented: April 27, Version Date Title Robust Regression Functions

Package vinereg. August 10, 2018

Package ssmn. R topics documented: August 9, Type Package. Title Skew Scale Mixtures of Normal Distributions. Version 1.1.

Package extweibquant

Package cwm. R topics documented: February 19, 2015

Package flam. April 6, 2018

Package ssa. July 24, 2016

Package simsurv. May 18, 2018

Package FMsmsnReg. March 30, 2016

Package GLDreg. February 28, 2017

Package PTE. October 10, 2017

Package ICsurv. February 19, 2015

Package BayesCR. September 11, 2017

Package FisherEM. February 19, 2015

Package treedater. R topics documented: May 4, Type Package

Package CatEncoders. March 8, 2017

Package gpdtest. February 19, 2015

Package TVsMiss. April 5, 2018

Package CALIBERrfimpute

Package gee. June 29, 2015

Package clusternomics

Clustering Lecture 5: Mixture Model

The glmmml Package. August 20, 2006

Package CompGLM. April 29, 2018

CATLVM User Guide for Latent Class-Profile Analysis

Package SAM. March 12, 2019

Package Rambo. February 19, 2015

Package mixphm. July 23, 2015

Package CVR. March 22, 2017

Package NetworkRiskMeasures

Package binomlogit. February 19, 2015

Package SeleMix. R topics documented: November 22, 2016

Package PrivateLR. March 20, 2018

Package simglm. July 25, 2018

Package MetaLasso. R topics documented: March 22, Type Package

Package pnmtrem. February 20, Index 9

Package nsprcomp. August 29, 2016

Package TBSSurvival. July 1, 2012

Package DSBayes. February 19, 2015

Package DTRreg. August 30, 2017

Package bayescl. April 14, 2017

Package wskm. July 8, 2015

Package RCA. R topics documented: February 29, 2016

Package mtsdi. January 23, 2018

Package freeknotsplines

Package TestDataImputation

Package missforest. February 15, 2013

Package rgcvpack. February 20, Index 6. Fitting Thin Plate Smoothing Spline. Fit thin plate splines of any order with user specified knots

Package datasets.load

Package bisect. April 16, 2018

Package oglmx. R topics documented: May 5, Type Package

Package assortnet. January 18, 2016

Package longclust. March 18, 2018

Package subplex. April 5, 2018

Package SmoothHazard

Package VarReg. April 24, 2018

Package texteffect. November 27, 2017

Package penalizedlda

Package polywog. April 20, 2018

Package MultiRR. October 21, 2015

Package stapler. November 27, 2017

Package caic4. May 22, 2018

CATLVM User Guide for Latent Transition Analysis

Package TBSSurvival. January 5, 2017

Package epitab. July 4, 2018

Package sfc. August 29, 2016

Package SAENET. June 4, 2015

Package hsiccca. February 20, 2015

Package bingat. R topics documented: July 5, Type Package Title Binary Graph Analysis Tools Version 1.3 Date

Package survivalmpl. December 11, 2017

Package mixsqp. November 14, 2018

Package flexcwm. May 20, 2018

Package automl. September 13, 2018

Package xwf. July 12, 2018

Package strat. November 23, 2016

Package GenSA. R topics documented: January 17, Type Package Title Generalized Simulated Annealing Version 1.1.

Package ibm. August 29, 2016

Package quadprogxt. February 4, 2018

Package ClusterBootstrap

Package endogenous. October 29, 2016

Package dkdna. June 1, Description Compute diffusion kernels on DNA polymorphisms, including SNP and bi-allelic genotypes.

Package attrcusum. December 28, 2016

Package glassomix. May 30, 2013

Package stockr. June 14, 2018

Transcription:

Type Package Package StagedChoiceSplineMix August 11, 2016 Title Mixture of Two-Stage Logistic Regressions with Fixed Candidate Knots Version 1.0.0 Date 2016-08-10 Author Elizabeth Bruch <ebruch@umich.edu>, Fred Feinberg <feinf@umich.edu>, Kee Yeun Lee <keeyeun.lee@polyu.edu.hk> Maintainer Kee Yeun Lee <keeyeun.lee@polyu.edu.hk> Description Analyzing a mixture of two-stage logistic regressions with fixed candidate knots. See Bruch, E., F. Feinberg, K. Lee (in press)<doi:10.1073/pnas.1522494113>. Depends R (>= 3.0.1), plyr, base, stats, methods License GPL-2 RoxygenNote 5.0.1 NeedsCompilation no Repository CRAN Date/Publication 2016-08-11 13:13:59 R topics documented: bs.se............................................. 2 gen.init........................................... 2 move.knot.......................................... 3 StagedChoiceSplineMix.................................. 4 twostglogitregmixem.................................... 9 Index 11 1

2 gen.init bs.se Performs a parametric bootstrapping standard error approximation Description The function performs a parametric bootstrapping standard error approximation for mixtures of two-stage logistic regressions. Usage bs.se(output, B = 100) Arguments output B output of StagedChoiceSplineMix number of bootstrap samples (def:100) See Also StagedChoiceSplineMix boot.se Examples ## parametric bootstrapping to calculate standard error: ## use best output from StagedChoiceSplineMix exampl #out.se<-bs.se(output=out$best, B=100) #out.se$betab.se #out.se$betaw.se #out.se$lambda.se gen.init Generates initial values for mixtures of logistic regressions Description A sub-function of StagedChoiceSplineMix. This function generates initial values for mixtures of logistic regressions. This function is used if starting points for parameters are not specified by the user or when the EM algorithm needs to be initialized due to errors. Usage gen.init(y, x, k, er)

move.knot 3 Arguments y x k er The total number of errors. See Also StagedChoiceSplineMix "mixtools" package version 1.0.3 move.knot Generates a new set of knots for the following iteration Description A sub-function of StagedChoiceSplineMix. This function generates a new set of knots for the following iteration. Please refer to Bruch et al. (in press) for the precise rule used. Usage move.knot(num.knot, sp.knots, k) Arguments num.knot sp.knots k References Bruch, E., F. Feinberg, K. Lee (in press), "Detecting Cupid s Vector: Extracting Decision Rules from Online Dating Activity Data," Proceedings of the National Academy of Sciences. See Also StagedChoiceSplineMix

4 StagedChoiceSplineMix StagedChoiceSplineMix Performs iterations between an EM algorithm for a mixture of twostage logistic regressions with fixed candidate knots and knot movements Description Usage The function performs iterations between an EM algorithm for a mixture of two-stage logistic regressions with fixed candidate knots and knot movements. The function generates candidate knots for each splined variable. Three sub-functions (gen.init,twostglogitregmixem,move.knot) are used within the function. StagedChoiceSplineMix(data = NULL, M = 100, sp.cols = NULL, num.knots = NULL, sp.knots = NULL, betab = NULL, betaw = NULL, lambda = NULL, k = 2, nst = 20, epsilon = 1e-06, maxit = 500, maxrestarts = 100, maxer = 20) Arguments data Raw data for StagedChoiceSplineMix Format 1st column: id 2nd column: the 1st stage binary variable (browsing) 3rd column: the 2nd stage binary variable (writing) conditional on the 1st stage binary variable. It should be left blank (or NA) if 1st stage variable is equal to 0 The rest of columns: covariates including splined variables See Format(below) for details M number of iterations (def: 100) sp.cols vector of column numbers of splined variables in a data set (if sp.col is 0, twostglogitregmixem function should be used) num.knots sp.knots betab betaw vector of numbers of knot candidates for splined variables. (def: a vector all of whose entries are "19") list of knot configurations. For each splined variable, a knot configuration is a k by 4 matrix whose rows represent latent classes and columns represent knots [browsing knot 1, browsing knot 2, writing knot 1, writing knot 2]. (def: approximately 1/3 and 2/3 of knot candidates for knot 1 and knot 2 respectively) matrix of starting points for betab (browsing parameters). If not given, gen.init generates starting points. matrix of starting points for betaw (writing parameters). If not given, gen.init generates starting points.

StagedChoiceSplineMix 5 lambda vector of starting points for lambda (membership proportion). If not given, gen.init generates starting points k number of latent classes (def: 2) nst epsilon maxit maxrestarts maxer Format number of random multiple starting points to try given a knot configuration. For each knot configuration, the output with the largest log-likelihood is stored among nst trials. (def: 20) stopping tolerance for the EM algorithm. (def:1e-06) maximum number of the EM iterations allowed. If convergence is not declared before maxit, the EM algorithm stops with an error message and generates new starting points. (def: 500) maximum number of restarts (due to a singularity problem) allowed in the EM iterations. If convergence is not declared before maxrestarts, the algorithm stops with an error message and generates new starting points. (def: 100) maximum number of errors allowed within a given knot configuration. If convergence is not declared before maxer, it tries a new knot configuration. (def: 20) The simulated data(simdata) in the examples is generated using information below: 1. User identifier: userid Number of users: 700 Number of observations per user: 200 2. Dependent variables: browsed: 1st stage binary dependent variable wrote: 2nd stage binary dependent variable 3. Covariates: x1 (discrete variable): random draws from a binomial distribution with 0.7 success probability sp1 (splined variable): random draws from a uniform distribution between -2 and 2 4. Number of latent classes: 3 5. True parameters: i.betab(browsing) Class1 Class2 Class3 intercept 1.5 2 0.3 x1-0.2-0.1 0.6 sp1-2.5 0.5 1 sp1 knot1 3-1 1 sp1 knot2 2-0.5-1.5

6 StagedChoiceSplineMix ii.betaw(writing) Class1 Class2 Class3 intercept -2-1 0.3 x1-0.3-0.2 0.2 sp1-3 -2 0 sp1 knot1 3 2-1.5 sp1 knot2 3 2-1 iii.lambda(membership proportion) Class1 Class2 Class3 0.3 0.3 0.4 6. True knot configuration: 19 candidate knots for sp1 (20-iles) knot1(browsing) knot2(browsing) knot1(writing) knot2(writing) Class1 8 14 4 11 Class2 5 15 5 12 Class3 5 13 7 14 Value StagedChoiceSplineMix returns a list of the following items: best: best output, i.e., that giving the largest log-likelihood among M outputs. best$loglik: log-likelihood best$betab: parameter estimates of betab best$betaw: parameter estimates of betaw best$lambda: parameter estimates of lambda best$sp.knots: knot configuration loglike: vector of log-likelihoods for M outputs. References Bruch, E., F. Feinberg, K. Lee (in press), "Detecting Cupid s Vector: Extracting Decision Rules from Online Dating Activity Data," Proceedings of the National Academy of Sciences. See Also gen.init move.knot twostglogitregmixem "mixtools" package version 1.0.3

StagedChoiceSplineMix 7 Examples ####################################### ###### 1. Generate data (simdata) ##### set.seed(77) k<-3 betab<-matrix(c(1.5,2.0,0.3,-0.2,-0.1,0.6,-2.5,0.5,1,3,-1,1,2,-0.5,-1.5),5,k,byrow=true) betaw<-matrix(c(-2.0,-1,0.3,-0.3,-0.2,0.2,-3,-2,0,3,2,-1.5,3,2,-1),5,k,byrow=true) sp1.knots<-matrix(c(8,14,4,11,5,15,5,12,5,13,7,14),3,4,byrow=true) lambda<-c(0.3,0.3,0.4) # Large data set #n_id<-700 #nobsb<-200 # Small data set n_id<-100 nobsb<-100 nb<-n_id*nobsb idb <- rep(1:n_id, each=nobsb) xb.1<-rbinom(nb,size=1,prob=0.7) xb.com<-cbind(1,xb.1) xb.sp1<-runif((nb),-2,2) xb<-cbind(xb.com,xb.sp1) sp1.mat.b<- matrix(double(20*nb),ncol=20) sp1.mat.b[,1]<-xb.sp1 sp1.quan<-quantile(xb.sp1,c(0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5, 0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95)) for (i in 1:19){ sp1.mat.b[,(i+1)]<-xb.sp1-sp1.quan[i] sp1.mat.b[sp1.mat.b<0]<-0 sp1.mat.b[,1]<-xb.sp1 xb.class<-list() xb.class[[i]]<-cbind(xb.com,sp1.mat.b[,1],sp1.mat.b[,(sp1.knots[i,1:2])+1]) xbetab<-matrix(double(nb*k),ncol=k) xbetab[,i]<-data.matrix(xb.class[[i]])%*%betab[,i] num.idb<-as.vector(ddply(data.frame(idb),"idb",count)[,-1]) w<-sample(c(1:k), n_id, replace = TRUE, prob = lambda)

8 StagedChoiceSplineMix wb<-rep(w,num.idb) yb.temp<-matrix(double(nb*k),ncol=k) yb.temp[,i]<-rbinom((nb), size = 1, prob = (1/(1+exp(-xbetab[, i])))) yb<-sapply(1:nb, function(i) yb.temp[i,wb[i]]) idw<-idb[yb==1] nw<-length(idw) sp1.mat.w<-sp1.mat.b[yb==1,] xw.com<-xb.com[yb==1,] xw<-xb[yb==1,] xw.class<-list() xw.class[[i]]<-cbind(xw.com,sp1.mat.w[,1],sp1.mat.w[,(sp1.knots[i,3:4])+1]) xbetaw<-matrix(double(nw*k),ncol=k) xbetaw[,i]<-data.matrix(xw.class[[i]])%*%betaw[,i] num.idw<-as.vector(ddply(data.frame(idw),"idw",count)[,-1]) ww<-wb[yb==1] yw.temp<-matrix(double(nw*k),ncol=k) yw.temp[,i]<-rbinom((nw), size = 1, prob = (1/(1+exp(-xbetaw[, i])))) yw<-sapply(1:nw, function(i) yw.temp[i,ww[i]]) yb.aug<-cbind(1:length(yb),yb) yw.aug<-cbind(which(yb==1),yw) colnames(yb.aug)[1]<-"num" colnames(yw.aug)[1]<-"num" ybyw<-merge(yb.aug,yw.aug, all = TRUE)[,-1] simdata<-cbind(idb,ybyw,xb.1,sp1.mat.b[,1]) simdata[is.na(simdata)]<-"" simdata[,3]<-as.integer(simdata[,3]) colnames(simdata)[1]<-"userid" colnames(simdata)[2]<-"browsed" colnames(simdata)[3]<-"wrote" colnames(simdata)[4]<-"x1" colnames(simdata)[5]<-"sp1" ######################################## ##### 2. Run StagedChoiceSplineMix ##### ## number of latent classes

twostglogitregmixem 9 set.seed(66) k<-3 ## starting points: true parameters used in the data generation (optional) betab<-matrix(c(1.5,2.0,0.3,-0.2,-0.1,0.6,-2.5,0.5,1,3,-1,1,2,-0.5,-1.5),5,k,byrow=true) betaw<-matrix(c(-2.0,-1,0.3,-0.3,-0.2,0.2,-3,-2,0,3,2,-1.5,3,2,-1),5,k,byrow=true) lambda<-c(0.3,0.3,0.4) ## number of random multiple starting points to try given a knot configuration nst<-1 ## vector of the columns of spline variables in the data set (required) sp.cols<-5 # vector of the numbers of candidate knots for splined variables (optional) num.knots<-19 ## true knot configuration used in the data generation sp1.knots<-matrix(c(8,14,4,11,5,15,5,12,5,13,7,14),3,4,byrow=true) ## list of knot configuration of spline variables (optional) sp.knots<-list(sp1.knots) ## Run "StagedChoiceSplineMix" out<-stagedchoicesplinemix(data=simdata, M=1, sp.cols=sp.cols, num.knots, sp.knots, betab, betaw, lambda, k=k, nst=nst, epsilon=1e-06, maxit=500, maxrestarts=100, maxer=20) ## output out$loglike # vector of M log-likelihoods out$best$loglik # log-likelihood of the best output out$best$betab # betab estimates of the best output out$best$betaw # betaw estimates of the best output out$best$lambda # lambda estimates of the best output out$best$sp.knots # knot configuration of the best output twostglogitregmixem Performs an EM algorithm for mixtures of two-stage logistic regressions Description Usage A sub-function of StagedChoiceSplineMix. This function performs an EM algorithm for mixtures of two-stage logistic regressions. twostglogitregmixem(yb = NULL, idb = NULL, yw = NULL, idw = NULL, xb = NULL, xw = NULL, xb.class = NULL, xw.class = NULL, sp.cols = NULL, num.knots = NULL, sp.knots = NULL, betab = NULL,

10 twostglogitregmixem Arguments betaw = NULL, lambda = NULL, k = NULL, epsilon = 1e-06, maxit = 500, maxrestarts = 100, maxer = 20, verb = FALSE) yb The 1st stage binary variable (browsing). See StagedChoiceSplineMix for details. idb yw idw xb xw xb.class xw.class sp.cols num.knots sp.knots betab betaw lambda k epsilon maxit maxrestarts maxer verb See Also StagedChoiceSplineMix regmixem Corresponding id for yb. The 2nd stage binary variable (writing). Corresponding id for yw. Corresponding covariance matrix for yb. Corresponding covariance matrix for yw. Corresponding latent classes for yb. Corresponding latent classes for yw.

Index boot.se, 2 bs.se, 2 gen.init, 2, 4 6 move.knot, 3, 4, 6 regmixem, 10 StagedChoiceSplineMix, 2, 3, 4, 9, 10 twostglogitregmixem, 4, 6, 9 11