Package mixphm. July 23, 2015

Similar documents
Package FisherEM. February 19, 2015

Package areaplot. October 18, 2017

Package Rambo. February 19, 2015

Package DPBBM. September 29, 2016

Package basictrendline

Package pwrrasch. R topics documented: September 28, Type Package

Package simsurv. May 18, 2018

Package Funclustering

Package MIICD. May 27, 2017

Package munfold. R topics documented: February 8, Type Package. Title Metric Unfolding. Version Date Author Martin Elff

epub WU Institutional Repository

Package mtsdi. January 23, 2018

Package SmoothHazard

Package longclust. March 18, 2018

Package bisect. April 16, 2018

Package datasets.load

Package ClustGeo. R topics documented: July 14, Type Package

Package glogis. R topics documented: March 16, Version Date

Package weco. May 4, 2018

Package r2d2. February 20, 2015

Package sensory. R topics documented: February 23, Type Package

Package sciplot. February 15, 2013

Package intccr. September 12, 2017

Package ICsurv. February 19, 2015

Package gems. March 26, 2017

Package Mondrian. R topics documented: March 4, Type Package

Package clustvarsel. April 9, 2018

The cmprsk Package. January 1, 2007

Package sizemat. October 19, 2018

Package pcapa. September 16, 2016

Package desplot. R topics documented: April 3, 2018

Package OLScurve. August 29, 2016

Package rereg. May 30, 2018

Package mrm. December 27, 2016

Package lvm4net. R topics documented: August 29, Title Latent Variable Models for Networks

Package sigqc. June 13, 2018

Package MixSim. April 29, 2017

Package cgh. R topics documented: February 19, 2015

Package GLDreg. February 28, 2017

Package CorporaCoCo. R topics documented: November 23, 2017

Package ssmn. R topics documented: August 9, Type Package. Title Skew Scale Mixtures of Normal Distributions. Version 1.1.

Package BayesCR. September 11, 2017

Package FMsmsnReg. March 30, 2016

Package SCVA. June 1, 2017

Package svapls. February 20, 2015

Package smoothr. April 4, 2018

Package cwm. R topics documented: February 19, 2015

Package dyncorr. R topics documented: December 10, Title Dynamic Correlation Package Version Date

Package MatrixCorrelation

Package beanplot. R topics documented: February 19, Type Package

Package semisup. March 10, Version Title Semi-Supervised Mixture Model

Package texteffect. November 27, 2017

Package ConvergenceClubs

Package tidylpa. March 28, 2018

Package vinereg. August 10, 2018

Package cattonum. R topics documented: May 2, Type Package Version Title Encode Categorical Features

Package compeir. February 19, 2015

Package manet. September 19, 2017

Package TBSSurvival. July 1, 2012

Package extweibquant

Package SeleMix. R topics documented: November 22, 2016

Package robets. March 6, Type Package

Package CINID. February 19, 2015

Package pampe. R topics documented: November 7, 2015

Package clampseg. May 25, 2018

Package spark. July 21, 2017

Package TBSSurvival. January 5, 2017

Package beast. March 16, 2018

Package fso. February 19, 2015

Package ICSOutlier. February 3, 2018

Package nprotreg. October 14, 2018

Package ManlyMix. November 19, 2017

Package marqlevalg. February 20, 2015

Package EnQuireR. R topics documented: February 19, Type Package Title A package dedicated to questionnaires Version 0.

Package acss. February 19, 2015

Package flexcwm. May 20, 2018

Package pairsd3. R topics documented: August 29, Title D3 Scatterplot Matrices Version 0.1.0

Package rankdist. April 8, 2018

Package logspline. February 3, 2016

Package rknn. June 9, 2015

Package OptimaRegion

Package capushe. R topics documented: April 19, Type Package

Package visualizationtools

Package qicharts. October 7, 2014

Package bdots. March 12, 2018

Package MTLR. March 9, 2019

Package bayeslongitudinal

Package naivebayes. R topics documented: January 3, Type Package. Title High Performance Implementation of the Naive Bayes Algorithm

Package TVsMiss. April 5, 2018

Package VarReg. April 24, 2018

Package tclust. May 24, 2018

Package ipft. January 4, 2018

Package waterfall. R topics documented: February 15, Type Package. Version Date Title Waterfall Charts in R

Package MPCI. October 25, 2015

The nor1mix Package. June 12, 2007

Package bayesdp. July 10, 2018

Package gsscopu. R topics documented: July 2, Version Date

Package GWRM. R topics documented: July 31, Type Package

Package lcc. November 16, 2018

Package Numero. November 24, 2018

Transcription:

Type Package Title Mixtures of Proportional Hazard Models Version 0.7-2 Date 2015-07-23 Package mixphm July 23, 2015 Fits multiple variable mixtures of various parametric proportional hazard models using the EM-Algorithm. Proportionality restrictions can be imposed on the latent groups and/or on the variables. Several survival distributions can be specified. Missing values and censored values are allowed. Independence is assumed over the single variables. License GPL-2 Imports graphics, stats, survival, lattice Depends R (>= 3.0.0) Encoding UTF-8 LazyData yes LazyLoad yes ByteCompile yes NeedsCompilation no Author Patrick Mair [cre, aut], Marcus Hudec [aut] Maintainer Patrick Mair <mair@fas.harvard.edu> Repository CRAN Date/Publication 2015-07-23 15:14:01 R topics documented: mixphm-package...................................... 2 msbic............................................ 2 phmclust........................................... 4 plot_hazard......................................... 6 screebic.......................................... 7 stableem.......................................... 8 webshop........................................... 10 WilcoxH........................................... 10 1

2 msbic Index 12 mixphm-package Mixtures of proportional hazard models Details This package fits multiple variable mixtures of various parametric proportional hazard models using the EM-Algorithm. Proportionality restrictions can be imposed on the latent groups and/or on the variables. Several survival distributions can be specified. Missing and censored values are allowed. Independence is assumed over the single variables. Package: mixphm Type: Package Version: 0.7-2 Date: 2015-07-23 License: GPL-2 Author(s) Patrick Mair, Marcus Hudec Maintainer: Patrick Mair <mair@fas.harvard.edu> References Mair, P., and Hudec, M. (2009). Multivariate Weibull mixtures with proportional hazard restrictions for dwell time based session clustering with incomplete data. Journal of the Royal Statistical Society, Series C (Applied Statistics), 58(5), 619-639. Kalbfleisch, J.D., and Prentice, R.L. (1980). The statistical analysis of failure time data. New York: Wiley. Celaux, G., and Govaert, G. (1992). A classification EM algorithm for clustering and two stochastic versions. Computational Statistics and Data Analysis, 14, 315-332. msbic PHM model selection with BIC This function fits models for different proportionality restrictions.

msbic 3 msbic(x, K, = "all", Sdist = "weibull", cutpoint = NULL, EMoption = "classification", EMstop = 0.01, maxiter = 100) Arguments x K Sdist cutpoint EMoption EMstop maxiter Data frame or matrix of dimension n*p with survival times (NA s allowed). A vector with number of mixture components. A vector with the s provided in phmclust: With "separate" no restrictions are imposed, "main.g" relates to a group main effect, "main.p" to the variables main effects. "main.gp" reflects the proportionality assumption over groups and variables. "int.gp" allows for interactions between groups and variables. If is "all", each model is fitted. Various survival distrubtions such as "weibull", "exponential", and "rayleigh". Cutpoint for censoring "classification" is based on deterministic cluster assignment, "maximization" on deterministic assignment, and "randomization" provides a posterior-based randomized cluster assignement. Stopping criterion for EM-iteration. Maximum number of iterations. Details Value Based on the output BIC matrix, model selection can be performed in terms of the number of mixture components and imposed proportionality restrictions. Returns an object of class BICmat with the following values: BICmat K Sdist Matrix with BIC values Vector with different components Vector with proportional hazard s Survival distribution See Also screebic ##Fitting 3 Weibull proportional hazard models (over groups, pages) for K=2,3 components res <- msbic(webshop, K = c(2,3), = c("main.p","main.g"), maxiter = 10) res

4 phmclust phmclust Fits mixtures of proportional hazard models This function allows for the computation of proportional hazards models with different distribution assumptions on the underlying baseline hazard. Several options for imposing proportionality restrictions on the hazards are provided. This function offers several variations of the EM-algorithm regarding the posterior computation in the M-step. phmclust(x, K, = "separate", Sdist = "weibull", cutpoint = NULL, EMstart = NA, EMoption = "classification", EMstop = 0.01, maxiter = 100) Arguments x K Sdist cutpoint EMstart EMoption EMstop maxiter Data frame or matrix of dimension n*p with survival times (NA s allowed). Number of mixture components. Imposing proportionality restrictions on the hazards: With "separate" no restrictions are imposed, "main.g" relates to a group main effect, "main.p" to variable main effects. "main.gp" reflects the proportionality assumption over groups and variables. "int.gp" allows for interactions between groups and variables. Various survival distrubtions such as "weibull", "exponential", and "rayleigh". Integer value with upper bound for observed dwell times. Above this cutpoint, values are regarded as censored. If NULL, no censoring is performed Vector of length n with starting values for group membership, NA indicates random starting values. "classification" is based on deterministic cluster assignment, "maximization" on deterministic assignment, and "randomization" provides a posterior-based randomized cluster assignement. Stopping criterion for EM-iteration. Maximum number of iterations. Details The "separate" corresponds to an ordinary mixture model. "main.g" imposes proportionality restrictions over variables (i.e., the group main effect allows for free-varying variable hazards). "main.p" imposes proportionality restrictions over groups (i.e., the variable main effect allows for free-varying group hazards). If clusters with only one observation are generated, the algorithm stops.

phmclust 5 Value Returns an object of class mws with the following values: K iter Sdist likelihood pvisit se.pvisit shape scale group posteriors npar aic bic clmean se.clmean clmed Number of components Number of EM iterations Proportionality restrictions used for estimation Assumed survival distribution Log-likelihood value for each iteration Matrix of prior probabilities due to NA structure Standard errors for priors Matrix with shape parameters Matrix with scale parameters Final deterministic cluster assignment Final probabilistic cluster assignment Number of estimated parameters Akaike information criterion Bayes information criterion Matrix with cluster means Standard errors for cluster means Matrix with cluster medians References Mair, P., and Hudec, M. (2009). Multivariate Weibull mixtures with proportional hazard restrictions for dwell time based session clustering with incomplete data. Journal of the Royal Statistical Society, Series C (Applied Statistics), 58(5), 619-639. Celaux, G., and Govaert, G. (1992). A classification EM algorithm for clustering and two stochastic versions. Computational Statistics and Data Analysis, 14, 315-332. See Also stableem, msbic ## Fitting a Weibll mixture model (3 components) is fitted with classification EM ## Observations above 600sec are regarded as censored res1 <- phmclust(webshop, K = 3, cutpoint = 600) res1 summary(res1)

6 plot_hazard ## Fitting a Rayleigh Weibull proportional hazard model (2 components, proportional over groups) res2 <- phmclust(webshop, K = 2, = "main.p", Sdist = "rayleigh") res2 summary(res2) plot_hazard Plot functions Plotting functions for hazard rates, survival times and cluster profiles. plot_hazard(x, gr.subset, var.subset, group = TRUE, xlim = NA, ylim = NA, xlab = "Survival Time", ylab = "Hazard Function", main = "Hazard Functions", type = "l", lty = 1, lwd = 1, col = NA, legpos = "right",...) plot_survival(x, gr.subset, var.subset, group = TRUE, xlim = NA, ylim = NA, xlab = "Survival Time", ylab = "Survival Function", main = "Survival Functions", type = "l", lty = 1, lwd = 1, col = NA, legpos = "right",...) plot_profile(x, = "mean", type = "b", pch = 19, lty = 1, lwd = 1, col = NA, xlab = "Variables", leglab = NA, ylab = NA, main = NA, legpos = "topright",...) Arguments x gr.subset var.subset group xlim ylim xlab ylab main leglab type lty object of class mws from phmclust Optional vector for plotting subset of clusters Optional vector for plotting subset of variables if TRUE hazard/survival plots are produced for each group, if FALSe for each variable "mean" for cluster mean profile plot and "median" for cluster median profile plot limits for x-axis limits for y-axis label for x-axis label for y-axis title of the plot label for the legend type of plot line type

screebic 7 lwd pch col legpos See Also line width type of plotting points colors; if NA it is determined in the function position of the legend; "topright","topleft","bottomright", "bottomleft","left","right","top", or "center"... Additional plot options phmclust ##Plots for mixture Weibull model with 3 components res <- phmclust(webshop, 3) ##Hazard plot for first and third group, all pages plot_hazard(res, gr.subset = c(1,3), group = TRUE, xlab = "Dwell Time") ##Survival plot for each group, first 6 pages plot_survival(res, var.subset= 1:6, group = FALSE, xlab = "Dwell Time") ##Cluster profile plot plot_profile(res, xlab = "Pages", ylab = "Mean Dwell Time", main = "Cluster Profile") screebic Scree plot of BIC s This function produces a scree plot on the basis of the BIC values in msbic. screebic(x, lty = 1, col = NA, pch = 19, type = "b", main = "BIC Screeplot", xlab = "Number of Components", ylab = "BIC", legpos = "topright",...) Arguments x lty col pch Object of class mws from msbic Line type Line colors; if NA, colors are determined automatically Value for plotting points

8 stableem type Type of plot main Plot title xlab Label for x-axis ylab Label for y-axis legpos position of the legend... Additional plot parameters See Also msbic ##Fitting all Weibull proportional hazard models for K=2,3,4 components res <- msbic(webshop, K = c(2,3,4), = "all", maxiter = 5) screebic(res) stableem Stable EM solution This function performs the clustering for different EM starting values in order to find a stable solution. stableem(x, K, numemstart = 5, = "separate", Sdist = "weibull", cutpoint = NULL, EMoption = "classification", EMstop = 0.0001, maxiter = 1000, print.likvec = TRUE) Arguments x K numemstart Sdist cutpoint Data frame or matrix of dimension n*p with survival times (NA s allowed). Number of mixture components. Number of different starting solutions Imposing proportionality restrictions on the hazards: With separate no restrictions are imposed, main.g relates to a group main effect, main.p to the variables main effects. main.gp reflects the proportionality assumption over groups and variables. int.gp allows for interactions between groups and variables. Various survival distrubtions such as weibull, exponential, and rayleigh. Integer value with upper bound for observed dwell times. Above this cutpoint, values are regarded as censored. If NULL, no censoring is performed

stableem 9 EMoption EMstop maxiter print.likvec classification is based on deterministic cluster assignment, maximization on deterministic assignment, and randomization provides a posterior-based randomized cluster assignement. Stopping criterion for EM-iteration. Maximum number of iterations. If TRUE the likelihood values for different starting solutions are printed. Details After the computation of the models for different starting solutions using the function phmclust the best model is chosen, i.e., the model with the largest likelihood value. The output values refer to this final model. Value Returns an object of class mws with the following values: K iter Sdist likelihood pvisit se.pvisit shape scale group posteriors npar aic bic clmean se.clmean clmed Number of components Number of EM iterations Method with propotionality restrictions used for estimation Assumed survival distribution Log-likelihood value for each iteration Matrix of prior probabilities due to NA structure Standard errors for priors Matrix with shape parameters Matrix with scale parameters Final deterministic cluster assignment Final probabilistic cluster assignment Number of estimated parameters Akaike information criterion Bayes information criterion Matrix with cluster means Standard errors for cluster means Matrix with cluster medians See Also phmclust,msbic ## Exponental mixture model with 2 components for 4 different starting solutions res <- stableem(webshop, K = 2, numemstart = 4, Sdist = "exponential") res summary(res)

10 WilcoxH webshop Webshop dataset for mixphm package This artificial data set represents dwell times in seconds of 333 sessions on 7 webpage categories of a webshop. Missing values indicate that the corresponding session did not visit a particular page. Format Numeric matrices of data frames with subjects as rows and variables as columns. Missing values are coded as NA (which corresponds to 0 survival time). str(webshop) WilcoxH Tests of Zero Correlations Among P Variables This function computes Wilcox H-test and the Steiger-Hakstian-Test for testing H0: R = I. WilcoxH(x, use = "pairwise.complete.obs") Arguments x use Data frame or matrix of dimension n*p with survival times (NA s allowed). Treatment of NA s for the computation of the correlation matrix (see cor()). Either "all.obs", "complete.obs", or "pairwise.complete.obs" Details This test is robust against violations of normality. Since phmclust() assumes independence across pages, this test can be used to explore the appropriateness of the data.

WilcoxH 11 Value Returns an object of class "wilcoxh" with the following values: Rmat SH.res WH.res Correlation matrix Results for Steiger-Hakstian-Test Results for Wilcox H-test References Wilcox, R. (1997). Tests of independence and zero correlations among P variables. Biometrical Journal, 2, 183-193. See Also phmclust res <- WilcoxH(webshop) res

Index Topic datasets webshop, 10 Topic hplot plot_hazard, 6 screebic, 7 Topic models msbic, 2 phmclust, 4 stableem, 8 WilcoxH, 10 Topic package mixphm-package, 2 loglik.mws (phmclust), 4 mixphm (mixphm-package), 2 mixphm-package, 2 msbic, 2, 5, 8, 9 phmclust, 4, 7, 9, 11 plot_hazard, 6 plot_profile (plot_hazard), 6 plot_survival (plot_hazard), 6 print.msbic (msbic), 2 print.mws (phmclust), 4 print.wilcoxh (WilcoxH), 10 screebic, 3, 7 stableem, 5, 8 summary.mws (phmclust), 4 webshop, 10 WilcoxH, 10 12