NONPARAMETRIC SUMMARY CURVES FOR COMPETING RISKS IN R

By Pawel Paczuski¹
University of Michigan
November 19, 2012

Abstract

In survival analysis, when a subject may fail due to one of K ≥ 2 causes, we have a situation of competing risks. An example is cause-specific mortality, where a death may occur as a result of cancer, heart disease, or other causes. Two estimators may be used to perform competing risks analysis: 1) the complement of the Kaplan-Meier estimator and 2) the cumulative incidence function. In this paper, we show how to implement these methods in R using an example dataset. These brief explanations closely follow Klein and Moeschberger (2003), Section 4.7.

Contents

1. Introduction
   1.1 First estimator: the complement of the Kaplan-Meier estimator
   1.2 Second estimator: the cumulative incidence function
2. Example and implementation in R: BMT data
   2.1 The cumulative incidence function in R
   2.2 Interaction plot in R
   2.3 KM Table 4.8 reconstruction
   2.4 Multiple groups example
3. Final notes
References
Appendix: R Code
Competing Risks in R: Summary Handout

¹ Send suggestions about this paper to pbpacz at umich dot edu

1. Introduction

Competing risks occur when a subject may fail due to more than one cause. Klein and Moeschberger (2003), Section 4.7, describe three methods for summarizing competing risks data: 1) the complement of the Kaplan-Meier estimator, 2) the cumulative incidence function, and 3) the conditional probability function for the competing risk. In this paper, we focus on implementing the first two in the statistical package R. For a more thorough treatment, we refer the reader to Klein and Moeschberger (2003), Sections 2.7 and 4.7.

The example dataset is the Bone Marrow Transplant for Leukemia dataset described in Klein and Moeschberger (2003), Section 1.3. In this study, treatment failure is defined as death in remission or relapse, whichever comes first. Here, we treat the two as competing risks and are interested in the probability of each event occurring over time.

1.1 First estimator: the complement of the Kaplan-Meier estimator

One method to study competing risks uses the complement of the Kaplan-Meier estimator. When we consider one event, E, we treat occurrences of the other event(s) as censored observations and estimate the probability of E. This estimator may be interpreted as the probability of E occurring by time t if the risk of the other event(s) were removed, that is, in a hypothetical world where the other event(s) are not possible. This is rarely of clinical interest. However, the estimator is related to the cumulative incidence function, which we describe next.

1.2 Second estimator: the cumulative incidence function

The cumulative incidence function, CIF, is defined in Klein and Moeschberger (4.7.1) as:

\[
\mathrm{CI}(t) =
\begin{cases}
0, & t < t_1, \\
\displaystyle \sum_{t_i \le t} \left\{ \prod_{j=1}^{i-1} \left[ 1 - \frac{d_j + r_j}{Y_j} \right] \right\} \frac{r_i}{Y_i}, & t \ge t_1,
\end{cases}
\]

where $t_i$, $i = 1, \dots, k$, are the distinct times when one of the competing risks occurs. At time $t_i$: $Y_i$ is the number of subjects at risk, $r_i$ is the number of subjects with an occurrence of the event of interest at this time, and $d_i$ is the number of subjects with an occurrence of any of the other competing events at this time.
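The two estimators can be computed by hand on a small example. The sketch below is base R only; the toy data and the helper names `cif_by_hand` and `one_minus_km` are hypothetical and are not part of cmprsk or of the original paper. It follows the product form of (4.7.1) for the CIF and, for the complement of the Kaplan-Meier estimator, treats the competing event as censored.

```r
# Toy data (hypothetical, not the BMT data).
# status: 0 = censored, 1 = event of interest, 2 = competing event
time   <- c(2, 3, 3, 5, 6, 8, 9, 11)
status <- c(1, 2, 0, 1, 2, 0, 1, 0)

# Cumulative incidence for `cause`, following the product form of (4.7.1)
cif_by_hand <- function(time, status, cause) {
  t_i <- sort(unique(time[status != 0]))                     # distinct event times
  Y   <- sapply(t_i, function(s) sum(time >= s))             # number at risk
  r   <- sapply(t_i, function(s) sum(time == s & status == cause))
  d   <- sapply(t_i, function(s) sum(time == s & status != 0 & status != cause))
  S      <- cumprod(1 - (d + r) / Y)   # all-cause Kaplan-Meier at t_i
  S_prev <- c(1, head(S, -1))          # all-cause Kaplan-Meier just before t_i
  data.frame(time = t_i, cif = cumsum(S_prev * r / Y))
}

# Complement of Kaplan-Meier for `cause`, other events treated as censored
one_minus_km <- function(time, status, cause) {
  t_i <- sort(unique(time[status == cause]))
  Y   <- sapply(t_i, function(s) sum(time >= s))
  r   <- sapply(t_i, function(s) sum(time == s & status == cause))
  data.frame(time = t_i, one_minus_km = 1 - cumprod(1 - r / Y))
}

cif1 <- cif_by_hand(time, status, cause = 1)
okm1 <- one_minus_km(time, status, cause = 1)
```

On these toy data the cumulative incidence for cause 1 ends at 0.5, while the complement of the Kaplan-Meier estimator ends at 0.65. The gap reflects the "competing event removed" interpretation of the first estimator.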

Note that:

1) $d_i + r_i$ is the number of subjects with an occurrence of any of the competing risks at $t_i$;
2) independent random censoring is not counted as one of the competing risks and only affects $Y_i$;
3) for $t \ge t_1$, the CIF is:

\[
\mathrm{CI}(t) = \sum_{t_i \le t} \hat{S}(t_i-) \, \frac{r_i}{Y_i}
\]

where $\hat{S}(t_i-)$ is the Kaplan-Meier estimator, evaluated just before $t_i$, obtained by treating any one of the competing risks as an event. Thus, it may be shown that the sum of the cumulative incidences for all competing risks is $1 - \hat{S}(t)$.

The CIF estimates the probability that the event of interest occurs before time t and that it occurs before any of the competing causes of failure. The variance of the CIF is estimated by KM (4.7.2). Confidence intervals may be constructed in the standard manner.

2. Example and implementation in R: BMT data

We will illustrate both estimators using the ALL group in the BMT data, as is done in KM 4.7. R code is also provided in the appendix.² The data are right-censored, and the two competing risks of interest are death in remission (TRM) and relapse. The package cmprsk allows us to perform the necessary calculations. The dataset is part of the package KMsurv.

    library(cmprsk)    # also loads library(survival)
    library(survival)  # loaded explicitly for Surv() and survfit()
    library(KMsurv)
    data(bmt)          # load bmt data
    ## select group 1 only (ALL patients)
    bmt1 <- subset(bmt, bmt$group==1)
    bmt1 <- bmt1[order(bmt1$t2),]

² As well as at

In this dataset, we have:

    t2: Disease Free Survival Time (Time To Relapse, Death or End of Study)
    d1: Death Indicator
    d2: Relapse Indicator
    d3: Disease Free Survival Indicator (0 if alive, 1 if death or relapse)

For our implementation we need a single variable that codes the censored observations and the two competing risks:

    bmt1$type <- bmt1$d1+bmt1$d2
    > table(bmt1$type)

Now, bmt1$type is 0 for the originally censored observations, 1 for TRM, and 2 for relapse.

2.1 The cumulative incidence function in R

We obtain the cumulative incidence function with the cuminc() function in cmprsk. Usage is:

    ## cuminc(ftime, fstatus, group, cencode=0,...) where
    ## ftime is failure time, fstatus is failure status, group is optional group indicator
    ## cencode is the code for censored observations in fstatus
    fit.1 <- cuminc(bmt1$t2, bmt1$type, cencode=0)

This returns the fit, which is printed at some default timepoints. We can obtain the estimates at our own timepoints with the timepoints() function:

    ## using timepoints(w, times)
    ## w is a cuminc() object, times is vector of times
    fits.c <- timepoints(fit.1,bmt1$t2)
    > fits.c

The result is a list with components $est (the estimates) and $var (their variances). A default plot is easily obtained:

    ## plot competing risks
    plot(fit.1)

To customize the plot, call the plotting function from the package directly:

    ## to customize, use custom function from package
    plot.cuminc(fit.1, col=1:2, curvlab=c("TRM", "Relapse"),
                xlim=c(0,1000), xlab="Days Post Transplant", ylab="Probability",
                main="Competing Risks:\nCumulative Incidence for ALL group")
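The event-type coding bmt1$type <- bmt1$d1 + bmt1$d2 used above yields 1 for TRM and 2 for relapse only if every relapse row also has d1 = 1. A row with d1 = 0 and d2 = 1 (relapse, alive at last follow-up) would be coded 1. Whether such rows occur in the ALL group can be checked with table(bmt1$d1, bmt1$d2). The sketch below uses hypothetical rows, not the BMT data, to compare the sum coding with an explicit coding; the name `type_explicit` is an assumption introduced for illustration.

```r
# Hypothetical rows covering the four possible indicator patterns
toy <- data.frame(
  d1 = c(0, 1, 1, 0),   # death indicator
  d2 = c(0, 0, 1, 1),   # relapse indicator
  d3 = c(0, 1, 1, 1)    # disease-free survival indicator (death or relapse)
)

# Coding used in the text: sum of the two indicators
type_sum <- toy$d1 + toy$d2

# Explicit coding: relapse -> 2, failure without relapse -> 1, otherwise censored -> 0
type_explicit <- ifelse(toy$d2 == 1, 2, ifelse(toy$d3 == 1, 1, 0))

# Cross-tabulate the two codings; off-diagonal counts flag disagreements
table(sum = type_sum, explicit = type_explicit)
```

The two codings agree whenever no subject has d1 = 0 and d2 = 1. If they disagree on the real data, the explicit version matches the definition of treatment failure as death in remission or relapse, whichever comes first.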

This plots the cumulative incidence for TRM and for relapse from Figures 4.12 and 4.13 in KM. We can generate a table of values with code such as:

    ## convert list to df, format nicely
    df <- as.data.frame(cbind(t(fits.c$est),t(fits.c$var)))
    names(df) <- c("trm CI", "relapse CI", "var(trm CI)", "var(relapse CI)")
    row.names(df) <- c(1:length(df[[1]]))
    ## can add times
    df$time <- sort(unique(bmt1$t2))
    ## rearrange order of display and subset
    df <- df[,c(5,1,2)]
    > head(df)
      time trm CI relapse CI

This is part of KM Table 4.8. We used unique() above when inserting the times because cuminc() (like survfit() and other functions) gives output only at unique times.

2.2 Interaction plot in R

A very useful comparison between all competing risks is an interaction plot, which appears in KM Figure 4.14. It plots the relapse CIF and the sum of the relapse and TRM CIFs to show how changes in the probability of one event affect the probabilities of the other events. It may be generated with:

    ## for interaction plot, first add the sum of TRM and relapse
    ## (columns are addressed by their full names rather than by partial matching)
    df$sum <- df[["trm CI"]] + df[["relapse CI"]]
    ## interaction plot (fig 4.14)
    plot(df$time, df[["relapse CI"]], type="s", ylim=c(0,1), xlim=c(0,1000),
         xlab="Days Post Transplant", ylab="Probability")
    lines(df$time, df$sum, type="s")
    text(700, 0.1, "Relapse Probability", cex=0.7)
    text(700, 0.4, "Death in Remission Probability", cex=0.7)
    text(300, 0.8, "Disease Free Survival Probability", cex=0.7)
    title("Interaction between the relapse and\n death in remission (KM Fig 4.14)")

2.3 KM Table 4.8 reconstruction

To obtain other output from KM Table 4.8, the following code may be used:

    ## add 1-KME for both risks
    ## KME = Kaplan Meier Estimate
    ## TRM
    bmt1$death <- ifelse(bmt1$type==1,1,0)
    km.death <- survfit(Surv(bmt1$t2, bmt1$death) ~ 1)
    ## Relapse
    bmt1$relapse <- ifelse(bmt1$type==2,1,0)
    km.relapse <- survfit(Surv(bmt1$t2, bmt1$relapse) ~ 1)
    trm.1mkm <- 1-km.death$surv
    relapse.1mkm <- 1-km.relapse$surv
    df2 <- cbind(df$time, trm.1mkm, relapse.1mkm, df[["trm CI"]], df[["relapse CI"]])
    df2 <- as.data.frame(df2)
    names(df2) <- c("time", "TRM 1-KME", "Relapse 1-KME", "TRM CI", "Relapse CI")
    row.names(df2) <- c(1:dim(df2)[1])
    ## now showing that sum of CumInc is 1-KME for original treatment failure
    ## find sum
    df2$CIsum <- df2[,4]+df2[,5]
    kme.all <- survfit(Surv(bmt1$t2, bmt1$d3) ~ 1)
    kme.comp <- 1-kme.all$surv
    ## add to dataframe
    df2$kme.comp <- kme.comp
    > head(df2)
      time TRM 1-KME Relapse 1-KME TRM CI Relapse CI CIsum kme.comp

2.4 Multiple groups example

We can also compare risks between multiple groups. The original bmt dataset has three groups.

    ## first, rename the groups
    bmt$group <- factor(bmt$group, levels=c(1:3), labels=c("ALL", "AML lo", "AML hi"))
    bmt$type <- bmt$d1+bmt$d2
    > table(bmt$group, bmt$type)
      (rows: ALL, AML lo, AML hi)
    ## obtain CIFs
    fit.3g <- cuminc(bmt$t2, bmt$type, bmt$group, cencode=0)
    ## additional output appears:
    > fit.3g
    Tests:
      stat pv df

The test (see Gray (1988)) compares each risk across all groups. Here, we see that the TRM (type=1) CIFs do not differ significantly among the three groups, while the relapse (type=2) CIFs do. This may be confirmed by looking at a plot:

    ## TRM (1) by group in black, Relapse (2) by group in red
    plot.cuminc(fit.3g, col=c(1,1,1,2,2,2), xlab="Days Post Transplant", ylab="Probability",
                main="All three groups compared.\nBlack: TRM by group. Red: Relapse by group")

3. Final notes

Scrucca L et al (2007) implement a single function to plot and compute the CIFs. See

Regression models are described in Scheike, Zhang, and Gerds (2008).
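The group comparison of Section 2.4 can be tried without the BMT data. The sketch below simulates hypothetical failure times, event codes, and three groups, then extracts the test table from the fit. It assumes that cuminc() stores the table in the component `Tests` when more than one group is supplied, and it runs the cmprsk call only if the package is installed.

```r
set.seed(2012)
n       <- 300
group   <- rep(c("A", "B", "C"), each = n / 3)          # hypothetical groups
ftime   <- rexp(n, rate = ifelse(group == "C", 2, 1))   # hypothetical failure times
fstatus <- sample(0:2, n, replace = TRUE)               # 0 = censored, 1 and 2 = competing events

tests <- NULL
if (requireNamespace("cmprsk", quietly = TRUE)) {
  fit   <- cmprsk::cuminc(ftime, fstatus, group, cencode = 0)
  tests <- fit$Tests   # one row per event type, columns stat, pv, df
  print(tests)
}
```

Each row of the table refers to one event type, so a small p-value in row 2 says the type 2 CIFs differ among groups and says nothing about type 1.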

References

1. Gray RJ (1988). A class of K-sample tests for comparing the cumulative incidence of a competing risk. ANNALS OF STATISTICS, 16.
2. Klein JP, Moeschberger ML (2003). Survival analysis: Techniques for censored and truncated data. New York: Springer.
3. Scheike TH, Zhang MJ, Gerds TA (2008). Predicting Cumulative Incidence Probability by Direct Binomial Regression. BIOMETRIKA, 95.
4. Scrucca L, Santucci A, Aversa F (2007). Competing risk analysis using R: an easy guide for clinicians. BONE MARROW TRANSPLANT, 40(4):381-7.

Appendix: R Code

Also available at

    ## Competing Risks in R
    ## UM BIOS 2012
    library(cmprsk)    # also loads library(survival)
    library(survival)  # loaded explicitly for Surv() and survfit()
    library(KMsurv)
    data(bmt)          # load bmt data

    ## select group 1 only (ALL patients)
    bmt1 <- subset(bmt, bmt$group==1)
    bmt1 <- bmt1[order(bmt1$t2),]

    ## in bmt1:
    ## d1: death indicator
    ## d2: relapse indicator
    ## d3: =0 if alive, =1 if (death or relapse)
    ## create single column for event types so that:
    ## 0 is censored
    ## 1 is death
    ## 2 is relapse [add d1 and d2 values]
    bmt1$type <- bmt1$d1+bmt1$d2   # length = 38

    ## obtain cumulative incidence fit
    fit.1 <- cuminc(bmt1$t2, bmt1$type, cencode=0)
    ## this does not repeat times, so length = 37
    fits.c <- timepoints(fit.1,bmt1$t2)
    fits.c

    ## plot competing risks
    plot(fit.1)
    ## to customize, use custom function from package
    plot.cuminc(fit.1, col=1:2, curvlab=c("TRM", "Relapse"),
                xlim=c(0,1000), xlab="Days Post Transplant", ylab="Probability",
                main="Competing Risks:\nCumulative Incidence for ALL group")

    ## convert list to df, format nicely
    ## dim(df) = 37x4
    df <- as.data.frame(cbind(t(fits.c$est),t(fits.c$var)))
    names(df) <- c("trm CI", "relapse CI", "var(trm CI)", "var(relapse CI)")
    row.names(df) <- c(1:length(df[[1]]))
    ## can add times
    ## unique b/c cuminc() does it this way
    df$time <- sort(unique(bmt1$t2))
    ## rearrange order of display
    df <- df[,c(5,1,2)]

    ## for intxn plot, first add the sum of trm and relapse
    ## (columns are addressed by their full names rather than by partial matching)
    df$sum <- df[["trm CI"]] + df[["relapse CI"]]
    ## interaction plot (fig 4.14)
    plot(df$time, df[["relapse CI"]], type="s", ylim=c(0,1), xlim=c(0,1000),
         xlab="Days Post Transplant", ylab="Probability")
    lines(df$time, df$sum, type="s")
    text(700, 0.1, "Relapse Probability", cex=0.7)
    text(700, 0.4, "Death in Remission Probability", cex=0.7)
    text(300, 0.8, "Disease Free Survival Probability", cex=0.7)
    title("Interaction between the relapse and\n death in remission (KM Fig 4.14)")

    ## add 1-KME for both risks
    ## KME = Kaplan Meier Estimate
    ## TRM
    bmt1$death <- ifelse(bmt1$type==1,1,0)
    km.death <- survfit(Surv(bmt1$t2, bmt1$death) ~ 1)
    ## Relapse 1-KME
    bmt1$relapse <- ifelse(bmt1$type==2,1,0)
    km.relapse <- survfit(Surv(bmt1$t2, bmt1$relapse) ~ 1)
    trm.1mkm <- 1-km.death$surv
    relapse.1mkm <- 1-km.relapse$surv
    df2 <- cbind(df$time, trm.1mkm, relapse.1mkm, df[["trm CI"]], df[["relapse CI"]])
    df2 <- as.data.frame(df2)
    names(df2) <- c("time", "TRM 1-KME", "Relapse 1-KME", "TRM CI", "Relapse CI")
    row.names(df2) <- c(1:dim(df2)[1])

    ## now showing that sum of CumInc is 1-KME for original treatment failure
    ## find sum
    df2$CIsum <- df2[,4]+df2[,5]
    kme.all <- survfit(Surv(bmt1$t2, bmt1$d3) ~ 1)
    kme.comp <- 1-kme.all$surv
    ## add to dataframe
    df2$kme.comp <- kme.comp

    ## we can also run all groups
    ## rename
    bmt$group <- factor(bmt$group, levels=c(1:3), labels=c("ALL", "AML lo", "AML hi"))
    bmt$type <- bmt$d1+bmt$d2
    table(bmt$group, bmt$type)
    fit.3g <- cuminc(bmt$t2, bmt$type, bmt$group, cencode=0)
    ## and discuss test results
    fit.3g
    ## Tests: stat pv df
    ## For type=1 (TRM), the 3 groups are not significantly different
    ## For type=2 (Relapse), the 3 groups are significantly different
    ## this may be confirmed by looking at the plot
    ## TRM (1) by group in black, Relapse (2) by group in red
    plot.cuminc(fit.3g, col=c(1,1,1,2,2,2), xlab="Days Post Transplant", ylab="Probability",
                main="All three groups compared.\nBlack: TRM by group. Red: Relapse by group")
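The appendix code compares the sum of the two cumulative incidences with the complement of the Kaplan-Meier estimate for treatment failure. The same identity can be checked on toy data in base R, with no packages. The data below are hypothetical, and the column names `CIsum` and `kme.comp` are borrowed from the appendix only for readability.

```r
# Toy data (hypothetical): 0 = censored, 1 = TRM, 2 = relapse
time   <- c(1, 2, 2, 4, 5, 5, 7, 8, 10, 12)
status <- c(1, 2, 0, 1, 0, 2, 1, 0, 2, 0)

tt <- sort(unique(time[status != 0]))                        # distinct event times
Y  <- sapply(tt, function(s) sum(time >= s))                 # number at risk
n1 <- sapply(tt, function(s) sum(time == s & status == 1))   # TRM events
n2 <- sapply(tt, function(s) sum(time == s & status == 2))   # relapse events

S     <- cumprod(1 - (n1 + n2) / Y)   # Kaplan-Meier for treatment failure (either event)
S_lag <- c(1, head(S, -1))            # Kaplan-Meier just before each event time

check <- data.frame(
  time     = tt,
  CIsum    = cumsum(S_lag * n1 / Y) + cumsum(S_lag * n2 / Y),  # sum of the two CIFs
  kme.comp = 1 - S                                             # complement of overall KM
)
check
```

The two columns agree at every event time because each increment of the summed CIFs, S_lag * (n1 + n2) / Y, equals the corresponding drop in the overall Kaplan-Meier estimate.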

Competing Risks in R: Summary Handout

The CIF estimates the probability that the event of interest occurs before time t and that it occurs before any of the competing causes of failure.

Cumulative incidence function

    library(cmprsk)
    thebestfit <- cuminc(ftime, fstatus, group, cencode=0,...)

where:
    ftime is failure time
    fstatus is failure status (codes for the competing risks and for the originally censored observations)
    group is optional group indicator
    cencode is the code for censored observations in fstatus

Custom times

    timepoints(thebestfit, times)

where: times is a vector of times

Test for equality of each risk among groups (if > 1):

    thebestfit$Tests

Plot using custom function

    plot.cuminc(thebestfit, main="fancy title", curvlab, ylim, xlim, wh=2, lty=1:length(x), color=1,...)

Create interaction plot for two competing risks

    plot(ftime, risk1,...)
    lines(ftime, <vector of sum of risk1 and risk2>,...)
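The handout steps can be strung together on simulated data as a quick dry run. This is a minimal sketch: the failure times and event codes are hypothetical, the data frame `tab` and its column names are introduced here for illustration, and the cmprsk calls run only if the package is installed. It assumes, as in the paper, that timepoints() returns one column per unique time and one row per event type when no group is given.

```r
set.seed(1)
n       <- 50
ftime   <- rexp(n)                          # hypothetical failure times
fstatus <- sample(0:2, n, replace = TRUE)   # 0 = censored, 1 and 2 = competing events

tab <- NULL
if (requireNamespace("cmprsk", quietly = TRUE)) {
  fit <- cmprsk::cuminc(ftime, fstatus, cencode = 0)
  tp  <- cmprsk::timepoints(fit, ftime)     # estimates at the unique observed times

  tab <- data.frame(time  = sort(unique(ftime)),
                    risk1 = unname(tp$est[1, ]),
                    risk2 = unname(tp$est[2, ]))
  tab$sum <- tab$risk1 + tab$risk2

  # Interaction plot: risk1 alone, then risk1 + risk2 stacked on top of it
  plot(tab$time, tab$risk1, type = "s", ylim = c(0, 1),
       xlab = "Time", ylab = "Probability")
  lines(tab$time, tab$sum, type = "s")
}
```

The vertical distance between the two step curves is the cumulative incidence of the second event, and the distance from the upper curve to 1 is the probability of remaining event-free.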


More information

SAS/STAT 14.1 User s Guide. Customizing the Kaplan-Meier Survival Plot

SAS/STAT 14.1 User s Guide. Customizing the Kaplan-Meier Survival Plot SAS/STAT 14.1 User s Guide Customizing the Kaplan-Meier Survival Plot This document is an individual chapter from SAS/STAT 14.1 User s Guide. The correct bibliographic citation for this manual is as follows:

More information

Package FCGR. October 13, 2015

Package FCGR. October 13, 2015 Type Package Title Fatigue Crack Growth in Reliability Version 1.0-0 Date 2015-09-29 Package FCGR October 13, 2015 Author Antonio Meneses , Salvador Naya ,

More information

Package gems. March 26, 2017

Package gems. March 26, 2017 Type Package Title Generalized Multistate Simulation Model Version 1.1.1 Date 2017-03-26 Package gems March 26, 2017 Author Maintainer Luisa Salazar Vizcaya Imports

More information

Package RcmdrPlugin.survival

Package RcmdrPlugin.survival Type Package Package RcmdrPlugin.survival October 21, 2017 Title R Commander Plug-in for the 'survival' Package Version 1.2-0 Date 2017-10-21 Author John Fox Maintainer John Fox Depends

More information

Brief Guide on Using SPSS 10.0

Brief Guide on Using SPSS 10.0 Brief Guide on Using SPSS 10.0 (Use student data, 22 cases, studentp.dat in Dr. Chang s Data Directory Page) (Page address: http://www.cis.ysu.edu/~chang/stat/) I. Processing File and Data To open a new

More information

Comparison of Optimization Methods for L1-regularized Logistic Regression

Comparison of Optimization Methods for L1-regularized Logistic Regression Comparison of Optimization Methods for L1-regularized Logistic Regression Aleksandar Jovanovich Department of Computer Science and Information Systems Youngstown State University Youngstown, OH 44555 aleksjovanovich@gmail.com

More information

Package survminer. November 17, 2017

Package survminer. November 17, 2017 Type Package Title Drawing Survival Curves using 'ggplot2' Version 0.4.1 Date 2017-11-17 Package survminer November 17, 2017 Description Contains the function 'ggsurvplot()' for drawing easily beautiful

More information

An introduction to SPSS

An introduction to SPSS An introduction to SPSS To open the SPSS software using U of Iowa Virtual Desktop... Go to https://virtualdesktop.uiowa.edu and choose SPSS 24. Contents NOTE: Save data files in a drive that is accessible

More information

Working with Composite Endpoints: Constructing Analysis Data Pushpa Saranadasa, Merck & Co., Inc., Upper Gwynedd, PA

Working with Composite Endpoints: Constructing Analysis Data Pushpa Saranadasa, Merck & Co., Inc., Upper Gwynedd, PA PharmaSug2016- Paper HA03 Working with Composite Endpoints: Constructing Analysis Data Pushpa Saranadasa, Merck & Co., Inc., Upper Gwynedd, PA ABSTRACT A composite endpoint in a Randomized Clinical Trial

More information

Advanced Logical Thinking Skills (2)

Advanced Logical Thinking Skills (2) Mei-Writing Academic Writing II(A) - Lecture 3 November 12, 2013 Advanced Logical Thinking Skills (2) A Logical Explanation of Causal Relation by Paul W. L. Lai Group Discussion: Think about your thesis

More information

Extensions of the dlnm package

Extensions of the dlnm package Extensions of the dlnm package Antonio Gasparrini London School of Hygiene & Tropical Medicine, UK dlnm version 2.3.2, 17-1-16 Contents 1 Preamble 2 2 Data 2 3 The matrix of exposure histories 3 4 Applications

More information

Lecture 25: Review I

Lecture 25: Review I Lecture 25: Review I Reading: Up to chapter 5 in ISLR. STATS 202: Data mining and analysis Jonathan Taylor 1 / 18 Unsupervised learning In unsupervised learning, all the variables are on equal standing,

More information

Correctly Compute Complex Samples Statistics

Correctly Compute Complex Samples Statistics PASW Complex Samples 17.0 Specifications Correctly Compute Complex Samples Statistics When you conduct sample surveys, use a statistics package dedicated to producing correct estimates for complex sample

More information

Modelling Personalized Screening: a Step Forward on Risk Assessment Methods

Modelling Personalized Screening: a Step Forward on Risk Assessment Methods Modelling Personalized Screening: a Step Forward on Risk Assessment Methods Validating Prediction Models Inmaculada Arostegui Universidad del País Vasco UPV/EHU Red de Investigación en Servicios de Salud

More information

Creating Forest Plots Using SAS/GRAPH and the Annotate Facility

Creating Forest Plots Using SAS/GRAPH and the Annotate Facility PharmaSUG2011 Paper TT12 Creating Forest Plots Using SAS/GRAPH and the Annotate Facility Amanda Tweed, Millennium: The Takeda Oncology Company, Cambridge, MA ABSTRACT Forest plots have become common in

More information

A Comparison of Modeling Scales in Flexible Parametric Models. Noori Akhtar-Danesh, PhD McMaster University

A Comparison of Modeling Scales in Flexible Parametric Models. Noori Akhtar-Danesh, PhD McMaster University A Comparison of Modeling Scales in Flexible Parametric Models Noori Akhtar-Danesh, PhD McMaster University Hamilton, Canada daneshn@mcmaster.ca Outline Backgroundg A review of splines Flexible parametric

More information

Dealing with Data in Excel 2013/2016

Dealing with Data in Excel 2013/2016 Dealing with Data in Excel 2013/2016 Excel provides the ability to do computations and graphing of data. Here we provide the basics and some advanced capabilities available in Excel that are useful for

More information

Individual Covariates

Individual Covariates WILD 502 Lab 2 Ŝ from Known-fate Data with Individual Covariates Today s lab presents material that will allow you to handle additional complexity in analysis of survival data. The lab deals with estimation

More information

Towards a Survival Analysis of Database Framework Usage in Java Projects

Towards a Survival Analysis of Database Framework Usage in Java Projects Towards a Survival Analysis of Database Framework Usage in Java Projects Mathieu Goeminne and Tom Mens Software Engineering Lab, University of Mons, Belgium Email: { first. last } @ umons.ac.be Abstract

More information

Edge-exchangeable graphs and sparsity

Edge-exchangeable graphs and sparsity Edge-exchangeable graphs and sparsity Tamara Broderick Department of EECS Massachusetts Institute of Technology tbroderick@csail.mit.edu Diana Cai Department of Statistics University of Chicago dcai@uchicago.edu

More information

RSM Split-Plot Designs & Diagnostics Solve Real-World Problems

RSM Split-Plot Designs & Diagnostics Solve Real-World Problems RSM Split-Plot Designs & Diagnostics Solve Real-World Problems Shari Kraber Pat Whitcomb Martin Bezener Stat-Ease, Inc. Stat-Ease, Inc. Stat-Ease, Inc. 221 E. Hennepin Ave. 221 E. Hennepin Ave. 221 E.

More information

Submission Guidelines

Submission Guidelines Submission Guidelines Clinical Trial Results invites the submission of phase I, II, and III clinical trials for publication in a brief print format, with full trials results online. We encourage the submission

More information

E-Campus Inferential Statistics - Part 2

E-Campus Inferential Statistics - Part 2 E-Campus Inferential Statistics - Part 2 Group Members: James Jones Question 4-Isthere a significant difference in the mean prices of the stores? New Textbook Prices New Price Descriptives 95% Confidence

More information

book 2014/5/6 15:21 page v #3 List of figures List of tables Preface to the second edition Preface to the first edition

book 2014/5/6 15:21 page v #3 List of figures List of tables Preface to the second edition Preface to the first edition book 2014/5/6 15:21 page v #3 Contents List of figures List of tables Preface to the second edition Preface to the first edition xvii xix xxi xxiii 1 Data input and output 1 1.1 Input........................................

More information

9/29/13. Outline Data mining tasks. Clustering algorithms. Applications of clustering in biology

9/29/13. Outline Data mining tasks. Clustering algorithms. Applications of clustering in biology 9/9/ I9 Introduction to Bioinformatics, Clustering algorithms Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Outline Data mining tasks Predictive tasks vs descriptive tasks Example

More information

Unit 7 Statistics. AFM Mrs. Valentine. 7.1 Samples and Surveys

Unit 7 Statistics. AFM Mrs. Valentine. 7.1 Samples and Surveys Unit 7 Statistics AFM Mrs. Valentine 7.1 Samples and Surveys v Obj.: I will understand the different methods of sampling and studying data. I will be able to determine the type used in an example, and

More information

Package ranger. November 10, 2015

Package ranger. November 10, 2015 Type Package Title A Fast Implementation of Random Forests Version 0.3.0 Date 2015-11-10 Author Package November 10, 2015 Maintainer A fast implementation of Random Forests,

More information

Abstract Various classes of distance metrics between two point process realizations. observed on a common metric space are outlined.

Abstract Various classes of distance metrics between two point process realizations. observed on a common metric space are outlined. Annals of the Institute of Statistical Mathematics manuscript No. (will be inserted by the editor) On distances between point patterns and their applications Jorge Mateu Frederic P Schoenberg David M Diez

More information

An Empirical Study of Operating Systems Errors. Jing Huang

An Empirical Study of Operating Systems Errors. Jing Huang An Empirical Study of Operating Systems Errors Jing Huang Background previous research manual inspection of logs, testing, and surveys because static analysis is applied uniformly to the entire kernel

More information

Are We Integrating Biologic Advances in Multiple Myeloma Into Clinical Practice?

Are We Integrating Biologic Advances in Multiple Myeloma Into Clinical Practice? Are We Integrating Biologic Advances in Multiple Myeloma Into Clinical Practice? Minimal Residual Disease: A Measurable and Relevant Endpoint in Treatment Xavier Leleu Service d Hématologie et Thérapie

More information

Support Vector Machines: Brief Overview" November 2011 CPSC 352

Support Vector Machines: Brief Overview November 2011 CPSC 352 Support Vector Machines: Brief Overview" Outline Microarray Example Support Vector Machines (SVMs) Software: libsvm A Baseball Example with libsvm Classifying Cancer Tissue: The ALL/AML Dataset Golub et

More information

Resampling Methods. Levi Waldron, CUNY School of Public Health. July 13, 2016

Resampling Methods. Levi Waldron, CUNY School of Public Health. July 13, 2016 Resampling Methods Levi Waldron, CUNY School of Public Health July 13, 2016 Outline and introduction Objectives: prediction or inference? Cross-validation Bootstrap Permutation Test Monte Carlo Simulation

More information

Ranjan Maitra and Ivan P. Ramler

Ranjan Maitra and Ivan P. Ramler Supplement to A k-mean-directions Algorithm for Fast Clustering of Data on the Sphere published in the Journal of Computational and Graphical Statistics Ranjan Maitra and Ivan P. Ramler S-1. ADDITIONAL

More information

Bluman & Mayer, Elementary Statistics, A Step by Step Approach, Canadian Edition

Bluman & Mayer, Elementary Statistics, A Step by Step Approach, Canadian Edition Bluman & Mayer, Elementary Statistics, A Step by Step Approach, Canadian Edition Online Learning Centre Technology Step-by-Step - Minitab Minitab is a statistical software application originally created

More information

Biostat Methods STAT 5820/6910 Handout #9 Meta-Analysis Examples

Biostat Methods STAT 5820/6910 Handout #9 Meta-Analysis Examples Biostat Methods STAT 5820/6910 Handout #9 Meta-Analysis Examples Example 1 A RCT was conducted to consider whether steroid therapy for expectant mothers affects death rate of premature [less than 37 weeks]

More information

Package Icens. R topics documented: May 3, Title NPMLE for Censored and Truncated Data

Package Icens. R topics documented: May 3, Title NPMLE for Censored and Truncated Data Title NPMLE for Censored and Truncated Data Package Icens May 3, 2018 Many functions for computing the NPMLE for censored and truncated data. Version 1.52.0 Author R. Gentleman and Alain Vandal Maintainer

More information

PSS weighted analysis macro- user guide

PSS weighted analysis macro- user guide Description and citation: This macro performs propensity score (PS) adjusted analysis using stratification for cohort studies from an analytic file containing information on patient identifiers, exposure,

More information

Splines. Patrick Breheny. November 20. Introduction Regression splines (parametric) Smoothing splines (nonparametric)

Splines. Patrick Breheny. November 20. Introduction Regression splines (parametric) Smoothing splines (nonparametric) Splines Patrick Breheny November 20 Patrick Breheny STA 621: Nonparametric Statistics 1/46 Introduction Introduction Problems with polynomial bases We are discussing ways to estimate the regression function

More information

Quantitative Understanding in Biology Module II: Model Parameter Estimation Lecture IV: Quantitative Comparison of Models

Quantitative Understanding in Biology Module II: Model Parameter Estimation Lecture IV: Quantitative Comparison of Models Quantitative Understanding in Biology Module II: Model Parameter Estimation Lecture IV: Quantitative Comparison of Models A classic mathematical model for enzyme kinetics is the Michaelis-Menten equation:

More information

plot(seq(0,10,1), seq(0,10,1), main = "the Title", xlim=c(1,20), ylim=c(1,20), col="darkblue");

plot(seq(0,10,1), seq(0,10,1), main = the Title, xlim=c(1,20), ylim=c(1,20), col=darkblue); R for Biologists Day 3 Graphing and Making Maps with Your Data Graphing is a pretty convenient use for R, especially in Rstudio. plot() is the most generalized graphing function. If you give it all numeric

More information

Updates and Errata for Statistical Data Analytics (1st edition, 2015)

Updates and Errata for Statistical Data Analytics (1st edition, 2015) Updates and Errata for Statistical Data Analytics (1st edition, 2015) Walter W. Piegorsch University of Arizona c 2018 The author. All rights reserved, except where previous rights exist. CONTENTS Preface

More information

Table Of Contents. Table Of Contents

Table Of Contents. Table Of Contents Statistics Table Of Contents Table Of Contents Basic Statistics... 7 Basic Statistics Overview... 7 Descriptive Statistics Available for Display or Storage... 8 Display Descriptive Statistics... 9 Store

More information