Estimating survival from Gray's flexible model


Zdenek Valenta
Department of Medical Informatics, Institute of Computer Science, Academy of Sciences of the Czech Republic

Outline

I. Introduction
II. Semi-parametric survival models
III. Gray's model: introduction
IV. Survival estimates based on semi-parametric models
V. Estimating survival from Gray's model (with examples)
VI. Impact of misspecifying the survival model: simulation study results
VII. Discussion

I. Introduction

Let $Y$ be a random variable capturing the time to occurrence of a certain event of interest. The hazard function $h(y)$ at time $y$ is formally defined as follows:

$$h(y) = \lim_{\Delta y \to 0^+} \frac{P(y \le Y < y + \Delta y \mid Y \ge y)}{\Delta y}, \tag{1}$$

where $P(\cdot \mid \cdot)$ denotes the conditional probability that the event of interest occurs immediately after time $y$, given that it has not occurred before this time. It follows from (1) that the hazard function may only take non-negative values.

Let $F(y)$ denote the cumulative distribution function of the random variable $Y$, i.e. $F(y) = P(Y \le y)$. We assume that $Y$ is absolutely continuous with density $f(y)$. Expression (1) may then be written as:

$$h(y) = \lim_{\Delta y \to 0^+} \frac{P(y \le Y < y + \Delta y \mid Y \ge y)}{\Delta y} = \lim_{\Delta y \to 0^+} \frac{P(y \le Y < y + \Delta y)}{\Delta y \, P(Y \ge y)} = \frac{1}{P(Y \ge y)} \frac{d}{dy} F(y) = \frac{f(y)}{P(Y \ge y)} = \frac{f(y)}{S(y)}, \tag{2}$$

where $S(y)$ denotes the value of the survival function at time $y$.
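As a quick numerical illustration of the two expressions for the hazard given above, the following Python sketch compares the limit definition with the ratio f(y)/S(y). The Weibull distribution, its parameter values and all function names are assumptions chosen for illustration; they do not come from the slides.

```python
import math


def weibull_survival(y, shape, scale):
    """S(y) = exp(-(y / scale) ** shape) for a Weibull time-to-event (illustrative choice)."""
    return math.exp(-((y / scale) ** shape))


def weibull_density(y, shape, scale):
    """f(y) = -dS/dy for the same Weibull distribution."""
    return (shape / scale) * (y / scale) ** (shape - 1) * weibull_survival(y, shape, scale)


def hazard_from_ratio(y, shape, scale):
    """Hazard computed as f(y) / S(y)."""
    return weibull_density(y, shape, scale) / weibull_survival(y, shape, scale)


def hazard_from_limit(y, shape, scale, dy=1e-6):
    """Finite-difference version of P(y <= Y < y + dy | Y >= y) / dy."""
    s_y = weibull_survival(y, shape, scale)
    s_next = weibull_survival(y + dy, shape, scale)
    return (s_y - s_next) / (dy * s_y)


# Hypothetical parameter values, used only to compare the two expressions.
h_ratio = hazard_from_ratio(2.0, shape=1.5, scale=3.0)
h_limit = hazard_from_limit(2.0, shape=1.5, scale=3.0)
print(h_ratio, h_limit)
```

The two numbers agree up to the finite-difference error, which shrinks as `dy` decreases.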

We define the cumulative hazard function $H(t)$ at time $t$ as:

$$H(t) = \int_0^t h(y)\,dy \tag{3}$$

It follows from (2) that $S(\cdot)$ and $H(\cdot)$ capture equivalent information:

$$H(t) = -\ln\left(S(t)\right) \tag{4}$$

Furthermore, it follows from (4) that we can determine the value of the survival function $S(t)$ at time $t$ whenever we are able to evaluate the cumulative hazard function $H(t)$:

$$S(t) = \exp\{-H(t)\} \tag{5}$$

II. Semi-parametric survival models

Multiplicative models:

Cox PH model:
$$h(y \mid Z) = h_0(y) \exp(\beta^\top Z) \tag{6}$$

Gray's flexible model:
$$h(y \mid Z) = h_0(y) \exp\{\beta(y)^\top Z\} \tag{7}$$

Additive models:

Aalen's linear model:
$$h(y \mid Z) = h_0(y) + \beta(y)^\top Z \tag{8}$$

Note: Aalen's linear model (8) may be embedded in the class of multiplicative models:

$$h(y \mid Z) = h_0(y) + \beta(y)^\top Z \;\Longrightarrow\; \exp\left(h(y \mid Z)\right) = \exp\{h_0(y) + \beta(y)^\top Z\} \;\Longrightarrow\; h^*(y \mid Z) = h_0^*(y) \exp\{\beta(y)^\top Z\}, \tag{9}$$

where $h^* = \exp(h)$ and $h_0^* = \exp(h_0)$. The class of multiplicative models represented by the Cox PH model and Gray's flexible model thus includes the whole class of models proposed by Aalen.

III. Gray's model: introduction

Let us recall the definition of Gray's flexible model (7):

$$h(y \mid Z) = h_0(y) \exp\{\beta(y)^\top Z\}$$

Gray's model uses penalized B-splines for modelling the time-varying effects $\beta(y)$. B-splines allow for flexible modelling of the covariate effects $\beta(y)$ and of the hazard function over time.

In Gray's model with piecewise-constant time-varying regression coefficients, $\beta(y)$ remains constant for $y \in [\tau_{j-1}, \tau_j)$. We can thus write $\beta(y) = \beta_j = (\beta_{j1}, \beta_{j2}, \ldots, \beta_{jp})$, where $p$ denotes the number of model covariates and $j = 1, \ldots, M+1$ indexes the intervals on the time axis. Here $\tau_j$ denote the knots at which the regression coefficients $\beta_j$ are allowed to change.
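The piecewise-constant form of β(y) in model (7) can be sketched in a few lines of Python. This is a minimal illustration, not the coxspline implementation: the function names, the constant baseline hazard, the knot positions and the coefficient values are all hypothetical.

```python
import math
from bisect import bisect_right


def beta_at(y, internal_knots, betas):
    """Return the coefficient vector in force at time y.

    internal_knots holds the M interior knots in increasing order and
    betas holds M + 1 coefficient vectors, one per interval
    [tau_{j-1}, tau_j). bisect_right places a time equal to a knot in
    the interval that starts at that knot, matching the half-open intervals.
    """
    return betas[bisect_right(internal_knots, y)]


def flexible_hazard(y, z, baseline_hazard, internal_knots, betas):
    """h(y | Z) = h0(y) * exp(beta(y)' Z) with piecewise-constant beta(y)."""
    beta = beta_at(y, internal_knots, betas)
    linear_predictor = sum(b * x for b, x in zip(beta, z))
    return baseline_hazard(y) * math.exp(linear_predictor)


# Hypothetical setup: one covariate, two interior knots, constant baseline hazard.
knots = [1.0, 2.0]
betas = [[0.5], [0.0], [-0.5]]


def baseline(y):
    return 0.1


for time in (0.5, 1.0, 2.5):
    print(time, flexible_hazard(time, [1.0], baseline, knots, betas))
```

With these values the covariate effect is harmful on the first interval, absent on the second and protective on the third, which a single time-constant coefficient could not represent.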

[Figure: Piecewise-constant vs. quadratic penalized splines. Panels: Intervention, Age, Diabetes Mellitus.]

[Figure: Piecewise-constant vs. cubic penalized splines. Panels: Intervention, Age, Diabetes Mellitus.]

IV. Survival estimates based on semi-parametric models

Cox PH model:

$$S(t \mid Z) = \exp\left\{-\int_0^t h_0(y) \exp(\beta^\top Z)\,dy\right\} = \exp\{-H_0(t) \exp(\beta^\top Z)\} = \left[S_0(t)\right]^{\exp(\beta^\top Z)}, \tag{10}$$

where $S_0(t)$ represents the baseline survival function estimate at time $t$.

Aalen's linear model:

$$h(y \mid Z) = h_0(y) + \beta(y)^\top Z \;\Longrightarrow\; h(y \mid Z) = \beta^*(y)^\top Z^*, \quad \text{where } \beta^*(y) = (h_0(y), \beta(y)) \text{ and } Z^* = (1, Z). \tag{11}$$

Survival function estimates based on Aalen's model use the cumulative regression coefficients $B(t)$, where $B_i(t) = \int_0^t \beta_i^*(y)\,dy$. Estimating survival based on Aalen's model may then proceed as follows:

$$S(t \mid Z) = \exp\{-B(t)^\top Z^*\} \tag{12}$$
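The two survival formulas above, for the PH model and for the linear model, can be illustrated with a minimal Python sketch. The baseline survival value, the coefficients and the cumulative regression coefficients below are made-up numbers, and the function names are hypothetical.

```python
import math


def linear_predictor(coefs, z):
    """Inner product of a coefficient vector with the covariate vector Z."""
    return sum(c * x for c, x in zip(coefs, z))


def ph_survival(baseline_survival, beta, z):
    """PH model: S(t | Z) = S0(t) ** exp(beta' Z)."""
    return baseline_survival ** math.exp(linear_predictor(beta, z))


def additive_survival(cum_baseline, cum_coefs, z):
    """Linear model: S(t | Z) = exp(-(B_0(t) + sum_i B_i(t) * Z_i)).

    cum_baseline is the cumulative baseline hazard at t and cum_coefs are
    the cumulative regression coefficients at t (all assumed given).
    """
    return math.exp(-(cum_baseline + linear_predictor(cum_coefs, z)))


# Illustrative values only.
s0 = 0.8
beta = [0.7, -0.2]
z = [1.0, 2.0]

s_ph = ph_survival(s0, beta, z)
# Same quantity through the cumulative baseline hazard H0(t) = -ln S0(t).
s_check = math.exp(-(-math.log(s0)) * math.exp(linear_predictor(beta, z)))
s_add = additive_survival(0.2, [0.05, -0.01], z)
print(s_ph, s_check, s_add)
```

The first two printed values coincide, which is the identity between the exponential and the power form of the PH survival function.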

V. Estimating survival from Gray's model

The survival function estimate based on Gray's model [a] using piecewise-constant penalized splines may be obtained as follows:

$$S(t \mid Z) = \exp\left\{-\sum_{j=1}^{M+1} H_{0j}(t) \exp(\beta_j^\top Z)\right\}, \tag{13}$$

where $Z$ denotes the $p$-dimensional vector of the patient's characteristics, and

$$H_{0j}(t) = \int_{[\tau_{j-1}, \tau_j)} I(u \le t)\,dH_0(u) \tag{14}$$

represents the contribution to the cumulative baseline hazard function $H_0(t)$ on the interval $[\tau_{j-1}, \tau_j)$, $j = 1, \ldots, M+1$.

The derivation of confidence limits for the survival function estimate based on Gray's model uses the Delta method. Recall the Taylor formula for a function $f(X)$ of a random variable $X$ with expectation $\mu$:

$$f(X) = \sum_{k=0}^{n} \frac{f^{(k)}(\mu)}{k!} (X - \mu)^k + R_n \tag{15}$$

Delta method:

$$\mathrm{Var}(f(X)) \approx \mathrm{Var}\left[f(\mu) + f'(\mu)(X - \mu)\right] = \left[f'(\mu)\right]^2 \mathrm{Var}(X) \tag{16}$$

coxspline package in R

If $X$ is a random vector, the Delta method for a function $G(X)$ takes the form:

$$\mathrm{Var}(G(X)) \approx \nabla G(\mu)^\top \, \mathrm{Var}(X) \, \nabla G(\mu), \tag{17}$$

where $\nabla G(\mu)$ is the column vector of first partial derivatives of $G$. Confidence limit estimates [a] were derived for the log and log(-log) transformed survival function $S(t)$ and are reported simultaneously in R.

[a] Valenta Z et al, Statistics in Medicine.
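The piecewise-constant survival estimate and the transformed confidence limits described above can be sketched in Python as follows. This is only an illustration under stated assumptions: it is not the gsurv.r code, the function names are hypothetical, the linear cumulative baseline hazard and all numbers are made up, and the standard error of the survival estimate is taken as a given input rather than derived from the fitted model as in the cited paper.

```python
import math


def piecewise_survival(t, z, cum_baseline, boundaries, betas):
    """S(t | Z) = exp(-sum_j H_0j(t) * exp(beta_j' Z)).

    boundaries = [tau_0, ..., tau_{M+1}] are the interval end points (the
    last one may be math.inf) and betas holds one coefficient vector per
    interval. H_0j(t) is the increment of the cumulative baseline hazard
    over the part of [tau_{j-1}, tau_j) that lies at or before t.
    """
    total = 0.0
    for j in range(1, len(boundaries)):
        lower, upper = boundaries[j - 1], boundaries[j]
        h0j = cum_baseline(min(t, upper)) - cum_baseline(min(t, lower))
        linear_predictor = sum(b * x for b, x in zip(betas[j - 1], z))
        total += h0j * math.exp(linear_predictor)
    return math.exp(-total)


def log_limits(s_hat, se_s, crit=1.96):
    """Delta method on g(S) = log S, where g'(S) = 1 / S."""
    se_g = se_s / s_hat
    return s_hat * math.exp(-crit * se_g), s_hat * math.exp(crit * se_g)


def loglog_limits(s_hat, se_s, crit=1.96):
    """Delta method on g(S) = log(-log S), where g'(S) = 1 / (S log S)."""
    se_g = se_s / abs(s_hat * math.log(s_hat))
    g = math.log(-math.log(s_hat))
    return math.exp(-math.exp(g + crit * se_g)), math.exp(-math.exp(g - crit * se_g))


def cum_baseline(u):
    # Hypothetical cumulative baseline hazard H0(u) = 0.1 * u.
    return 0.1 * u


boundaries = [0.0, 1.0, 2.0, math.inf]
betas = [[0.5], [0.0], [-0.5]]

s_hat = piecewise_survival(2.5, [1.0], cum_baseline, boundaries, betas)
print(s_hat, log_limits(s_hat, 0.05), loglog_limits(s_hat, 0.05))
```

The log(-log) limits always stay inside (0, 1), whereas the upper log-transformed limit can exceed 1 when the estimate is close to 1.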

[Slides: cox.spline R-routine; R-function gsurv.r]

[Figure: Example, survival estimates from Gray's model with 95% C.L. (log-transformed and log-log-transformed), varying hazards data. Curves: increasing hazards, decreasing hazards, constant hazard. Horizontal axis: follow-up time (in days).]

[Figure: Example, secondary prevention trial of CHD in Litomerice men after MI (Litomerice Study). Survival probability from the Cox PH model and from Gray's model, with separate panels by age of the men. Horizontal axis: follow-up time in days.]

V. Estimating survival from Gray's model

An implementation of the coxspline package for the R statistical system is available from Dr. Gray's website (Harvard University and Dana-Farber Cancer Institute, Boston, USA). The coxspline package implements Gray's model in R, including survival function estimation using the R-function gsurv.r. The package was compatible with the R release that was current when these slides were prepared.

VI. Impact of misspecifying the survival model [a]

In three simulation studies we generated right-censored survival data that satisfied exactly one of the semi-parametric survival models under consideration (i.e., Cox, Gray, Aalen). The data obtained were subsequently analyzed using each of the three models considered. The performance of each model was assessed using the bias and the Mean Square Error (MSE) of the estimated (conditional) survival distribution.

[a] Valenta Z et al, Model misspecification effect in univariable regression models for right-censored survival data, Proceedings of the Joint Statistical Meetings, American Statistical Association.

Bias of the survival estimator $\hat{S}(t \mid Z)$:

$$\mathrm{Bias}\left(\hat{S}(t \mid Z)\right) = \frac{1}{n_s} \sum_{i=1}^{n_s} \hat{S}^{(i)}(t \mid Z) - S(t \mid Z) \tag{18}$$

Mean Square Error of the estimated survival $\hat{S}(t \mid Z)$:

$$\mathrm{MSE}\left(\hat{S}(t \mid Z)\right) = \frac{1}{n_s} \sum_{i=1}^{n_s} \left(\hat{S}^{(i)}(t \mid Z) - S(t \mid Z)\right)^2 \tag{19}$$

where $\hat{S}^{(i)}$ is the estimate from the $i$-th of $n_s$ simulated data sets.

Bias-variance trade-off:

$$\mathrm{MSE}\left(\hat{S}(t \mid Z)\right) = \mathrm{var}(\hat{S}) + \mathrm{Bias}^2(\hat{S}) \tag{20}$$

[Figure: simulation results for data generated under a model with constant β. Panels show MSE relative to a reference model; legend: percentile of z.]
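The bias and MSE criteria defined above, together with the bias-variance decomposition, can be checked on a toy example. The Python sketch below uses made-up survival estimates from five imaginary simulation runs; the function names and all numbers are hypothetical, and the variance uses the divisor n_s so that the decomposition holds exactly.

```python
def bias(estimates, truth):
    """Average estimate minus the true survival probability."""
    return sum(estimates) / len(estimates) - truth


def mse(estimates, truth):
    """Average squared deviation of the estimates from the truth."""
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)


def variance(estimates):
    """Variance across simulation runs with divisor n_s (not n_s - 1)."""
    mean = sum(estimates) / len(estimates)
    return sum((e - mean) ** 2 for e in estimates) / len(estimates)


# Made-up estimates of S(t | Z) at one fixed t and Z from five simulated data sets.
estimates = [0.78, 0.82, 0.80, 0.85, 0.79]
true_survival = 0.80

b = bias(estimates, true_survival)
m = mse(estimates, true_survival)
v = variance(estimates)
print(b, m, v + b ** 2)
```

The last two printed values agree, which is the decomposition of the MSE into variance plus squared bias.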

[Figures: simulation results for data generated under a model with time-varying β(t) and under the PH model. Panels show MSE relative to a reference model at several follow-up times, from before the median survival time up to the censoring limit; legend: percentile of z.]

VII. Discussion

When the data satisfied Aalen's linear model, both Cox's and Gray's models rendered biased survival estimates. They have, however, often shown a lower MSE than the native model for the data at hand.

When analyzing PH model data using Gray's routine, we observed no dramatic increase in bias and MSE relative to the native model, while by the same criteria the survival estimates based on Aalen's model appeared to be highly distorted.

When the data followed Gray's model with time-varying covariate effects, both Cox's and Aalen's models rendered highly unreliable estimates of the conditional survival distribution in terms of bias and MSE. In other words, there was no alternative to using the native model in this instance.

References

[1] Cox DR: Regression Models and Life Tables (with discussion), Journal of the Royal Statistical Society, 1972, Vol. 34, pp. 187-220.
[2] Aalen OO: A linear regression model for the analysis of life times, Statistics in Medicine, 1989, Vol. 8.
[3] Gray RJ: Flexible methods for analyzing survival data using splines, with application to breast cancer prognosis, Journal of the American Statistical Association, 1992, Vol. 87.
[4] Gray RJ: Spline-based tests in survival analysis, Biometrics, 1994, Vol. 50.
[5] Valenta Z and Weissfeld LA: Estimation of the Survival Function for Gray's Piecewise-Constant Time-Varying Coefficients Model, Statistics in Medicine, 2002, Vol. 21(5).

Thank you for your attention!

100 Myung Hwan Na log-hazard function. The discussion section of Abrahamowicz, et al.(1992) contains a good review of many of the papers on the use of

100 Myung Hwan Na log-hazard function. The discussion section of Abrahamowicz, et al.(1992) contains a good review of many of the papers on the use of J. KSIAM Vol.3, No.2, 99-106, 1999 SPLINE HAZARD RATE ESTIMATION USING CENSORED DATA Myung Hwan Na Abstract In this paper, the spline hazard rate model to the randomly censored data is introduced. The

More information

Assessing the Quality of the Natural Cubic Spline Approximation

Assessing the Quality of the Natural Cubic Spline Approximation Assessing the Quality of the Natural Cubic Spline Approximation AHMET SEZER ANADOLU UNIVERSITY Department of Statisticss Yunus Emre Kampusu Eskisehir TURKEY ahsst12@yahoo.com Abstract: In large samples,

More information

Package ICsurv. February 19, 2015

Package ICsurv. February 19, 2015 Package ICsurv February 19, 2015 Type Package Title A package for semiparametric regression analysis of interval-censored data Version 1.0 Date 2014-6-9 Author Christopher S. McMahan and Lianming Wang

More information

Splines and penalized regression

Splines and penalized regression Splines and penalized regression November 23 Introduction We are discussing ways to estimate the regression function f, where E(y x) = f(x) One approach is of course to assume that f has a certain shape,

More information

Splines. Patrick Breheny. November 20. Introduction Regression splines (parametric) Smoothing splines (nonparametric)

Splines. Patrick Breheny. November 20. Introduction Regression splines (parametric) Smoothing splines (nonparametric) Splines Patrick Breheny November 20 Patrick Breheny STA 621: Nonparametric Statistics 1/46 Introduction Introduction Problems with polynomial bases We are discussing ways to estimate the regression function

More information

A Comparison of Modeling Scales in Flexible Parametric Models. Noori Akhtar-Danesh, PhD McMaster University

A Comparison of Modeling Scales in Flexible Parametric Models. Noori Akhtar-Danesh, PhD McMaster University A Comparison of Modeling Scales in Flexible Parametric Models Noori Akhtar-Danesh, PhD McMaster University Hamilton, Canada daneshn@mcmaster.ca Outline Backgroundg A review of splines Flexible parametric

More information

CoxFlexBoost: Fitting Structured Survival Models

CoxFlexBoost: Fitting Structured Survival Models CoxFlexBoost: Fitting Structured Survival Models Benjamin Hofner 1 Institut für Medizininformatik, Biometrie und Epidemiologie (IMBE) Friedrich-Alexander-Universität Erlangen-Nürnberg joint work with Torsten

More information

Comparison of Methods for Analyzing and Interpreting Censored Exposure Data

Comparison of Methods for Analyzing and Interpreting Censored Exposure Data Comparison of Methods for Analyzing and Interpreting Censored Exposure Data Paul Hewett Ph.D. CIH Exposure Assessment Solutions, Inc. Gary H. Ganser Ph.D. West Virginia University Comparison of Methods

More information

Statistical Modeling with Spline Functions Methodology and Theory

Statistical Modeling with Spline Functions Methodology and Theory This is page 1 Printer: Opaque this Statistical Modeling with Spline Functions Methodology and Theory Mark H. Hansen University of California at Los Angeles Jianhua Z. Huang University of Pennsylvania

More information

Additive hedonic regression models for the Austrian housing market ERES Conference, Edinburgh, June

Additive hedonic regression models for the Austrian housing market ERES Conference, Edinburgh, June for the Austrian housing market, June 14 2012 Ao. Univ. Prof. Dr. Fachbereich Stadt- und Regionalforschung Technische Universität Wien Dr. Strategic Risk Management Bank Austria UniCredit, Wien Inhalt

More information

Analyzing Longitudinal Data Using Regression Splines

Analyzing Longitudinal Data Using Regression Splines Analyzing Longitudinal Data Using Regression Splines Zhang Jin-Ting Dept of Stat & Appl Prob National University of Sinagpore August 18, 6 DSAP, NUS p.1/16 OUTLINE Motivating Longitudinal Data Parametric

More information

Multivariate probability distributions

Multivariate probability distributions Multivariate probability distributions September, 07 STAT 0 Class Slide Outline of Topics Background Discrete bivariate distribution 3 Continuous bivariate distribution STAT 0 Class Slide Multivariate

More information

Chapter 6 Normal Probability Distributions

Chapter 6 Normal Probability Distributions Chapter 6 Normal Probability Distributions 6-1 Review and Preview 6-2 The Standard Normal Distribution 6-3 Applications of Normal Distributions 6-4 Sampling Distributions and Estimators 6-5 The Central

More information

Moving Beyond Linearity

Moving Beyond Linearity Moving Beyond Linearity The truth is never linear! 1/23 Moving Beyond Linearity The truth is never linear! r almost never! 1/23 Moving Beyond Linearity The truth is never linear! r almost never! But often

More information

Unified Methods for Censored Longitudinal Data and Causality

Unified Methods for Censored Longitudinal Data and Causality Mark J. van der Laan James M. Robins Unified Methods for Censored Longitudinal Data and Causality Springer Preface v Notation 1 1 Introduction 8 1.1 Motivation, Bibliographic History, and an Overview of

More information

A review of spline function selection procedures in R

A review of spline function selection procedures in R Matthias Schmid Department of Medical Biometry, Informatics and Epidemiology University of Bonn joint work with Aris Perperoglou on behalf of TG2 of the STRATOS Initiative September 1, 2016 Introduction

More information

Outline. Topic 16 - Other Remedies. Ridge Regression. Ridge Regression. Ridge Regression. Robust Regression. Regression Trees. Piecewise Linear Model

Outline. Topic 16 - Other Remedies. Ridge Regression. Ridge Regression. Ridge Regression. Robust Regression. Regression Trees. Piecewise Linear Model Topic 16 - Other Remedies Ridge Regression Robust Regression Regression Trees Outline - Fall 2013 Piecewise Linear Model Bootstrapping Topic 16 2 Ridge Regression Modification of least squares that addresses

More information

Clustering Lecture 5: Mixture Model

Clustering Lecture 5: Mixture Model Clustering Lecture 5: Mixture Model Jing Gao SUNY Buffalo 1 Outline Basics Motivation, definition, evaluation Methods Partitional Hierarchical Density-based Mixture model Spectral methods Advanced topics

More information

Nonparametric Estimation of Distribution Function using Bezier Curve

Nonparametric Estimation of Distribution Function using Bezier Curve Communications for Statistical Applications and Methods 2014, Vol. 21, No. 1, 105 114 DOI: http://dx.doi.org/10.5351/csam.2014.21.1.105 ISSN 2287-7843 Nonparametric Estimation of Distribution Function

More information

An Introduction to the Bootstrap

An Introduction to the Bootstrap An Introduction to the Bootstrap Bradley Efron Department of Statistics Stanford University and Robert J. Tibshirani Department of Preventative Medicine and Biostatistics and Department of Statistics,

More information

Package MIICD. May 27, 2017

Package MIICD. May 27, 2017 Type Package Package MIICD May 27, 2017 Title Multiple Imputation for Interval Censored Data Version 2.4 Depends R (>= 2.13.0) Date 2017-05-27 Maintainer Marc Delord Implements multiple

More information

Spline-based self-controlled case series method

Spline-based self-controlled case series method (215),,, pp. 1 25 doi:// Spline-based self-controlled case series method YONAS GHEBREMICHAEL-WELDESELASSIE, HEATHER J. WHITAKER, C. PADDY FARRINGTON Department of Mathematics and Statistics, The Open University,

More information

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review Gautam Kunapuli Machine Learning Data is identically and independently distributed Goal is to learn a function that maps to Data is generated using an unknown function Learn a hypothesis that minimizes

More information

Assignment No: 2. Assessment as per Schedule. Specifications Readability Assignments

Assignment No: 2. Assessment as per Schedule. Specifications Readability Assignments Specifications Readability Assignments Assessment as per Schedule Oral Total 6 4 4 2 4 20 Date of Performance:... Expected Date of Completion:... Actual Date of Completion:... ----------------------------------------------------------------------------------------------------------------

More information

Approximation of 3D-Parametric Functions by Bicubic B-spline Functions

Approximation of 3D-Parametric Functions by Bicubic B-spline Functions International Journal of Mathematical Modelling & Computations Vol. 02, No. 03, 2012, 211-220 Approximation of 3D-Parametric Functions by Bicubic B-spline Functions M. Amirfakhrian a, a Department of Mathematics,

More information

Package simsurv. May 18, 2018

Package simsurv. May 18, 2018 Type Package Title Simulate Survival Data Version 0.2.2 Date 2018-05-18 Package simsurv May 18, 2018 Maintainer Sam Brilleman Description Simulate survival times from standard

More information

Lecture 16: High-dimensional regression, non-linear regression

Lecture 16: High-dimensional regression, non-linear regression Lecture 16: High-dimensional regression, non-linear regression Reading: Sections 6.4, 7.1 STATS 202: Data mining and analysis November 3, 2017 1 / 17 High-dimensional regression Most of the methods we

More information

Mixed Model-Based Hazard Estimation

Mixed Model-Based Hazard Estimation IN 1440-771X IBN 0 7326 1080 X Mixed Model-Based Hazard Estimation T. Cai, Rob J. Hyndman and M.P. and orking Paper 11/2000 December 2000 DEPRTMENT OF ECONOMETRIC ND BUINE TTITIC UTRI Mixed model-based

More information

DECISION SCIENCES INSTITUTE. Exponentially Derived Antithetic Random Numbers. (Full paper submission)

DECISION SCIENCES INSTITUTE. Exponentially Derived Antithetic Random Numbers. (Full paper submission) DECISION SCIENCES INSTITUTE (Full paper submission) Dennis Ridley, Ph.D. SBI, Florida A&M University and Scientific Computing, Florida State University dridley@fsu.edu Pierre Ngnepieba, Ph.D. Department

More information

Unit 8 SUPPLEMENT Normal, T, Chi Square, F, and Sums of Normals

Unit 8 SUPPLEMENT Normal, T, Chi Square, F, and Sums of Normals BIOSTATS 540 Fall 017 8. SUPPLEMENT Normal, T, Chi Square, F and Sums of Normals Page 1 of Unit 8 SUPPLEMENT Normal, T, Chi Square, F, and Sums of Normals Topic 1. Normal Distribution.. a. Definition..

More information

Nonparametric regression using kernel and spline methods

Nonparametric regression using kernel and spline methods Nonparametric regression using kernel and spline methods Jean D. Opsomer F. Jay Breidt March 3, 016 1 The statistical model When applying nonparametric regression methods, the researcher is interested

More information

Doubly Cyclic Smoothing Splines and Analysis of Seasonal Daily Pattern of CO2 Concentration in Antarctica

Doubly Cyclic Smoothing Splines and Analysis of Seasonal Daily Pattern of CO2 Concentration in Antarctica Boston-Keio Workshop 2016. Doubly Cyclic Smoothing Splines and Analysis of Seasonal Daily Pattern of CO2 Concentration in Antarctica... Mihoko Minami Keio University, Japan August 15, 2016 Joint work with

More information

Linear Model Selection and Regularization. especially usefull in high dimensions p>>100.

Linear Model Selection and Regularization. especially usefull in high dimensions p>>100. Linear Model Selection and Regularization especially usefull in high dimensions p>>100. 1 Why Linear Model Regularization? Linear models are simple, BUT consider p>>n, we have more features than data records

More information

Last time... Coryn Bailer-Jones. check and if appropriate remove outliers, errors etc. linear regression

Last time... Coryn Bailer-Jones. check and if appropriate remove outliers, errors etc. linear regression Machine learning, pattern recognition and statistical data modelling Lecture 3. Linear Methods (part 1) Coryn Bailer-Jones Last time... curse of dimensionality local methods quickly become nonlocal as

More information

Resampling Methods. Levi Waldron, CUNY School of Public Health. July 13, 2016

Resampling Methods. Levi Waldron, CUNY School of Public Health. July 13, 2016 Resampling Methods Levi Waldron, CUNY School of Public Health July 13, 2016 Outline and introduction Objectives: prediction or inference? Cross-validation Bootstrap Permutation Test Monte Carlo Simulation

More information

A popular method for moving beyond linearity. 2. Basis expansion and regularization 1. Examples of transformations. Piecewise-polynomials and splines

A popular method for moving beyond linearity. 2. Basis expansion and regularization 1. Examples of transformations. Piecewise-polynomials and splines A popular method for moving beyond linearity 2. Basis expansion and regularization 1 Idea: Augment the vector inputs x with additional variables which are transformation of x use linear models in this

More information

2014 Stat-Ease, Inc. All Rights Reserved.

2014 Stat-Ease, Inc. All Rights Reserved. What s New in Design-Expert version 9 Factorial split plots (Two-Level, Multilevel, Optimal) Definitive Screening and Single Factor designs Journal Feature Design layout Graph Columns Design Evaluation

More information

dbets - diffusion Breakpoint Estimation Software

dbets - diffusion Breakpoint Estimation Software dbets - diffusion Breakpoint Estimation Software http://glimmer.rstudio.com/dbets/dbets/ This document describes the use of the online application dbets to estimate the diffusion (DIA) test breakpoints

More information

Nonparametric Approaches to Regression

Nonparametric Approaches to Regression Nonparametric Approaches to Regression In traditional nonparametric regression, we assume very little about the functional form of the mean response function. In particular, we assume the model where m(xi)

More information

MTTS1 Dimensionality Reduction and Visualization Spring 2014 Jaakko Peltonen

MTTS1 Dimensionality Reduction and Visualization Spring 2014 Jaakko Peltonen MTTS1 Dimensionality Reduction and Visualization Spring 2014 Jaakko Peltonen Lecture 2: Feature selection Feature Selection feature selection (also called variable selection): choosing k < d important

More information

Smoothing and Forecasting Mortality Rates with P-splines. Iain Currie. Data and problem. Plan of talk

Smoothing and Forecasting Mortality Rates with P-splines. Iain Currie. Data and problem. Plan of talk Smoothing and Forecasting Mortality Rates with P-splines Iain Currie Heriot Watt University Data and problem Data: CMI assured lives : 20 to 90 : 1947 to 2002 Problem: forecast table to 2046 London, June

More information

Curve fitting using linear models

Curve fitting using linear models Curve fitting using linear models Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark September 28, 2012 1 / 12 Outline for today linear models and basis functions polynomial regression

More information

Package intccr. September 12, 2017

Package intccr. September 12, 2017 Type Package Package intccr September 12, 2017 Title Semiparametric Competing Risks Regression under Interval Censoring Version 0.2.0 Author Giorgos Bakoyannis , Jun Park

More information

Multiple imputation using chained equations: Issues and guidance for practice

Multiple imputation using chained equations: Issues and guidance for practice Multiple imputation using chained equations: Issues and guidance for practice Ian R. White, Patrick Royston and Angela M. Wood http://onlinelibrary.wiley.com/doi/10.1002/sim.4067/full By Gabrielle Simoneau

More information

How to use the rbsurv Package

How to use the rbsurv Package How to use the rbsurv Package HyungJun Cho, Sukwoo Kim, Soo-heang Eo, and Jaewoo Kang April 30, 2018 Contents 1 Introduction 1 2 Robust likelihood-based survival modeling 2 3 Algorithm 2 4 Example: Glioma

More information

Computational Physics PHYS 420

Computational Physics PHYS 420 Computational Physics PHYS 420 Dr Richard H. Cyburt Assistant Professor of Physics My office: 402c in the Science Building My phone: (304) 384-6006 My email: rcyburt@concord.edu My webpage: www.concord.edu/rcyburt

More information

An introduction to multi-armed bandits

An introduction to multi-armed bandits An introduction to multi-armed bandits Henry WJ Reeve (Manchester) (henry.reeve@manchester.ac.uk) A joint work with Joe Mellor (Edinburgh) & Professor Gavin Brown (Manchester) Plan 1. An introduction to

More information

Statistical Modeling with Spline Functions Methodology and Theory

Statistical Modeling with Spline Functions Methodology and Theory This is page 1 Printer: Opaque this Statistical Modeling with Spline Functions Methodology and Theory Mark H Hansen University of California at Los Angeles Jianhua Z Huang University of Pennsylvania Charles

More information

GENREG DID THAT? Clay Barker Research Statistician Developer JMP Division, SAS Institute

GENREG DID THAT? Clay Barker Research Statistician Developer JMP Division, SAS Institute GENREG DID THAT? Clay Barker Research Statistician Developer JMP Division, SAS Institute GENREG WHAT IS IT? The Generalized Regression platform was introduced in JMP Pro 11 and got much better in version

More information

Package survivalmpl. December 11, 2017

Package survivalmpl. December 11, 2017 Package survivalmpl December 11, 2017 Title Penalised Maximum Likelihood for Survival Analysis Models Version 0.2 Date 2017-10-13 Author Dominique-Laurent Couturier, Jun Ma, Stephane Heritier, Maurizio

More information

Package rereg. May 30, 2018

Package rereg. May 30, 2018 Title Recurrent Event Regression Version 1.1.4 Package rereg May 30, 2018 A collection of regression models for recurrent event process and failure time. Available methods include these from Xu et al.

More information

Applied Regression Modeling: A Business Approach

Applied Regression Modeling: A Business Approach i Applied Regression Modeling: A Business Approach Computer software help: SAS SAS (originally Statistical Analysis Software ) is a commercial statistical software package based on a powerful programming

More information

Topics in Machine Learning

Topics in Machine Learning Topics in Machine Learning Gilad Lerman School of Mathematics University of Minnesota Text/slides stolen from G. James, D. Witten, T. Hastie, R. Tibshirani and A. Ng Machine Learning - Motivation Arthur

More information

Nonparametric Survey Regression Estimation in Two-Stage Spatial Sampling

Nonparametric Survey Regression Estimation in Two-Stage Spatial Sampling Nonparametric Survey Regression Estimation in Two-Stage Spatial Sampling Siobhan Everson-Stewart, F. Jay Breidt, Jean D. Opsomer January 20, 2004 Key Words: auxiliary information, environmental surveys,

More information

Lecture 8. Divided Differences,Least-Squares Approximations. Ceng375 Numerical Computations at December 9, 2010

Lecture 8. Divided Differences,Least-Squares Approximations. Ceng375 Numerical Computations at December 9, 2010 Lecture 8, Ceng375 Numerical Computations at December 9, 2010 Computer Engineering Department Çankaya University 8.1 Contents 1 2 3 8.2 : These provide a more efficient way to construct an interpolating

More information

Points Lines Connected points X-Y Scatter. X-Y Matrix Star Plot Histogram Box Plot. Bar Group Bar Stacked H-Bar Grouped H-Bar Stacked

Points Lines Connected points X-Y Scatter. X-Y Matrix Star Plot Histogram Box Plot. Bar Group Bar Stacked H-Bar Grouped H-Bar Stacked Plotting Menu: QCExpert Plotting Module graphs offers various tools for visualization of uni- and multivariate data. Settings and options in different types of graphs allow for modifications and customizations

More information

so f can now be rewritten as a product of g(x) = x 2 and the previous piecewisedefined

so f can now be rewritten as a product of g(x) = x 2 and the previous piecewisedefined Version PREVIEW HW 01 hoffman 575) 1 This print-out should have 9 questions. Multiple-choice questions may continue on the next column or page find all choices before answering. FuncPcwise01a 001 10.0

More information




