Modelling Personalized Screening: a Step Forward on Risk Assessment Methods

Modelling Personalized Screening: a Step Forward on Risk Assessment Methods
Validating Prediction Models
Inmaculada Arostegui
Universidad del País Vasco UPV/EHU
Red de Investigación en Servicios de Salud en Enfermedades Crónicas - REDISSEC
Basque Center for Applied Mathematics - BCAM
38th Annual Conference of the ISCB, Vigo, 9-13 July 2017
I. Arostegui (UPV/EHU) SY2: Validating Prediction Models 1 / 29

Outline
1 Introduction and Motivation
2 CPRs: Validation process
3 Application to eCOPD evolution
4 Discussion

Introduction and Motivation
Prediction models and clinical practice
- Predicting the prognosis of a disease is necessary for screening, prevention and choice of treatment
- The probabilities of diagnostic and prognostic outcomes condition the decision-making process
- Evidence-based medicine applies the scientific method to medical practice
- The trend is towards shared decision-making on choices of diagnostic tests and therapeutic interventions
- Clinical prediction rules may provide the evidence-based input for shared decision-making in clinical practice

Introduction and Motivation
Motivating data: The IRYSS COPD Study
- COPD is a leading chronic condition in many countries
- Exacerbation of COPD (eCOPD) often requires assessment in an emergency department (ED) and hospitalization
- Severe exacerbations lead to death or intubation
- Moderate exacerbations require an adjustment of the therapy
- Exacerbations play a major role in the burden of COPD, its evolution, and its cost
- Physicians must rely largely on their experience and the patient's personal criteria for gauging how an eCOPD will evolve
- A clinical prediction rule for eCOPD evolution would allow physicians to make better informed decisions about treatment
Goal: The development of clinical prediction rules (scores) for risk stratification of patients with eCOPD

Introduction and Motivation
Goal: A method for the development of validated clinical prediction rules (scores) for risk stratification, and to make them available as easy-to-use tools for the clinical decision-making process

CPRs: Validation process (Step-by-step process)
General overview
1 Modeling: Model development and validation
2 Scoring: Score development and validation
3 Stratification: Score categorization and validation

CPRs: Validation process (Model development and validation)
Modeling: Development
- In general: outcome, k predictors, model
- In our case: binary outcome, continuous and categorical predictors, logistic regression model
- Selection of predictors
- Model discrimination: area under the receiver operating characteristic (ROC) curve (AUC)
- Model calibration: calibration plot and Hosmer-Lemeshow (H-L) test

CPRs: Validation process (Model development and validation)
Modeling: Validation
1 Predictors: relationship predictor-outcome; missing values
2 Selection of predictors: stability of the predictors with internal bootstrap validation
3 Overestimation of the AUC:
  - The same data were used for modeling (logistic regression) and discrimination (AUC) purposes
  - Consequently, the AUC is biased
  - Optimism correction for the AUC is proposed: bootstrap bias-correction method (Harrell)
4 Split validation: application to a different sample

CPRs: Validation process (Model development and validation)
Predictors
- Relationship predictor-outcome (logistic function)
  - Linear
  - Non-linear: smooth functions (GAM)
  - Categorize the predictor: look for the optimal categorization
- Missing values
  - Ignore (drop subjects)
  - Imputation techniques
  - Consider a missing category

CPRs: Validation process (Model development and validation)
Selection of predictors: Step 1
STEP 1: Variable selection
- Derivation sample: variables with p-value < 0.20, $(X_1, \ldots, X_n)$
- Generation of 2000 bootstrap samples*: Subsample 1 gives Model 1 with $(\beta_{1,1}, \ldots, \beta_{n,1})$, ..., Subsample 2000 gives Model 2000 with $(\beta_{1,2000}, \ldots, \beta_{n,2000})$
- If $0 \in CI_{80\%}(\beta_i) = (p_{10}, p_{90})_{\beta_i}$, then $X_i$ was not considered for Step 2.0
- If $0 \notin CI_{80\%}(\beta_i) = (p_{10}, p_{90})_{\beta_i}$, then $X_i$ was considered for Step 2.0
*Bootstrap samples: subsamples drawn with replacement (of the same size as the derivation sample)
[Figure: density plots of the bootstrap coefficient estimates]

CPRs: Validation process (Model development and validation)
Selection of predictors: Step 2
STEP 2: Model building
Step 2.j, j = 1, ...
- Start from the risk factors associated with the outcome in Step 2.(j-1): $(X_{r_j}, \ldots, X_{s_j})$, $1 \le r_j < s_j \le n$
- Generation of 2000 NEW bootstrap samples: Subsample 1 gives Model 1 with $(\beta_{1,1}, \ldots, \beta_{n,1})$, ..., Subsample 2000 gives Model 2000 with $(\beta_{1,2000}, \ldots, \beta_{n,2000})$
- If $0 \in CI_{95\%}(\beta_i) = (p_{2.5}, p_{97.5})_{\beta_i}$, then $X_i$ was not considered for Step 2.(j+1)
- If $0 \notin CI_{95\%}(\beta_i) = (p_{2.5}, p_{97.5})_{\beta_i}$, then $X_i$ was considered for Step 2.(j+1)
- Step 2.j is repeated until all the variables in the model verify $0 \notin CI_{95\%}(\beta_i)$, $i \in \{r_j, \ldots, s_j\}$: FINAL MODEL
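The two selection steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the helper names (`fit_logistic`, `bootstrap_keep`, `select_predictors`), the small ridge term that keeps the Newton step invertible, the simulated data, and the reduced number of resamples (40 instead of 2000) are all hypothetical, and the preliminary p-value < 0.20 screening is omitted.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_logistic(X, y, iters=8, ridge=1e-4):
    """Newton-Raphson logistic fit; returns [intercept, slope_1, ...].
    The tiny ridge penalty is an assumption added for numerical stability."""
    rows = [[1.0] + list(x) for x in X]
    d = len(rows[0])
    beta = [0.0] * d
    for _ in range(iters):
        grad = [-ridge * b for b in beta]
        hess = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
        for r, yi in zip(rows, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, r)))
            for i in range(d):
                grad[i] += (yi - p) * r[i]
                for j in range(d):
                    hess[i][j] += p * (1.0 - p) * r[i] * r[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def bootstrap_keep(X, y, cols, level, B, rng):
    """Keep the columns whose bootstrap percentile CI excludes 0."""
    n = len(y)
    draws = {c: [] for c in cols}
    done = 0
    while done < B:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # redraw if a resample contains a single outcome class
        beta = fit_logistic([[X[i][c] for c in cols] for i in idx], yb)
        for c, b in zip(cols, beta[1:]):
            draws[c].append(b)
        done += 1
    tail = (100.0 - level) / 2.0
    return [c for c in cols
            if not (percentile(draws[c], tail) <= 0.0 <= percentile(draws[c], 100.0 - tail))]

def select_predictors(X, y, B=2000, seed=0):
    rng = random.Random(seed)
    cols = bootstrap_keep(X, y, list(range(len(X[0]))), 80.0, B, rng)  # Step 1
    while cols:                                                       # Step 2.j
        kept = bootstrap_keep(X, y, cols, 95.0, B, rng)
        if kept == cols:
            break  # every remaining CI excludes 0: final model
        cols = kept
    return cols

# Simulated data: only column 0 is truly associated with the outcome.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(150)]
y = [1 if rng.random() < sigmoid(-0.5 + 1.5 * x[0]) else 0 for x in X]
selected = select_predictors(X, y, B=40)
print(selected)
```

Using percentile intervals of the bootstrap coefficients, rather than a single-fit p-value, is what makes the selection reflect the stability of each predictor across resamples.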

CPRs: Validation process (Model development and validation)
AUC correction
Step 1: Fit the logistic regression model on the basis of the original sample $\{(x_i, y_i)\}_{i=1}^{N}$ and compute the corresponding AUC, $\widehat{AUC}_{app}$.
Step 2: For $b = 1, \ldots, B$, generate the bootstrap resample $\{(x_i^{b}, y_i^{b})\}_{i=1}^{N}$ by drawing a random sample of size $N$ with replacement from the original sample.
Step 3: Fit the logistic regression model to the bootstrap resample and compute the corresponding AUC, $\widehat{AUC}^{b}_{boot}$.
Step 4: Obtain the predicted probabilities for the original sample based on the fitted logistic regression model obtained in Step 3 and compute the AUC, $\widehat{AUC}^{b}_{o}$.
The optimism $O$ of the original AUC is calculated as
$$O = \frac{1}{B} \sum_{b=1}^{B} \left( \widehat{AUC}^{b}_{boot} - \widehat{AUC}^{b}_{o} \right)$$
and the bias-corrected AUC is then computed as $\widehat{AUC}_{app} - O$.
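A minimal sketch of Steps 1 to 4 in plain Python follows. The function names, the simulated data, the small ridge term in the fitter and the choice of B = 30 are assumptions made for illustration; the AUC is computed with the usual Mann-Whitney pair count.

```python
import math
import random

def auc(scores, labels):
    """Mann-Whitney estimate of the AUC; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_logistic(X, y, iters=8, ridge=1e-4):
    """Newton-Raphson logistic fit (tiny ridge added for stability: an assumption)."""
    rows = [[1.0] + list(x) for x in X]
    d = len(rows[0])
    beta = [0.0] * d
    for _ in range(iters):
        grad = [-ridge * b for b in beta]
        hess = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
        for r, yi in zip(rows, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, r)))
            for i in range(d):
                grad[i] += (yi - p) * r[i]
                for j in range(d):
                    hess[i][j] += p * (1.0 - p) * r[i] * r[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

def predict(beta, x):
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))

def optimism_corrected_auc(X, y, B=200, seed=0):
    rng = random.Random(seed)
    n = len(y)
    beta = fit_logistic(X, y)                                  # Step 1
    auc_app = auc([predict(beta, x) for x in X], y)
    diffs = []
    while len(diffs) < B:
        idx = [rng.randrange(n) for _ in range(n)]             # Step 2
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # redraw if a resample contains a single outcome class
        bb = fit_logistic(Xb, yb)                              # Step 3
        auc_boot = auc([predict(bb, x) for x in Xb], yb)
        auc_o = auc([predict(bb, x) for x in X], y)            # Step 4
        diffs.append(auc_boot - auc_o)
    optimism = sum(diffs) / B
    return auc_app, optimism, auc_app - optimism

# Simulated data: one informative predictor and two noise predictors.
rng = random.Random(2)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(150)]
y = [1 if rng.random() < sigmoid(x[0]) else 0 for x in X]
auc_app, optimism, auc_corrected = optimism_corrected_auc(X, y, B=30)
print(round(auc_app, 3), round(optimism, 3), round(auc_corrected, 3))
```

Noise predictors are included on purpose: the more the model can adapt to the sample at hand, the larger the gap between the bootstrap AUC and the AUC of the same fit on the original sample tends to be.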

CPRs: Validation process (Score development and validation)
Scoring: Development
Step 1: Estimate the parameters of the model $f(y) = \beta_0 + \beta_1 X_1 + \cdots + \beta_n X_n$
Step 2: Determine reference values for each category $j$ of each predictor $X_i$ ($W_{ij}$)
- Dichotomous predictor: reference values are 0/1
- Continuous predictor ($X_i$): categorize in $k$ contiguous classes $(X_{i1}, X_{i2}, \ldots, X_{ik})$
Step 3: Determine the reference value of the base category for each predictor ($W_{iref}$)
Step 4: Set the number of regression units that reflects 1 point in the score ($B$)
Step 5: Weight each category of each predictor by its significance level ($b_{ij}$)
- p > 0.1: $b_{ij}$ = …
- … < p < 0.1: $b_{ij}$ = …
- … < p < 0.05: $b_{ij}$ = …
- … < p < 0.01: $b_{ij}$ = 1.2
- p < …: $b_{ij}$ = 1.4
Step 6: Determine the number of points for each category of each predictor ($S_{ij}$)
$$S_{ij} = \frac{b_{ij} \, \beta_i \, (W_{ij} - W_{iref})}{B}$$
Sullivan et al., Statistics in Medicine
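As a worked illustration of Step 6, the sketch below computes the points for a hypothetical age predictor. The coefficient, the category reference values, the choice of $B$ as the effect of 5 years of age, and the rounding to the nearest integer are all assumptions made for the example, not values from the study.

```python
def category_points(beta, w, w_ref, B, b_weight=1.0):
    """S_ij = b_ij * beta_i * (W_ij - W_iref) / B, rounded to an integer.
    Rounding is an assumption here; the formula above does not show it."""
    return round(b_weight * beta * (w - w_ref) / B)

# Hypothetical predictor: age, with a made-up coefficient per year.
beta_age = 0.05
B = 5 * beta_age  # one point = the effect of 5 years of age (assumed)
age_refs = {"40-49": 45.0, "50-59": 55.0, "60-69": 65.0}  # W_ij (mid-points)
w_ref = age_refs["40-49"]                                 # W_iref (base category)

points = {cat: category_points(beta_age, w, w_ref, B) for cat, w in age_refs.items()}
print(points)

# The same category with a significance weight b_ij = 1.4 gets more points.
weighted = category_points(beta_age, 65.0, w_ref, B, b_weight=1.4)
print(weighted)
```

The base category always receives 0 points, and each patient's total score is the sum of the points of the categories they fall into.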

CPRs: Validation process (Score development and validation)
Scoring: Validation
1 Comparing AUC(model) vs. AUC(score): DeLong test (DeLong et al., Biometrics)
2 Optimism correction for the AUC: bootstrap bias-correction of the overestimation (Harrell)

CPRs: Validation process (Score categorization)
Stratification: Categorization method
- Let $Y$ be a dichotomous response variable and $X$ the continuous score which we want to categorize
- Look for the vector of $k$ optimal cut points $v = (x_1, \ldots, x_k)$ by using genetic algorithms
- The aim is to maximize the AUC of the model
$$P(Y = 1 \mid X_{cat_k}) = \frac{\exp\left(\beta_0 + \sum_{l=1}^{k} \beta_l \mathbf{1}_{\{X_{cat_k} = l\}}\right)}{1 + \exp\left(\beta_0 + \sum_{l=1}^{k} \beta_l \mathbf{1}_{\{X_{cat_k} = l\}}\right)}$$
- The arguments used in developing the genetic algorithm:
  - AUC: function to be maximized
  - $k$: number of parameters to be estimated
  - Range of the score $X$ in which we look for the cut points
- $X_{cat_k}$: the categorized score, taking $k + 1$ values ($l = 0, \ldots, k$)
Barrio et al., Statistical Methods in Medical Research
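The objective being maximized can be illustrated with a small Python sketch. Note that it swaps the genetic algorithm for an exhaustive search over all pairs of candidate cut points, which is only practical for a short integer score and small $k$; the function names and the simulated 0-27 score are assumptions. Because the model has one indicator per category, its fitted probability in each category equals the observed event rate there, so the sketch uses those rates directly instead of fitting the model iteratively.

```python
import math
import random
from bisect import bisect_right
from itertools import combinations

def auc(scores, labels):
    """Mann-Whitney estimate of the AUC; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def categorical_auc(x, y, cuts):
    """AUC of the logistic model with the categorized score as a factor."""
    cats = [bisect_right(cuts, v) for v in x]  # category l = number of cuts <= v
    n = [0] * (len(cuts) + 1)
    events = [0] * (len(cuts) + 1)
    for c, yi in zip(cats, y):
        n[c] += 1
        events[c] += yi
    # Fitted probability per category = observed event rate in that category.
    rate = [e / m if m else 0.0 for e, m in zip(events, n)]
    return auc([rate[c] for c in cats], y)

def best_cut_points(x, y, k):
    """Exhaustive search (not a genetic algorithm) over observed score values."""
    candidates = sorted(set(x))[1:]  # a cut at the minimum leaves category 0 empty
    best = max(combinations(candidates, k), key=lambda c: categorical_auc(x, y, c))
    return list(best), categorical_auc(x, y, best)

# Simulated integer score on 0-27 with risk increasing in the score.
rng = random.Random(3)
score = [rng.randint(0, 27) for _ in range(120)]
outcome = [1 if rng.random() < 1 / (1 + math.exp(4 - 0.3 * s)) else 0 for s in score]
cuts, best_auc = best_cut_points(score, outcome, k=2)
print(cuts, round(best_auc, 3))
```

With more cut points or a continuous score the number of combinations grows quickly, which is the situation where a stochastic search such as a genetic algorithm becomes attractive.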

CPRs: Validation process (Score categorization)
Risk stratification
- Continuous score: $X$
- After categorization: $X_{cat_k}$ (k = 4)
- 4 risk categories: low - moderate - high - very high
- Comparing AUC($X_{cat_4}$) vs. AUC($X$): DeLong test
- Optimism correction for the AUC: modified Harrell's proposal
- Evaluation of the integrated discrimination improvement (IDI)
Steyerberg et al., Epidemiology
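For reference, the IDI is commonly computed as the difference in discrimination slopes between two sets of predicted probabilities, where the discrimination slope is the mean predicted risk among events minus the mean predicted risk among non-events. The sketch below uses that common definition with made-up probabilities; the function names are hypothetical and the exact variant used in the study is not stated on the slide.

```python
def discrimination_slope(probs, y):
    """Mean predicted risk among events minus mean among non-events."""
    events = [p for p, yi in zip(probs, y) if yi == 1]
    non_events = [p for p, yi in zip(probs, y) if yi == 0]
    return sum(events) / len(events) - sum(non_events) / len(non_events)

def idi(p_new, p_old, y):
    """Integrated discrimination improvement of p_new over p_old."""
    return discrimination_slope(p_new, y) - discrimination_slope(p_old, y)

# Made-up predicted probabilities for four patients (two events).
y = [0, 0, 1, 1]
p_old = [0.2, 0.4, 0.6, 0.8]
p_new = [0.1, 0.3, 0.7, 0.9]
idi_value = idi(p_new, p_old, y)
print(round(idi_value, 3))
```

A value close to zero when comparing the categorized score with the continuous score would suggest that little discrimination is lost by categorizing.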

Application to eCOPD evolution: Data
Description of the IRYSS-COPD Study
- Prospective cohort of patients with eCOPD (n = 2487)
- Outcome: short-term mortality
- Potential predictors: 16 clinical variables collected from medical records and direct interview (age, baseline FEV1%, dyspnea, comorbidities, arterial blood gases, ...)
Goal: The development of a clinical prediction rule for short-term mortality of patients with eCOPD
Quintana et al., BMC Health Services Research

Application to eCOPD evolution: Methods
- Modeling
- Scoring
- Stratification
- Implementation

Application to eCOPD evolution: Results
Model development and validation
- AUC (Model) = 0.85, CI95% = ( )
- H-L test: p =

Application to eCOPD evolution: Results
Scoring: development and validation
- Score: 0-27
- AUC (Score) = 0.84, CI95% = ( )
- DeLong test (score vs. model): p =

Application to eCOPD evolution: Results
Scoring: development and validation

Application to eCOPD evolution: Results
Risk stratification
- Subsample 2
- AUC (Score) = 0.84, CI95% = ( )
- AUC (Categorical Score) = 0.84, CI95% = ( )
- DeLong test (categorical vs. score): p =

Application to eCOPD evolution: Results
Risk stratification

Application to eCOPD evolution: Computer tool PrEveCOPD
Implementation: PrEveCOPD App
- Windows (installable version and web application)
- Available at:

Application to eCOPD evolution: Computer tool PrEveCOPD
Implementation: PrEveCOPD App
- Android: available at Google Play

Discussion
Validation step-by-step
1 Modeling: Proper validation of a prediction model can lead to better and more stable discrimination ability
2 Scoring: A prediction model can be summarized into a valid and easy-to-obtain clinical prediction rule (score)
3 Stratification: Categorization of the score allows for valid stratification of patients by risk
4 Implementation: An easy-to-use computer application can guide the medical decision process in clinical practice

Discussion
Conclusions
1 The proposed methodology as a whole allows for valid stratification of patients with eCOPD by their risk of short-term mortality
2 The PrEveCOPD computer tool can guide the medical decision process at the patient's arrival at the ED

Discussion
Is it finished? External validation
The CPR performs well across samples from different but related source populations (transportability)
1 Relatedness of original (derivation) and new (validation) samples
2 Assessment of the CPR's performance in the new study
3 Interpretation of the results: correction of poor performance if necessary
External validation is missing! Waiting for a new sample

Thank you!

Evaluating generalization (validation) Harvard-MIT Division of Health Sciences and Technology HST.951J: Medical Decision Support

Evaluating generalization (validation) Harvard-MIT Division of Health Sciences and Technology HST.951J: Medical Decision Support Evaluating generalization (validation) Harvard-MIT Division of Health Sciences and Technology HST.951J: Medical Decision Support Topics Validation of biomedical models Data-splitting Resampling Cross-validation

More information

(R) / / / / / / / / / / / / Statistics/Data Analysis

(R) / / / / / / / / / / / / Statistics/Data Analysis (R) / / / / / / / / / / / / Statistics/Data Analysis help incroc (version 1.0.2) Title incroc Incremental value of a marker relative to a list of existing predictors. Evaluation is with respect to receiver

More information

Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing

Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Generalized Additive Model and Applications in Direct Marketing Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Abstract Logistic regression 1 has been widely used in direct marketing applications

More information

Statistical Analysis Using Combined Data Sources: Discussion JPSM Distinguished Lecture University of Maryland

Statistical Analysis Using Combined Data Sources: Discussion JPSM Distinguished Lecture University of Maryland Statistical Analysis Using Combined Data Sources: Discussion 2011 JPSM Distinguished Lecture University of Maryland 1 1 University of Michigan School of Public Health April 2011 Complete (Ideal) vs. Observed

More information

Lecture 25: Review I

Lecture 25: Review I Lecture 25: Review I Reading: Up to chapter 5 in ISLR. STATS 202: Data mining and analysis Jonathan Taylor 1 / 18 Unsupervised learning In unsupervised learning, all the variables are on equal standing,

More information

PART III APPLICATIONS

PART III APPLICATIONS S. Vieira PART III APPLICATIONS Fuzz IEEE 2013, Hyderabad India 1 Applications Finance Value at Risk estimation based on a PFS model for density forecast of a continuous response variable conditional on

More information

8/3/2017. Contour Assessment for Quality Assurance and Data Mining. Objective. Outline. Tom Purdie, PhD, MCCPM

8/3/2017. Contour Assessment for Quality Assurance and Data Mining. Objective. Outline. Tom Purdie, PhD, MCCPM Contour Assessment for Quality Assurance and Data Mining Tom Purdie, PhD, MCCPM Objective Understand the state-of-the-art in contour assessment for quality assurance including data mining-based techniques

More information

Statistical Matching using Fractional Imputation

Statistical Matching using Fractional Imputation Statistical Matching using Fractional Imputation Jae-Kwang Kim 1 Iowa State University 1 Joint work with Emily Berg and Taesung Park 1 Introduction 2 Classical Approaches 3 Proposed method 4 Application:

More information

* * * * * * * * * * * * * * * ** * **

* * * * * * * * * * * * * * * ** * ** Generalized additive models Trevor Hastie and Robert Tibshirani y 1 Introduction In the statistical analysis of clinical trials and observational studies, the identication and adjustment for prognostic

More information

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review Gautam Kunapuli Machine Learning Data is identically and independently distributed Goal is to learn a function that maps to Data is generated using an unknown function Learn a hypothesis that minimizes

More information

proc: display and analyze ROC curves

proc: display and analyze ROC curves proc: display and analyze ROC curves Tools for visualizing, smoothing and comparing receiver operating characteristic (ROC curves). (Partial) area under the curve (AUC) can be compared with statistical

More information

Facts about long term conditions. Our approach

Facts about long term conditions. Our approach Introduction my mhealth create digital solutions for the management of long term conditions. Evidence based, and highly secure, my mhealth provide a suite of solutions to manage patients with COPD, Asthma,

More information

mhealth & integrated care

mhealth & integrated care mhealth & integrated care 2nd Shiraz International mhealth Congress February 22th, 23th 2017, Shiraz - Iran Nick Guldemond Associate Professor Integrated Care & Technology Roadmap 1 Healthcare paradigm

More information

Subgroup identification for dose-finding trials via modelbased recursive partitioning

Subgroup identification for dose-finding trials via modelbased recursive partitioning Subgroup identification for dose-finding trials via modelbased recursive partitioning Marius Thomas, Björn Bornkamp, Heidi Seibold & Torsten Hothorn ISCB 2017, Vigo July 10, 2017 Motivation Characterizing

More information

Fast or furious? - User analysis of SF Express Inc

Fast or furious? - User analysis of SF Express Inc CS 229 PROJECT, DEC. 2017 1 Fast or furious? - User analysis of SF Express Inc Gege Wen@gegewen, Yiyuan Zhang@yiyuan12, Kezhen Zhao@zkz I. MOTIVATION The motivation of this project is to predict the likelihood

More information

An imputation approach for analyzing mixed-mode surveys

An imputation approach for analyzing mixed-mode surveys An imputation approach for analyzing mixed-mode surveys Jae-kwang Kim 1 Iowa State University June 4, 2013 1 Joint work with S. Park and S. Kim Ouline Introduction Proposed Methodology Application to Private

More information

JMP Clinical. Release Notes. Version 5.0

JMP Clinical. Release Notes. Version 5.0 JMP Clinical Version 5.0 Release Notes Creativity involves breaking out of established patterns in order to look at things in a different way. Edward de Bono JMP, A Business Unit of SAS SAS Campus Drive

More information

Learning and Evaluating Classifiers under Sample Selection Bias

Learning and Evaluating Classifiers under Sample Selection Bias Learning and Evaluating Classifiers under Sample Selection Bias Bianca Zadrozny IBM T.J. Watson Research Center, Yorktown Heights, NY 598 zadrozny@us.ibm.com Abstract Classifier learning methods commonly

More information

RADIOMICS: potential role in the clinics and challenges

RADIOMICS: potential role in the clinics and challenges 27 giugno 2018 Dipartimento di Fisica Università degli Studi di Milano RADIOMICS: potential role in the clinics and challenges Dr. Francesca Botta Medical Physicist Istituto Europeo di Oncologia (Milano)

More information

Evaluation Metrics. (Classifiers) CS229 Section Anand Avati

Evaluation Metrics. (Classifiers) CS229 Section Anand Avati Evaluation Metrics (Classifiers) CS Section Anand Avati Topics Why? Binary classifiers Metrics Rank view Thresholding Confusion Matrix Point metrics: Accuracy, Precision, Recall / Sensitivity, Specificity,

More information

Integrating a mobile health setup in a chronic disease management network

Integrating a mobile health setup in a chronic disease management network Integrating a mobile health setup in a chronic disease management network Hang Ding, Derek, Ireland, Rajiv Jayasena, Jamie Curmi, and Mohan Karunanithi. Presenter: Hang Ding (hang.ding@csiro.au) THE AUSTRALIAN

More information

in this course) ˆ Y =time to event, follow-up curtailed: covered under ˆ Missing at random (MAR) a

in this course) ˆ Y =time to event, follow-up curtailed: covered under ˆ Missing at random (MAR) a Chapter 3 Missing Data 3.1 Types of Missing Data ˆ Missing completely at random (MCAR) ˆ Missing at random (MAR) a ˆ Informative missing (non-ignorable non-response) See 1, 38, 59 for an introduction to

More information

Resampling Methods. Levi Waldron, CUNY School of Public Health. July 13, 2016

Resampling Methods. Levi Waldron, CUNY School of Public Health. July 13, 2016 Resampling Methods Levi Waldron, CUNY School of Public Health July 13, 2016 Outline and introduction Objectives: prediction or inference? Cross-validation Bootstrap Permutation Test Monte Carlo Simulation

More information

RESAMPLING METHODS. Chapter 05

RESAMPLING METHODS. Chapter 05 1 RESAMPLING METHODS Chapter 05 2 Outline Cross Validation The Validation Set Approach Leave-One-Out Cross Validation K-fold Cross Validation Bias-Variance Trade-off for k-fold Cross Validation Cross Validation

More information

Cross-validation and the Bootstrap

Cross-validation and the Bootstrap Cross-validation and the Bootstrap In the section we discuss two resampling methods: cross-validation and the bootstrap. These methods refit a model of interest to samples formed from the training set,

More information

Big Data Methods. Chapter 5: Machine learning. Big Data Methods, Chapter 5, Slide 1

Big Data Methods. Chapter 5: Machine learning. Big Data Methods, Chapter 5, Slide 1 Big Data Methods Chapter 5: Machine learning Big Data Methods, Chapter 5, Slide 1 5.1 Introduction to machine learning What is machine learning? Concerned with the study and development of algorithms that

More information

Lecture 1: Statistical Reasoning 2. Lecture 1. Simple Regression, An Overview, and Simple Linear Regression

Lecture 1: Statistical Reasoning 2. Lecture 1. Simple Regression, An Overview, and Simple Linear Regression Lecture Simple Regression, An Overview, and Simple Linear Regression Learning Objectives In this set of lectures we will develop a framework for simple linear, logistic, and Cox Proportional Hazards Regression

More information

Regulatory Aspects of Digital Healthcare Solutions

Regulatory Aspects of Digital Healthcare Solutions Regulatory Aspects of Digital Healthcare Solutions TÜV SÜD Product Service GmbH Dr. Markus Siebert Rev. 02 / 2017 02.05.2017 TÜV SÜD Product Service GmbH Slide 1 Contents Digital solutions as Medical Device

More information

Lecture 27: Review. Reading: All chapters in ISLR. STATS 202: Data mining and analysis. December 6, 2017

Lecture 27: Review. Reading: All chapters in ISLR. STATS 202: Data mining and analysis. December 6, 2017 Lecture 27: Review Reading: All chapters in ISLR. STATS 202: Data mining and analysis December 6, 2017 1 / 16 Final exam: Announcements Tuesday, December 12, 8:30-11:30 am, in the following rooms: Last

More information

Training Guide on. Searching the Cochrane Library-

Training Guide on. Searching the Cochrane Library- Training Guide on Searching the Cochrane Library- www.thecochranelibrary.com November 2012 This guide is prepared by Daphne Grey and Ziba Nadimi, members of Clinical Librarians and Information Skills Trainers

More information

Cross-validation and the Bootstrap

Cross-validation and the Bootstrap Cross-validation and the Bootstrap In the section we discuss two resampling methods: cross-validation and the bootstrap. 1/44 Cross-validation and the Bootstrap In the section we discuss two resampling

More information

Global Telemedicine Market (Telehome and TeleHospital): Size, Trends & Forecasts ( ) March 2017

Global Telemedicine Market (Telehome and TeleHospital): Size, Trends & Forecasts ( ) March 2017 Global Telemedicine Market (Telehome and TeleHospital): Size, Trends & Forecasts (2017-2021) March 2017 Global Telemedicine Market Report Scope of the Report The report entitled Global Telemedicine Market:

More information

Office of Human Research

Office of Human Research Office of Human Research JeffTrial End-User Training Document Regulatory Coordinator Training for Non-Oncology personnel Office of Human Research 8/16/2013 Ver. 1.0 Contents The REG Role: Completing Basic

More information

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset.

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset. Glossary of data mining terms: Accuracy Accuracy is an important factor in assessing the success of data mining. When applied to data, accuracy refers to the rate of correct values in the data. When applied

More information

Gene signature selection to predict survival benefits from adjuvant chemotherapy in NSCLC patients

Gene signature selection to predict survival benefits from adjuvant chemotherapy in NSCLC patients 1 Gene signature selection to predict survival benefits from adjuvant chemotherapy in NSCLC patients 1,2 Keyue Ding, Ph.D. Nov. 8, 2014 1 NCIC Clinical Trials Group, Kingston, Ontario, Canada 2 Dept. Public

More information

Nonparametric Approaches to Regression

Nonparametric Approaches to Regression Nonparametric Approaches to Regression In traditional nonparametric regression, we assume very little about the functional form of the mean response function. In particular, we assume the model where m(xi)

More information

2017 Partners in Excellence Executive Overview, Targets, and Methodology

2017 Partners in Excellence Executive Overview, Targets, and Methodology 2017 Partners in Excellence Executive Overview, s, and Methodology Overview The Partners in Excellence program forms the basis for HealthPartners financial and public recognition for medical or specialty

More information

Model-based Recursive Partitioning for Subgroup Analyses

Model-based Recursive Partitioning for Subgroup Analyses EBPI Epidemiology, Biostatistics and Prevention Institute Model-based Recursive Partitioning for Subgroup Analyses Torsten Hothorn; joint work with Heidi Seibold and Achim Zeileis 2014-12-03 Subgroup analyses

More information

Mark Pearson. Peteris Zilgavis

Mark Pearson. Peteris Zilgavis Elizabeth Kuiper Director European Affairs, EFPIA Mark Pearson Deputy Director, Employment Labour and Social Affairs, OECD Peteris Zilgavis Head of Unit Health and Well- Being DG CONNECT, European Commission

More information

Statistical Consulting Topics Using cross-validation for model selection. Cross-validation is a technique that can be used for model evaluation.

Statistical Consulting Topics Using cross-validation for model selection. Cross-validation is a technique that can be used for model evaluation. Statistical Consulting Topics Using cross-validation for model selection Cross-validation is a technique that can be used for model evaluation. We often fit a model to a full data set and then perform

More information

EVIDENCE SEARCHING IN EBM. By: Masoud Mohammadi

EVIDENCE SEARCHING IN EBM. By: Masoud Mohammadi EVIDENCE SEARCHING IN EBM By: Masoud Mohammadi Steps in EBM Auditing the outcome Defining the question or problem Applying the results Searching for the evidence Critically appraising the literature Clinical

More information

Dimension Reduction for Big Data Analysis. Dan Shen. Department of Mathematics & Statistics University of South Florida.

Dimension Reduction for Big Data Analysis. Dan Shen. Department of Mathematics & Statistics University of South Florida. Dimension Reduction for Big Data Analysis Dan Shen Department of Mathematics & Statistics University of South Florida danshen@usf.edu October 24, 2014 1 Outline Multiscale weighted PCA for Image Analysis

More information

Future Integrated Sensors in Wireless Health. William J. Kaiser UCLA Wireless Health Institute (WHI)

Future Integrated Sensors in Wireless Health. William J. Kaiser UCLA Wireless Health Institute (WHI) Future Integrated Sensors in Wireless Health William J. Kaiser UCLA Wireless Health Institute (WHI) Wireless Health Home Glucometer Body Area and Local Area Wireless Clinic Exercise equipment Weight Scale

More information

Module I: Clinical Trials a Practical Guide to Design, Analysis, and Reporting 1. Fundamentals of Trial Design

Module I: Clinical Trials a Practical Guide to Design, Analysis, and Reporting 1. Fundamentals of Trial Design Module I: Clinical Trials a Practical Guide to Design, Analysis, and Reporting 1. Fundamentals of Trial Design Randomized the Clinical Trails About the Uncontrolled Trails The protocol Development The

More information

CTSI Module 8 Workshop Introduction to Biomedical Informatics, Part V

CTSI Module 8 Workshop Introduction to Biomedical Informatics, Part V CTSI Module 8 Workshop Introduction to Biomedical Informatics, Part V Practical Tools: Data Processing & Analysis William Hsu, PhD Assistant Professor Medical Imaging Informatics Group Dept of Radiological

More information

Acknowledgments. Acronyms

Acknowledgments. Acronyms Acknowledgments Preface Acronyms xi xiii xv 1 Basic Tools 1 1.1 Goals of inference 1 1.1.1 Population or process? 1 1.1.2 Probability samples 2 1.1.3 Sampling weights 3 1.1.4 Design effects. 5 1.2 An introduction

More information

Estimating survival from Gray s flexible model. Outline. I. Introduction. I. Introduction. I. Introduction

Estimating survival from Gray s flexible model. Outline. I. Introduction. I. Introduction. I. Introduction Estimating survival from s flexible model Zdenek Valenta Department of Medical Informatics Institute of Computer Science Academy of Sciences of the Czech Republic I. Introduction Outline II. Semi parametric

More information

High Value Reports in HCT Status Update Feb 2016

High Value Reports in HCT Status Update Feb 2016 High Value Reports in HCT Status Update 2015 Feb 2016 1 Highlights of SCTOD expectations Collect data (and specimens) ALL allogeneic HCTs with a U.S. recipient or donor Related donor-recipient repository

More information

Business Models for ehealth Initial Overview of Results Brussels 17 November 2009 SIMPHS Validation Workshop
