On a Data-Mining Concept to Analyse Epidemiologic Data Using SAS Software. Hans-Peter Altenburg Mannheim

1 On a Data-Mining Concept to Analyse Epidemiologic Data Using SAS Software Hans-Peter Altenburg Mannheim hpa-ma@gmx.de

2 Outline
- Introduction
- Data Base: EPIC Cohort Study
- Problem to be solved / Task
- Concept for Program Organization
- Statistical and epidemiological Background
- Realization within the SAS System
- Example

3 Introduction
Task: On the basis of a Europe-wide distributed cohort, find relationships and new findings linking lifestyle and dietary intake to the development of (cancer) diseases.

4 Examples
- Dietary intake -> stomach / colon cancer
- Intake of fat or meat -> colon, breast, prostate carcinoma
- Processed / salted meat or food -> stomach carcinoma
- Smoking (active / passive) -> lung cancer
The underlying biological mechanisms are more complex than one might expect from chemoprevention studies ("one molecule, one effect").
EPIC Study

5 Introduction
SAS objective: Build a flexible tool to detect and analyse relationships on the basis of epidemiologic and statistical analysis criteria.

6 Introduction
Examination criteria: risk measures of epidemiology
- Hazard ratio / relative risk
- Odds ratio
- Attributable risk
- ...
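For orientation, the risk measures named on this slide can be written out as below. The 2x2 notation is an assumption made here and does not appear on the slides: a = exposed cases, b = exposed non-cases, c = unexposed cases, d = unexposed non-cases. Attributable risk is shown in one common form (the risk difference); other definitions exist.

```latex
\[
\mathrm{RR} = \frac{a/(a+b)}{c/(c+d)}, \qquad
\mathrm{OR} = \frac{a\,d}{b\,c}, \qquad
\mathrm{AR} = \frac{a}{a+b} - \frac{c}{c+d}
\]
\[
\mathrm{HR}(t) = \frac{h_{1}(t)}{h_{0}(t)}
\quad \text{(hazard in the exposed group divided by hazard in the reference group)}
\]
```

A value of 1 for RR, OR or HR (or 0 for AR) corresponds to no association between exposure and disease.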

7 Data Base

8 EPIC: European Prospective Investigation into Cancer and Nutrition
I.A.R.C. / W.H.O. / Europe Against Cancer / European Commission
Study design: cohort study

9 EPIC Study
European Prospective Investigation into Cancer and Nutrition
- Cohort study
- Europe-wide: 9 countries, 29 centres
- ~ participants
- ~ variables (dependent on tumor type / analysis / objective)

11 EPIC: Background
Relation: dietary intake and cancer
Problem: the etiologic importance of single food intake components, as well as their quantitative contribution to cancer development, e.g. the influence of the fat proportion in the food, the intake of vitamins, etc.

12 Task
Develop a concept to analyse data and to search for new and/or novel relationships.
- Base: EPIC cohort, SAS system
- Start: analysis of fruit and vegetable intake and lung cancer

13 SAS Task
The whole concept should be flexibly adaptable.
Cohort study implies changes in:
- Data
- Data management
- Variable names (for instance main objective criterion), e.g. caselung caselun, casepanc, caselymp, etc.
- Formats
- Inclusion / exclusion of variables / objects
- ...

14 SAS Task
Examples for exclusion criteria:
- Prevalent cancer cases
- Centre Malmö participants (no dietary data)
- Country Greece (no dietary data)
- Persons with extreme EI/ER ratio
- ...
EI/ER: Energy Intake / Energy Requirement
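Exclusion criteria of this kind could be applied in one DATA step before any modelling. The following is only a sketch: the data set names (work.cohort, work.analysis), the variables (prev_cancer, centre, country, ei_er), their codings and the EI/ER cut-offs are all hypothetical and are not taken from the EPIC data.

```sas
/* Sketch only: every name and cut-off below is a hypothetical placeholder */
%LET ei_er_low  = 0.5 ;   /* assumed lower bound for a plausible EI/ER ratio */
%LET ei_er_high = 2.5 ;   /* assumed upper bound for a plausible EI/ER ratio */

DATA work.analysis ;
  SET work.cohort ;
  IF prev_cancer = 1 THEN DELETE ;       /* prevalent cancer cases          */
  IF centre  = 'MALMO'  THEN DELETE ;    /* centre without dietary data     */
  IF country = 'GREECE' THEN DELETE ;    /* country without dietary data    */
  /* extreme EI/ER ratio; a missing ei_er compares as smaller than any
     number in SAS and is therefore dropped as well */
  IF ei_er < &ei_er_low OR ei_er > &ei_er_high THEN DELETE ;
RUN ;
```

Keeping the cut-offs in macro variables follows the deck's general idea of presetting everything that may change between analyses.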

15 SAS Task
Changes in:
- Objective variable, e.g. initially lung cancer, followed by stomach Ca, lymphoma, etc.
- e.g. reference group: first quantile (group) or lowest (consumption) group
- Influence / effect variables
- Adjustment variables
- Stratification
- Grouping in the analysis step
- ...

16 Basic: Statistical Procedures
- Tools of descriptive epidemiology, such as disease incidence or similar measures, graphics, etc.
- Analytic analysis tools: determination of hazard ratios using the Cox proportional hazards model

17 Statistical Analysis
- Estimation procedure for the hazard ratios: Cox proportional hazards regression
- Dependent variable: age
- Analysis / mining process: flexible handling of distinct factors for the statistical modelling
  - Data preparation / management
  - Stratification
  - Adjustment
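As a reminder of what is being estimated, a stratified Cox proportional hazards model with age as the time scale can be written as below. The symbols are assumptions of this note and not the author's notation: s is the stratum, t is age, x_1, ..., x_p are the effect and adjustment variables, and h_{0s} is the unspecified baseline hazard of stratum s.

```latex
\[
h_{s}(t \mid x) = h_{0s}(t)\,\exp\!\left(\beta_{1}x_{1} + \dots + \beta_{p}x_{p}\right)
\]
\[
\widehat{\mathrm{HR}}_{j} = \exp\!\big(\hat{\beta}_{j}\big), \qquad
95\%\ \text{CL:}\quad \exp\!\big(\hat{\beta}_{j} \pm 1.96\,\widehat{\mathrm{SE}}(\hat{\beta}_{j})\big)
\]
```

The hazard ratio for variable j is thus the exponentiated regression coefficient, and the Wald-type confidence limits shown are the usual ones reported alongside it.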

18 Statistical Analysis
Flexible handling of
- Stratification: centres / countries / Northern / Southern Europe / gender
- Adjustments:
  - follow-up time
  - primary / secondary influence variables
  - weight, height, physical activity
  - consumption of additional dietary components, etc.
  - smoking
  - ...

19 Example: time-dependent Cox Model
- Time scale: age
- Objective variable: caselun
- Time-dependent covariable: length of follow-up
- Stratification: country
- Adjustment for: energy intake and sex

PROC PHREG DATA=puk.stan ;
  MODEL (age, agexit) * caselun(0) = qg07 qener sex inf ins / RISKLIMITS ;
  /* follow-up indicators: exit within the first / second year after entry */
  inf = (agexit < age + 1) ;
  ins = (age + 1 <= agexit < age + 2) ;
  STRATA country ;
RUN ;

20 Concept / Organization
Environment / file structure:
- per objective variable: Lyon delivers a Data Mart from an Oracle DB with the variables / formats required for a specified analysis
- per objective variable: the data are stored in a SAS data table (rectangular structure) in a specific directory
- for all data structures: unified format structure (Lyon formats and own additional formats)
- per objective variable / analysis objective: there exists a special autoexec.sas file, where all macro variables are preset

21 Example autoexecp.sas, e.g. Pancreas Carcinoma:

/* */
%LET calibr=panc ; * lung panc lymp ;
LIBNAME &calibr "D:\EPIC_&calibr\&..." ;
/* */
/* double quotes, so that &calibr is resolved inside the path */
LIBNAME hpafmt "D:\EPIC_&calibr\HPA_FORMATS" ;
LIBNAME lyonfmt "D:\EPIC_&calibr\FORMATS" ;
/* */
OPTIONS pagesize=56 linesize=80 nonumber nodate nocenter probsig=1
        FMTSEARCH=(lyonfmt hpafmt)
        MAUTOSOURCE SASAUTOS=('D:\sasdat\m', sasautos) ;
%INCLUDE 'D:\sasdat\m\mgopt.mac' ;
DM 'PGM ; ZOOM ON ; ' ;
/* */
%INCLUDE "D:\EPIC_&calibr\&P\&ds_info..sas" ;

22 Data Mining Process: Problems
Cox model analysis for many variants and / or combinations according to:
- Objective criterion
- Influence / effect variables
- Stratification
- Adjustments
- ...
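One way to cope with many model variants is to let a macro loop over a list of alternatives. The sketch below varies only the stratification. The macro name run_variants, its parameters and the "|"-separated list convention are assumptions made for illustration; this is not the %PH_std macro used later in the deck, and the data set and variable names are simply borrowed from the Cox example on slide 19.

```sas
/* Sketch only: macro name, parameters and list convention are hypothetical */
%MACRO run_variants(dset=, cavar=, strata_list=, effects=, adjust=) ;
  %LOCAL i strata ;
  %LET i = 1 ;
  %LET strata = %SCAN(&strata_list, &i, %STR(|)) ;
  /* one PROC PHREG run per "|"-separated stratification variant */
  %DO %WHILE (%LENGTH(&strata) > 0) ;
    TITLE "Cox model for &cavar, strata: &strata" ;
    PROC PHREG DATA=&dset ;
      MODEL (age, agexit) * &cavar(0) = &effects &adjust / RISKLIMITS ;
      STRATA &strata ;
    RUN ;
    %LET i = %EVAL(&i + 1) ;
    %LET strata = %SCAN(&strata_list, &i, %STR(|)) ;
  %END ;
  TITLE ;   /* reset the title after the last variant */
%MEND run_variants ;

/* Example call: two stratification variants of the same model */
%run_variants(dset=puk.stan, cavar=caselun,
              strata_list=country | sex country,
              effects=qg07, adjust=qener) ;
```

The same pattern could be nested to vary the objective variable or the adjustment set as well; the number of runs then grows multiplicatively, which is the problem this slide points at.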

23 SAS Program: Macros / variables

/* Hazard Ratios
   Heavy Drinkers
   BMI, diabetes, Smoking (Y/N), groups
   for Men, Women and gender stratified
   stratified for cntr_c
   adjusted (details see below) */
%LET dset=&calibr..&dsname ;
/* */
OPTIONS source2 MPRINT ;
%INCLUDE 'D:\sasdat\mepi\PH_std.sas' ;
%INCLUDE 'D:\sasdat\mepi\HR_table.sas' ;
* Vars ;
%LET v_wght=obesity ovweight bmi_over25 ;
%LET v_khk =diabetes ;
%LET v_consvar=hdrinker ;

24 SAS Program: Variables / Output data sets

* Smoking Vars ;
%LET v_sm = smoker exsmoker ; * nosmoker ;
%LET v_avnbcig = avncig_c0 avncig_c1 avncig_c2 avncig_c3 avncig_c4 avncig_c5 ; * avncig_c0 ;
* Age var ;
%LET agevar=age ; * ;
* Name conventions ;
%LET adjn =PadjV ;       /* Name of data set */
%LET adjdn =BDSm ;       /* Name of data set adj vars */
%LET adjvt =_ ;          /* text adj var */
%LET adjt =&adjvt ;      /* title text add adj var */
%LET repn =Report0406 ;  /* Name of report directory */
%LET e_name=av ;         * dset name for estvalues, rtf doc etc. ;
%LET dsn_out=&e_name.&adjn ; * output name (results dset) ;

25 SAS Program: Adjustment / Strata

/* Adjustment Variables in Data Set: */
%LET adjadd =Hdrinker bmi_over25 diabetes ;
/* Use upper case letters only for IARC Vars, e.g. HEIGHT_C WEIGHT_C */
%LET adjust1= &v_sm ;
%LET adjust= &adjadd &adjust1 ; /* HEIGHT_C WEIGHT_C QENER ener_fat enernfat */
* Reference text: ;
%LET reference=nonconsumers ; * Nonconsumers 1st Quintile ;
%LET var_text=selected Food Groups ; * for Headline summary table ;
%LET keepvars=sex agexit ; * country smoke_r e_south ; /* Keep Variables */
/* Strata */
%LET strata1=cntr_c ;
%LET strata2=cntr_c ;
%LET strata3=sex cntr_c ;
/* Data Title */
%LET dtitle1=men only ;
%LET dtitle2=women only ;
%LET dtitle3=gender Strata ;

26 SAS Program: Title / Footnote

/* Project */
%LET project1=&dtitle1 / adj. for: &adjvt - &adjt ;
%LET project2=&dtitle2 / adj. for: &adjvt - &adjt ;
%LET project3=&dtitle3 / adj. for: &adjvt - &adjt ;
/* Data Set for summary table */
%LET estvalues1=&calibr..&dsn_out._&adjdn._m ;
%LET estvalues2=&calibr..&dsn_out._&adjdn._f ;
%LET estvalues3=&calibr..&dsn_out._&adjdn._g_strata ;
/* Rtf File */
%LET rtffile1=d:\epic_&calibr\&repn\&dsn_out._&adjdn._m.rtf ;
%LET rtffile2=d:\epic_&calibr\&repn\&dsn_out._&adjdn._f.rtf ;
%LET rtffile3=d:\epic_&calibr\&repn\&dsn_out._&adjdn._g_strata.rtf ;

27 SAS Program: Introductory text / Analysis

ODS RTF FILE="&rtffile1" ; * < Men ;
%LET est_cum=&estvalues1 ;
%LET dtitle=&dtitle1 ;
DATA _NULL_ ;
  FILE PRINT ;
  PUT / 'DATA ANALYSIS' /// "&Ca_text" // "&ds_recv" /// "&dtitle" ///
      /// "Hazard Ratios" // "Project: &project1" ///
      'adjusted for:' /// "&adjust" / ;
RUN ;
%LET where_c=%str(sex=1) ; * < Men ;
%LET strata =&strata1 ;
/* double dot: the first ends the macro variable name, the second is the file extension dot */
%INCLUDE "D:\EPIC_&calibr\&P\&P_QG\&P..sas" ; * ;
/* */
ODS RTF CLOSE ;

28 SAS Program: Introductory text / Analysis

ODS RTF FILE="&rtffile3" ; * < Gender Strata --- ;
%LET est_cum=&estvalues3 ;
%LET dtitle=&dtitle3 ;
DATA _NULL_ ;
  FILE PRINT ;
  PUT / 'DATA ANALYSIS' /// "&Ca_text" // "&ds_recv" /// "&dtitle" ///
      /// "Hazard Ratios" // "Project: &project3" ///
      'adjusted for:' /// "&adjust" / ;
RUN ;
%LET where_c= ; * < Gender Strata --- ;
%LET strata =&strata3 ;
/* P_dummy is read as one macro variable name; the double dot keeps the .sas extension */
%INCLUDE "D:\EPIC_&calibr\&P\&P_QG\&P_dummy..sas" ; * ;
/* */
ODS RTF CLOSE ;

29 Cox-Model SAS Macro Call

/* Dummy call without any effect / influence variables ("leer" = empty) */
%LET t_leer =no alc Vars ; * < ;
%LET q_leer = ; * < ;
%LET qqq =&q_leer ;
%LET ttt =&t_leer ;
%PH_std(&dset,&where_c,&keepvars,&agevar,,&cavar,0,,&strata,,,
        &qqq,&adjust, inf ins,
        %STR(inf = ( agexit < age + 1) ;
             ins = (age + 1 <= agexit < age + 2) ; ),
        outph,&est_cum,est,estph,0.05,&ttt )

30 Cox-Model SAS Macro Call

/* with effect variables */
%LET t14 =Alcoholic Beverages ; * < ;
%LET q14 =alcb_c1 alcb_c2 alcb_c3 alcb_c4 alcb_c5 ; * < ;
%LET qqq =&q14 ;
%LET ttt =&t14 ;
%PH_std(&dset,&where_c,&keepvars,&agevar,,&cavar,0,,&strata,,,
        &qqq,&adjust, inf ins,
        %STR(inf = ( agexit < age + 1) ;
             ins = (age + 1 <= agexit < age + 2) ; ),
        outph,&est_cum,est,estph,0.05,&ttt )

31 SAS Program: Final Summary Table

OPTIONS LINESIZE=220 PAGESIZE=130 ;
%LET ttext0=overview Hazard Ratios ;
%LET ttext2=&var_text - Reference: &reference ;
%LET ttext3=adjusting Variables: &adjust ;
/* Overview Table 1 */
%HR_table(&estvalues1,&est_g_strat,
          %STR("WEIGHT_C" "HEIGHT_C" "smkd_1" "smkd_2" "smkd_3" "smkd_4" "smkd_5"
               "giveup1" "giveup2" "giveup3" ),
          &ttext0: &project1,&ttext2,&ttext3) * ;

32 Output Summary Table
Heavy Drinkers only
Hazard Ratios for drinkers in classes
Reference: Non-drinkers (Class 0)
Strata: centres (Def. B)
Adjusted for: BMI>25, smokers, exsmokers

Variable / Class No. | Hazard Ratio | Lower 95%-CL | Upper 95%-CL | Effect / Risk | Statistically Meaningful

Men only
drinker c | | | | Decreasing |
drinker c | | | | Decreasing |
bmi_over | | | | Decreasing |
smoker | | | | Increasing |
exsmoker | | | | Decreasing |

Women only
drinker c | | | | Increasing |
drinker c | | | | Increasing |
bmi_over | | | | Increasing |
smoker | | | | Increasing | Yes!
exsmoker | | | | Increasing |

33 Output Summary Table (2)
Heavy Drinkers only
Hazard Ratios for drinkers in classes
Reference: Non-drinkers (Class 0)
Strata: centres (Def. B)
Adjusted for: BMI>25, smokers, exsmokers

Variable / Class No. | Hazard Ratio | Lower 95%-CL | Upper 95%-CL | Effect / Risk | Statistically Meaningful

Gender Strata
drinker c | | | | Decreasing |
drinker c | | | | Increasing |
bmi_over | | | | Decreasing |
smoker | | | | Increasing | Yes!
exsmoker | | | | Increasing |

34 Summary
With corresponding preparation and planning, SAS allows the realization of a data mining analysis for epidemiologic questions.
SAS components used (SAS Version 8):
- Data Step / Macro Statements
- STAT Procedures
- ODS
The Cox model examples presented here can easily be transferred to other STAT procedures such as logistic regression!
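To illustrate the last point, the Cox example from slide 19 might be recast as a logistic regression roughly as follows. This is a sketch under assumptions, not the author's program: it reuses the data set and variable names of that example, treats caselun as a 0/1 outcome, and replaces the stratification by a CLASS adjustment for country. Unlike the Cox model, it ignores the length of follow-up.

```sas
/* Sketch only: names reused from the Cox example purely for illustration */
PROC LOGISTIC DATA=puk.stan DESCENDING ;   /* DESCENDING: model caselun = 1 */
  CLASS country / PARAM=REF ;              /* country as reference-coded factor */
  MODEL caselun = qg07 qener sex country / RISKLIMITS ;  /* odds ratios with Wald limits */
RUN ;
```

In a macro-driven setup like the one shown in this deck, the effect variables, adjustment variables and data set name would be supplied through the same macro variables as for PROC PHREG.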

36 Any questions or comments? Thank you very much for your attention!


Dealing with Data Gradients: Backing Out & Calibration Dealing with Data Gradients: Backing Out & Calibration Nathaniel Osgood MIT 15.879 April 25, 2012 ABM Modeling Process Overview A Key Deliverable! ODD: Overview & high-level design components ODD: Design

More information

Correctly Compute Complex Samples Statistics

Correctly Compute Complex Samples Statistics SPSS Complex Samples 15.0 Specifications Correctly Compute Complex Samples Statistics When you conduct sample surveys, use a statistics package dedicated to producing correct estimates for complex sample

More information

Review of PC-SAS Batch Programming

Review of PC-SAS Batch Programming Ronald J. Fehd September 21, 2007 Abstract This paper presents an overview of issues of usage of PC-SAS R in a project directory. Topics covered include directory structures, how to start SAS in a particular

More information

ODS DOCUMENT, a practical example. Ruurd Bennink, OCS Consulting B.V., s-hertogenbosch, the Netherlands

ODS DOCUMENT, a practical example. Ruurd Bennink, OCS Consulting B.V., s-hertogenbosch, the Netherlands Paper CC01 ODS DOCUMENT, a practical example Ruurd Bennink, OCS Consulting B.V., s-hertogenbosch, the Netherlands ABSTRACT The ODS DOCUMENT destination (in short ODS DOCUMENT) is perhaps the most underutilized

More information

3. Almost always use system options options compress =yes nocenter; /* mostly use */ options ps=9999 ls=200;

3. Almost always use system options options compress =yes nocenter; /* mostly use */ options ps=9999 ls=200; Randy s SAS hints, updated Feb 6, 2014 1. Always begin your programs with internal documentation. * ***************** * Program =test1, Randy Ellis, first version: March 8, 2013 ***************; 2. Don

More information

Chapter 13 Multivariate Techniques. Chapter Table of Contents

Chapter 13 Multivariate Techniques. Chapter Table of Contents Chapter 13 Multivariate Techniques Chapter Table of Contents Introduction...279 Principal Components Analysis...280 Canonical Correlation...289 References...298 278 Chapter 13. Multivariate Techniques

More information

PharmaSUG China 2018 Paper AD-62

PharmaSUG China 2018 Paper AD-62 PharmaSUG China 2018 Paper AD-62 Decomposition and Reconstruction of TLF Shells - A Simple, Fast and Accurate Shell Designer Chengeng Tian, dmed Biopharmaceutical Co., Ltd., Shanghai, China ABSTRACT Table/graph

More information

Unit I Supplement OpenIntro Statistics 3rd ed., Ch. 1

Unit I Supplement OpenIntro Statistics 3rd ed., Ch. 1 Unit I Supplement OpenIntro Statistics 3rd ed., Ch. 1 KEY SKILLS: Organize a data set into a frequency distribution. Construct a histogram to summarize a data set. Compute the percentile for a particular

More information

A Cross-national Comparison Using Stacked Data

A Cross-national Comparison Using Stacked Data A Cross-national Comparison Using Stacked Data Goal In this exercise, we combine household- and person-level files across countries to run a regression estimating the usual hours of the working-aged civilian

More information

Teaching students quantitative methods using resources from the British Birth Cohorts

Teaching students quantitative methods using resources from the British Birth Cohorts Centre for Longitudinal Studies, Institute of Education Teaching students quantitative methods using resources from the British Birth Cohorts Assessment of Cognitive Development through Childhood CognitiveExercises.doc:

More information

Once the data warehouse is assembled, its customers will likely

Once the data warehouse is assembled, its customers will likely Clinical Data Warehouse Development with Base SAS Software and Common Desktop Tools Patricia L. Gerend, Genentech, Inc., South San Francisco, California ABSTRACT By focusing on the information needed by

More information

Package samplesizecmh

Package samplesizecmh Package samplesizecmh Title Power and Sample Size Calculation for the Cochran-Mantel-Haenszel Test Date 2017-12-13 Version 0.0.0 Copyright Spectrum Health, Grand Rapids, MI December 21, 2017 Calculates

More information

Creating PDF documents including hyper-links, bookmarks and a table of contents with the SAS software. Lex Jansen NV Organon Oss, The Netherlands

Creating PDF documents including hyper-links, bookmarks and a table of contents with the SAS software. Lex Jansen NV Organon Oss, The Netherlands Creating PDF documents including hyper-links, bookmarks and a table of contents with the SAS software. Lex Jansen NV Organon Oss, The Netherlands Contents Data flow Why PDF? Types of appendices From SAS

More information

ECLT 5810 SAS Programming - Introduction

ECLT 5810 SAS Programming - Introduction ECLT 5810 SAS Programming - Introduction Why SAS? Able to process data set(s). Easy to handle multiple variables. Generate useful basic analysis Summary statistics Graphs Many companies and government

More information

Types of Data Mining

Types of Data Mining Data Mining and The Use of SAS to Deploy Scoring Rules South Central SAS Users Group Conference Neil Fleming, Ph.D., ASQ CQE November 7-9, 2004 2W Systems Co., Inc. Neil.Fleming@2WSystems.com 972 733-0588

More information

SAS Graphs in Small Multiples Andrea Wainwright-Zimmerman, Capital One, Richmond, VA

SAS Graphs in Small Multiples Andrea Wainwright-Zimmerman, Capital One, Richmond, VA Paper SIB-113 SAS Graphs in Small Multiples Andrea Wainwright-Zimmerman, Capital One, Richmond, VA ABSTRACT Edward Tufte has championed the idea of using "small multiples" as an effective way to present

More information

SAS Training BASE SAS CONCEPTS BASE SAS:

SAS Training BASE SAS CONCEPTS BASE SAS: SAS Training BASE SAS CONCEPTS BASE SAS: Dataset concept and creating a dataset from internal data Capturing data from external files (txt, CSV and tab) Capturing Non-Standard data (date, time and amounts)

More information

Centers for Disease Control and Prevention National Center for Health Statistics

Centers for Disease Control and Prevention National Center for Health Statistics Wireless-Only and Wireless-Mostly Households: A growing challenge for telephone surveys Stephen Blumberg sblumberg@cdc.gov Julian Luke jluke@cdc.gov Centers for Disease Control and Prevention National

More information

Chapter 17: INTERNATIONAL DATA PRODUCTS

Chapter 17: INTERNATIONAL DATA PRODUCTS Chapter 17: INTERNATIONAL DATA PRODUCTS After the data processing and data analysis, a series of data products were delivered to the OECD. These included public use data files and codebooks, compendia

More information

Experimental epidemiology analyses with R and R commander. Lars T. Fadnes Centre for International Health University of Bergen

Experimental epidemiology analyses with R and R commander. Lars T. Fadnes Centre for International Health University of Bergen Experimental epidemiology analyses with R and R commander Lars T. Fadnes Centre for International Health University of Bergen 1 Click to add an outline 2 How to install R commander? - install.packages("rcmdr",

More information

Updates and Errata for Statistical Data Analytics (1st edition, 2015)

Updates and Errata for Statistical Data Analytics (1st edition, 2015) Updates and Errata for Statistical Data Analytics (1st edition, 2015) Walter W. Piegorsch University of Arizona c 2018 The author. All rights reserved, except where previous rights exist. CONTENTS Preface

More information

Statistics (STAT) Statistics (STAT) 1. Prerequisites: grade in C- or higher in STAT 1200 or STAT 1300 or STAT 1400

Statistics (STAT) Statistics (STAT) 1. Prerequisites: grade in C- or higher in STAT 1200 or STAT 1300 or STAT 1400 Statistics (STAT) 1 Statistics (STAT) STAT 1200: Introductory Statistical Reasoning Statistical concepts for critically evaluation quantitative information. Descriptive statistics, probability, estimation,

More information

Procedure for Stamping Source File Information on SAS Output Elizabeth Molloy & Breda O'Connor, ICON Clinical Research

Procedure for Stamping Source File Information on SAS Output Elizabeth Molloy & Breda O'Connor, ICON Clinical Research Procedure for Stamping Source File Information on SAS Output Elizabeth Molloy & Breda O'Connor, ICON Clinical Research ABSTRACT In the course of producing a report for a clinical trial numerous drafts

More information

SAS PROGRAMMING AND APPLICATIONS (STAT 5110/6110): FALL 2015 Module 2

SAS PROGRAMMING AND APPLICATIONS (STAT 5110/6110): FALL 2015 Module 2 SAS PROGRAMMING AND APPLICATIONS (STAT 5110/6110): FALL 2015 Department of MathemaGcs and StaGsGcs Phone: 4-3620 Office: Parker 364- A E- mail: carpedm@auburn.edu Web: hup://www.auburn.edu/~carpedm/stat6110

More information

Paper Abstract. Introduction. SAS Version 7/8 Web Tools. Using ODS to Create HTML Formatted Output. Background

Paper Abstract. Introduction. SAS Version 7/8 Web Tools. Using ODS to Create HTML Formatted Output. Background Paper 43-25 The International Studies Project : SAS Version 7/8 Web Tools To The Rescue Lilin She, UNC-CH, Department Of Biostatistics, Chapel Hill, NC Jeffrey M. Abolafia, UNC-CH, Department Of Biostatistics,

More information

EXST SAS Lab Lab #8: More data step and t-tests

EXST SAS Lab Lab #8: More data step and t-tests EXST SAS Lab Lab #8: More data step and t-tests Objectives 1. Input a text file in column input 2. Output two data files from a single input 3. Modify datasets with a KEEP statement or option 4. Prepare

More information

A SAS Macro for Covariate Specification in Linear, Logistic, or Survival Regression

A SAS Macro for Covariate Specification in Linear, Logistic, or Survival Regression Paper 1223-2017 A SAS Macro for Covariate Specification in Linear, Logistic, or Survival Regression Sai Liu and Margaret R. Stedman, Stanford University; ABSTRACT Specifying the functional form of a covariate

More information

Tables & Figures Abstracts ANSC 5307

Tables & Figures Abstracts ANSC 5307 Tables & Figures Abstracts ANSC 5307 Components of Tables & Figures 1. Stand alone Should not need text to explain what s in the table Should not repeat values from the tables or figures verbatim in the

More information

Correcting for natural time lag bias in non-participants in pre-post intervention evaluation studies

Correcting for natural time lag bias in non-participants in pre-post intervention evaluation studies Correcting for natural time lag bias in non-participants in pre-post intervention evaluation studies Gandhi R Bhattarai PhD, OptumInsight, Rocky Hill, CT ABSTRACT Measuring the change in outcomes between

More information

A Feasibility and Acceptability Study of the Provision

A Feasibility and Acceptability Study of the Provision A Feasibility and Acceptability Study of the Provision of mhealth Interventions for Behavior Change in Prehypertensive subjects in Argentina, Guatemala, and Peru. Med e Tel 2012 Beratarrechea A 1, Fernandez

More information

Concept Note. Scope and purpose of the First Meeting: Objectives Expected outcomes Suggested participants Logistics and registration Discussion papers

Concept Note. Scope and purpose of the First Meeting: Objectives Expected outcomes Suggested participants Logistics and registration Discussion papers First Meeting of UN Funds, Programmes and Agencies on the Implementation of the Political Declaration of the High-level Meeting of the General Assembly on the Prevention and Control of NCDs (New York,

More information

National Child Measurement Programme 2017/18. IT System User Guide part 5. Progress and Data Quality Monitoring.

National Child Measurement Programme 2017/18. IT System User Guide part 5. Progress and Data Quality Monitoring. National Child Measurement Programme 2017/18 IT System User Guide part 5 Progress and Data Quality Monitoring. Published September 2017 Version 4.0 Introduction 3 Who Should Read This Guidance? 3 How Will

More information

Module I: Clinical Trials a Practical Guide to Design, Analysis, and Reporting 1. Fundamentals of Trial Design

Module I: Clinical Trials a Practical Guide to Design, Analysis, and Reporting 1. Fundamentals of Trial Design Module I: Clinical Trials a Practical Guide to Design, Analysis, and Reporting 1. Fundamentals of Trial Design Randomized the Clinical Trails About the Uncontrolled Trails The protocol Development The

More information

Questionnaire 3. (only to be filled out when submitting blood and stool sample) This box will be filled out by the practice team

Questionnaire 3. (only to be filled out when submitting blood and stool sample) This box will be filled out by the practice team Questionnaire 3 (only to be filled out when submitting blood and stool sample) Date This box will be filled out by the practice team Patient-ID Barcode on labels Dear participant, We are pleased that you

More information