Two useful macros to nudge SAS to serve you


David Izrael, Michael P. Battaglia, Abt Associates Inc., Cambridge, MA

Abstract

This paper offers two macros that augment the power of two SAS procedures: LOGISTIC and UNIVARIATE.

PROC LOGISTIC calculates, among other statistics, several measures that reflect the predictive ability of a logistic regression model: the percentages of concordant, discordant, and tied pairs, and four rank correlation indexes (Somers' D, Gamma, Tau-a, and c). The procedure displays them in the Association of Predicted Probabilities and Observed Responses table. In the presence of survey weights, however, the procedure computes those measures ignoring the weights. Because survey weights are commonly used to analyze survey data, this makes it difficult for survey researchers to use PROC LOGISTIC to assess the predictive ability of a model. The first macro we offer takes the survey weights into account when computing the association measures and compares the unweighted measures with the ones calculated by the macro.

PROC UNIVARIATE provides five methods for computing quantile statistics. These may not be enough if a researcher wants to match SAS results with those from other statistical packages, or to use SAS to reproduce computations done in another package. For instance, S-PLUS computes quantiles using a different approach. Our second macro computes quantiles following the algorithm used in S-PLUS and compares its results with the respective quantiles produced by PROC UNIVARIATE.

Macro I: to Compute the Weighted Association of Predicted Probabilities and Observed Responses Table

Introduction

The Association of Predicted Probabilities and Observed Responses table lists several measures of association to help a researcher assess the quality of a logistic model. PROC LOGISTIC computes the percentages of concordant, discordant, and tied observations and the number of observation pairs upon which the percentages are based [1].

Suppose the response variable is set to 1 for an event and 0 for a non-event. Then, among all pairs of observations with different values of the response variable, a pair is concordant if the event observation has a higher predicted probability than the non-event observation; a pair is discordant if the event observation has a lower predicted probability than the non-event observation; and a pair is a tie if the two predicted probabilities are equal [2].

The four rank correlation indexes in the table are computed from the numbers of concordant and discordant pairs of observations by the following formulae:

$$\text{Somers' } D = \frac{n_c - n_d}{t} \tag{1}$$

$$\text{Gamma} = \frac{n_c - n_d}{n_c + n_d} \tag{2}$$

$$\text{Tau-}a = \frac{n_c - n_d}{0.5\,N(N-1)} \tag{3}$$

$$c = \frac{n_c + 0.5\,(t - n_c - n_d)}{t} \tag{4}$$

where
$N$ is the total number of observations in the input data set,
$t$ is the total number of pairs with different response values,
$n_c$ is the number of concordant pairs, and
$n_d$ is the number of discordant pairs [1].

In a relative sense, a model with higher values of these indexes has better predictive ability than a model with lower values [2].

It turns out, however, that in the presence of survey weights the LOGISTIC procedure does not work as expected with regard to the computation of the association measures. To test this, we fitted a model with just two predictors to data from an actual survey, once without and once with weights. To obtain more detail than the rounded results in the listing, we extracted the calculated measures using ODS:

    ods listing close;
    ods output Association=assocu;
    proc logistic descending data=analytic;
       class indep1 indep2;
       model response = indep1 indep2;
    run;

    ods listing;
    proc print data=assocu noobs;
       title3 "Unweighted Association Measures";
    run;

    ods listing close;
    ods output Association=assocw;
    proc logistic descending data=analytic;
       class indep1 indep2;
       model response = indep1 indep2;
       weight wgt / norm;
    run;
    ods listing;
    proc print data=assocw noobs;
       title3 "Weighted Association Measures";
    run;

The following output shows that the weighted and unweighted measures are completely identical, which casts doubt upon the procedure's ability to correctly compute the association measures in the presence of survey weights.

[Table: Unweighted Association Measures. Columns: Label1, cvalue1, nvalue1, Label2, Value2, nvalue2. Rows: Percent Concordant / Somers' D; Percent Discordant / Gamma; Percent Tied / Tau-a; Pairs / c.]

[Table: Weighted Association Measures. Columns: Label1, cvalue1, nvalue1, Label2, Value2, nvalue2. Rows: Percent Concordant / Somers' D; Percent Discordant / Gamma; Percent Tied / Tau-a; Pairs / c.]

Although a certain difference between the unweighted and weighted measures emerges as the number of predictors in the model increases, the official weighted measures are by no means what we would expect and could use for model assessment.

Macro WTAPPOR

We offer here the macro WTAPPOR, which does take survey weights into account. The macro uses the same formulae (1)-(4), but in a weighted form.

Let the number of event responses in a sample be E and the number of non-event responses be N. The total unweighted number of pairs being considered is E*N. Consider the ij-th pair of observations, and let the weight and the predicted probability be $w_i$ and $\hat{p}_i$ for the event observation (response is 1) and $w_j$ and $\hat{p}_j$ for the non-event observation (response is 0). If $\hat{p}_i$ is greater than $\hat{p}_j$, then, following the definition given in the introduction, the pair is concordant and its weighted representation $w_i w_j$ is added to the weighted total of concordant pairs. In the same vein, if $\hat{p}_i$ is lower than $\hat{p}_j$, the pair is discordant and $w_i w_j$ is added to the weighted total of discordant pairs. Finally, if the pair is neither concordant nor discordant, the product $w_i w_j$ is added to the weighted total of tied pairs.

Denoting by $W_E$ the total weighted number of event responses and by $W_N$ the total weighted number of non-event responses, the total weighted number of pairs is $W_E W_N$. Based upon the weighted totals accumulated after E*N iterations, the macro calculates the respective percentages and the correlation indexes by formulae (1)-(4). The macro reports the weighted measures immediately after the official Association of Predicted Probabilities and Observed Responses table.

Exhibit 1 shows the beginning and the end of the listing produced when the macro was run over the survey data set. The logistic model used in the example has 12 categorical independent variables expl1-expl12, the dependent variable effect (1,0), and the weight wgt. The macro is called by the following statement:

    %wtappor ( ds = survey, outds =, weight = wgt, model = expl1-expl12, depvar = effect );

As may be seen from Exhibit 1, there are measurable differences between the official measures and those calculated by the macro WTAPPOR. Note that the official weighted number of pairs is a product of unweighted (E*N) frequencies of event and non-event responses and 4260 respectively, whereas the weighted number of pairs calculated and used by the macro is a product of total normalized weights ($W_E W_N$) for event and non-event sets and respectively.

The macro itself is presented in Exhibit 2. It is well commented and easy to use.
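The weighted pair accumulation described above can also be sketched as a single Cartesian join, which may be easier to follow than a loop over non-event records. This is only an illustrative sketch, not the authors' macro: the data set `toy`, its variables `y`, `w`, and `p_hat`, and the output names are hypothetical, the weights are used as given rather than normalized, and the weighted Tau-a is omitted.

```sas
/* Hypothetical toy data (not from the paper):                  */
/* y = response (1 event, 0 non-event), w = weight,             */
/* p_hat = predicted probability of the event.                  */
data toy;
   input y w p_hat;
   datalines;
1 2.0 0.80
1 1.0 0.40
0 1.5 0.40
0 0.5 0.50
;
run;

/* Pair every event record with every non-event record and      */
/* accumulate the weight products w_i*w_j by pair type.         */
/* A comparison such as (a > b) evaluates to 1 or 0 in PROC SQL. */
proc sql;
   create table wt_assoc as
   select sum(ev.w*nv.w*(ev.p_hat > nv.p_hat)) as wt_concord,
          sum(ev.w*nv.w*(ev.p_hat < nv.p_hat)) as wt_discord,
          sum(ev.w*nv.w*(ev.p_hat = nv.p_hat)) as wt_tie,
          sum(ev.w*nv.w)                       as wt_pairs
   from toy(where=(y=1)) as ev,
        toy(where=(y=0)) as nv;
quit;

/* Weighted analogues of formulae (1), (2) and (4) */
data wt_assoc;
   set wt_assoc;
   wt_somers_d = (wt_concord - wt_discord) / wt_pairs;
   wt_gamma    = (wt_concord - wt_discord) / (wt_concord + wt_discord);
   wt_c        = (wt_concord + 0.5*(wt_pairs - wt_concord - wt_discord)) / wt_pairs;
run;

proc print data=wt_assoc noobs;
run;
```

Working the four pairs by hand gives a concordant total of 2.0*1.5 + 2.0*0.5 = 4.0, a discordant total of 1.0*0.5 = 0.5, and a tied total of 1.0*1.5 = 1.5, which sum to the weighted number of pairs 3.0*2.0 = 6.0. Rescaling all weights by a common factor leaves the percentages, Somers' D, Gamma, and c unchanged, since numerators and denominators scale together. The join creates all E*N pairs at once, so it trades memory for fewer passes over the data.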

Exhibit 1. Association of Predicted Probabilities and Observed Responses

    The LOGISTIC Procedure

    Model Information
    Data Set                     WORK.ANALYTIC
    Response Variable            effect      Positive Effect
    Number of Response Levels    2
    Number of Observations
    Weight Variable              wgt         Final Weight
    Sum of Weights
    Link Function                Logit
    Optimization Technique       Fisher's scoring

    Response Profile
    Ordered Value    effect    Total Frequency    Total Weight

    NOTE: Weights are normalized to the actual sample size

    Official table
    Association of Predicted Probabilities and Observed Responses
    Percent Concordant    64.8    Somers' D
    Percent Discordant    34.6    Gamma
    Percent Tied           0.6    Tau-a
    Pairs                         c

    Table calculated by the macro
    Association of Predicted Probabilities and Observed Responses
    using normalized weight WGT
    Weighted Percent Concordant    66.6    Weighted Somers' D
    Weighted Percent Discordant    33.3    Weighted Gamma
    Weighted Percent Tied           0.1    Weighted Tau-a
    Weighted Pairs                         Weighted c

Exhibit 2. Macro WTAPPOR

    %macro WTAPPOR (
       ds     =,  /* INPUT DATA SET */
       outds  =,  /* OUTPUT DATA SET WITH MEASURES; IF BLANK, JUST DISPLAYING RESULT */
       weight =,  /* SURVEY WEIGHT */
       model  =,  /* STRING WITH EXPLANATORY VAR's. ALL MUST BE CATEGORICAL */
       depvar =   /* DEPENDENT VARIABLE, 1-EVENT, 0-NON-EVENT */
       );

    /*** FIT DATA BY LOGISTIC MODEL TO GET PREDICTED PROBABILITIES ***/
    proc logistic descending data=&ds;
       weight &weight. / norm;   /* USING NORMALIZED WEIGHT */
       class &model;
       model &depvar = &model;
       output out=_probs(keep=&depvar &weight _p_hat) predicted=_p_hat;
    run;

    proc sql noprint;
       /* TOTAL WEIGHTED NUMBER OF RECORDS */
       select sum(&weight) into :_tot_wgt from _probs;
       /* TOTAL UNWEIGHTED NUMBER OF RECORDS */
       select count(*) into :_tot_unw from _probs;
       /* TOTAL UNWEIGHTED NUMBER OF NON-EVENTS */
       select count(*) into :_tot_nev from _probs where &depvar=0;
    quit;

    proc summary data=_probs noprint nway;
       var &weight;
       output out=_out sum=_sumw0;
    run;

    /* NORMALIZE WEIGHT */
    /* KEEP= options keep the counters out of the non-event data set */
    data _probs1(keep=_p_hat &weight _concord _discord _tie
                 rename=(_p_hat=_p_hat1 &weight=_w1))   /* EVENT DATA SET */
         _probs0(keep=_p_hat &weight
                 rename=(_p_hat=_p_hat0 &weight=_w0));  /* NON-EVENT DATA SET */
       set _probs;
       if _n_=1 then set _out;
       &weight = &weight.*&_tot_unw./_sumw0;   /* normalization */
       _concord=0; _discord=0; _tie=0;         /* INITIALIZE MEASURES */
       if &depvar=1 then output _probs1;
       else output _probs0;
    run;

    /* WEIGHTED TOTAL OF EVENTS */
    proc summary data=_probs1 noprint nway;
       var _w1;
       output out=_total1 sum=_total1;
    run;

    /* WEIGHTED TOTAL OF NON-EVENTS */
    proc summary data=_probs0 noprint nway;
       var _w0;
       output out=_total0 sum=_total0;
    run;

    data _total;   /* DATA SET WITH WEIGHTED TOTAL */
       merge _total1 _total0;
       _total_p = _total1*_total0;
    run;

    %macro cummsr;
       %do _i=1 %to &_tot_nev;
          /* COMPARE EACH EVENT OBSERVATION WITH EACH NON-EVENT OBSERVATION */
          data _probs1;
             set _probs1;
             if _n_=1 then set _probs0(firstobs=&_i obs=&_i);
             if _p_hat1<_p_hat0 then _discord=_discord+_w0*_w1;        /* ACCRUE DISCORD */
             else if _p_hat1>_p_hat0 then _concord=_concord+_w0*_w1;   /* ACCRUE CONCORD */
             else _tie=_tie+_w0*_w1;                                   /* ACCRUE TIES */
             drop _p_hat0 _w0;
          run;
       %end;
    %mend cummsr;
    %cummsr;

    /* SUM DISCORDANCE, CONCORDANCE AND TIES THROUGH THE WHOLE DATA SET */
    proc summary data=_probs1 noprint nway;
       var _concord _discord _tie;
       output out=_out sum=_concord _discord _tie;
    run;

    /* CALCULATION OF PERCENTAGE AND MEASURES BY FORMULAE 1-4 */
    data &outds _out(keep=wgt_:);
       merge _out _total;
       Wgt_Percent_Concordant = round(_concord*100/_total_p,.01);
       Wgt_Percent_Discordant = round(_discord*100/_total_p,.01);
       Wgt_Percent_Tied       = round(_tie*100/_total_p,.01);
       Wgt_Pairs    = _total_p;
       Wgt_Somers_D = (_concord - _discord) / _total_p;
       Wgt_Gamma    = (_concord - _discord) / (_concord + _discord);
       Wgt_Tau_a    = (_concord - _discord) / (.5*&_tot_unw.*(&_tot_unw - 1));
       Wgt_c        = (_concord + .5*(_total_p - _concord - _discord)) / _total_p;
    run;

    /* DISPLAY RESULTS AFTER OFFICIAL TABLE */
    data _null_;
       set _out;
       file print ls=80 ps=59;
       put "Association of Predicted Probabilities and Observed Responses";
       put "using normalized weight &weight";
       put;
       put "Weighted Percent Concordant " Wgt_Percent_Concordant 5.2
           " Weighted Somers' D " Wgt_Somers_D 6.4;
       put "Weighted Percent Discordant " Wgt_Percent_Discordant 5.2
           " Weighted Gamma " Wgt_Gamma 6.4;
       put "Weighted Percent Tied " Wgt_Percent_Tied 5.2
           " Weighted Tau-a " Wgt_Tau_a 6.4;
       put "Weighted Pairs " Wgt_Pairs 10.
           " Weighted c " Wgt_c 6.4;
    run;
    %mend wtappor;

Summary
The presented macro, WTAPPOR, is a valuable instrument for a survey researcher who needs to assess the quality of a logistic model when survey weights are present. The macro gives appreciably different measures of association from those calculated by PROC LOGISTIC.

Macro II: Are five methods to compute quantiles enough? If not, get a sixth one.

Introduction

Using the PCTLDEF= option in PROC UNIVARIATE, one can specify one of five methods for computing quantile statistics. Following the definitions in [3], let $n$ be the number of nonmissing values for a variable and let $x_1, \ldots, x_n$ represent the ordered values of the variable. For the $t$th percentile, let $p = t/100$. For definitions 1, 2, 3, and 5 below, let $np = j + g$, where $j$ is the integer part and $g$ is the fractional part of $np$. For definition 4, let $(n+1)p = j + g$. Then the $t$th percentile, $y$, is defined as follows:

PCTLDEF = 1: weighted average at $x_{np}$. $y = (1 - g)\,x_j + g\,x_{j+1}$, where $x_0$ is taken to be $x_1$.

PCTLDEF = 2: observation numbered closest to $np$. $y = x_i$ if $g \neq \tfrac{1}{2}$, where $i$ is the integer part of $np + \tfrac{1}{2}$. If $g = \tfrac{1}{2}$, then $y = x_j$ if $j$ is even, or $y = x_{j+1}$ if $j$ is odd.

PCTLDEF = 3: empirical distribution function. $y = x_j$ if $g = 0$; $y = x_{j+1}$ if $g > 0$.

PCTLDEF = 4: weighted average aimed at $x_{p(n+1)}$. $y = (1 - g)\,x_j + g\,x_{j+1}$, where $x_{n+1}$ is taken to be $x_n$.

PCTLDEF = 5: empirical distribution function with averaging. $y = (x_j + x_{j+1})/2$ if $g = 0$; $y = x_{j+1}$ if $g > 0$.

Researchers often need to match results obtained by SAS with those given by another statistical package, or to reproduce with SAS computations done in another package. If quantiles are involved in those computations, matching may fail because the other package may compute quantiles differently. For example, S-PLUS uses the function quantile(x, p), which computes quantiles at specified probabilities by linear interpolation, using the formula

$$\text{quantile}(x, p) = \bigl[1 - \bigl(p(n-1) - \lfloor p(n-1) \rfloor\bigr)\bigr]\, x_{1 + \lfloor p(n-1) \rfloor} + \bigl[p(n-1) - \lfloor p(n-1) \rfloor\bigr]\, x_{2 + \lfloor p(n-1) \rfloor} \tag{5}$$

where $x_1, \ldots, x_n$ is the ordered sample, $p$ is the specified probability, and $\lfloor \cdot \rfloor$ denotes the floor, or integer part [4]. The result of quantile(x, p) will generally not be identical to that of any of the five methods described above. Below, we present the macro QUANT6SP, which computes S-PLUS-like quantiles by formula (5), and compare its results with those obtained by the five methods of PROC UNIVARIATE.

Macro QUANT6SP

The macro presented below is richly supplied with comments and is easy to use.
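Formula (5) can be checked by hand on a small sample before reading the macro listing. The DATA step below is a minimal sketch under assumed inputs: the four-value sample (1, 2, 4, 8) and the names `q_demo`, `h`, `k`, `g`, `lo`, and `hi` are hypothetical and do not come from the paper.

```sas
/* Hypothetical ordered sample x_1..x_4 = 1, 2, 4, 8 (illustration only) */
data q_demo;
   array x[4] _temporary_ (1 2 4 8);
   n = dim(x);
   do p = 0, 0.25, 0.5, 0.75, 1;
      h  = p*(n - 1);      /* p(n-1) in formula (5)          */
      k  = floor(h);       /* integer part of p(n-1)         */
      g  = h - k;          /* fractional part of p(n-1)      */
      lo = x[1 + k];       /* x with index 1 + floor(p(n-1)) */
      /* At p = 1 the upper neighbour does not exist; its coefficient g is 0 */
      if k + 2 <= n then hi = x[2 + k];
      else hi = lo;
      q = (1 - g)*lo + g*hi;
      output;
   end;
   keep p q;
run;

proc print data=q_demo noobs;
run;
```

For p = 0.25, p(n-1) = 0.75, so the quantile is 0.25*1 + 0.75*2 = 1.75; the remaining rows are 1, 3, 5, and 8 for p = 0, 0.5, 0.75, and 1. By contrast, the PCTLDEF = 5 definition given above has np = 1 with g = 0 for the 25th percentile and yields (1 + 2)/2 = 1.5, which illustrates why the two approaches need not agree.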
    %macro quant6sp (
       inds  =,  /* input data set with variable of interest */
       var   =,  /* variable upon which to compute quantiles */
       ncell =,  /* number of cells whose boundaries are to be
                    determined by quantiles; 4 for quartiles */
       prfx  =,  /* prefix we want for quantile variables */
       outds =   /* data set with quantiles */
       );

    %let _step = %sysevalf(1/&ncell);   /* step between quantile boundaries */

    data _temp;
       %macro stq;   /* create string with boundaries of quantiles */
          f = "0 "
          %do i=1 %to %eval(&ncell-1);
             || "%sysevalf(&i*&_step) "
          %end;
             || "1";
       %mend;
       %stq;
    run;

    data _null_;
       set _temp;
       call symput('pctl', left(f));   /* create macro variable as string
                                          with boundaries of quantiles */
    run;
    %put BOUNDARIES OF QUANTILES: &pctl;

    proc sort data=&inds (keep=&var) out=_i;
       by &var;   /* order variable ascending */
    run;

    data _null_;   /* number of records, that is, values of variable */
       set _i end=fin;
       retain _n;
       _n+1;
       if fin then call symput('totn', left(_n));
    run;

    %do l=1 %to %eval(&ncell+1);
       %let p&l = %scan(&pctl, &l, %str( ));   /* retrieve boundary and put it
                                                  into respective macro var */
    %end;

    data &outds (keep=&prfx.:);
       set _i end=_fin;
       retain
          %do j=1 %to %eval(&ncell+1);
             _less&j _greater&j
          %end;
          0;   /* retrieve values of variable for formula (5) components */
       %do j=1 %to %eval(&ncell+1);
          /* accumulate components of formula (5) for all quantiles */
          if _n_ = 1+floor(%sysevalf(&&p&j*(&totn-1))) then _less&j = &var;
          if _n_ = 2+floor(%sysevalf(&&p&j*(&totn-1))) then _greater&j = &var;
       %end;
       if _fin then do;
          /* compute formula (5) for all boundaries
             using one pass through the data set */
          %do j=1 %to %eval(&ncell+1);
             &prfx&j = (1-(%sysevalf(&&p&j*(&totn-1))
                           - floor(%sysevalf(&&p&j*(&totn-1)))))*_less&j
                     + (%sysevalf(&&p&j*(&totn-1))
                           - floor(%sysevalf(&&p&j*(&totn-1))))*_greater&j;
          %end;
          output;
       end;
    run;

    proc print;
    run;
    %mend;

Results

Here we present an example of a call to the macro QUANT6SP that breaks down predicted probabilities by quartiles:

    %quant6sp (inds = probs, var = probabs, ncell = 4, outds = out, prfx = method6);

The computed quartiles are shown below:

[Table: quartiles computed by QUANT6SP. Columns: 0%, 25%, 50%, 75%, 100%.]

Applying each of the five methods of PROC UNIVARIATE to the same variable probabs, we obtain the following table:

[Table: quartiles by PROC UNIVARIATE method. Columns: PCTLDEF, 0%, 25%, 50%, 75%, 100%.]

As is shown, none of the five sets of quartiles above is identical to the results obtained using the macro QUANT6SP.

References

1. SAS Institute Inc. (1999). SAS/STAT, Version 8, Chapter 39. Cary, NC: SAS Institute Inc.
2. Logistic Regression Examples Using the SAS System. SAS Institute Inc.
3. SAS Institute Inc. (1999). SAS/BASE. SAS Procedures Guide. PROC UNIVARIATE.
4. Venables, W.N., Ripley, B.D. (2000). Modern Applied Statistics with S-PLUS. Springer-Verlag, New York.

Contact Information

David Izrael
Abt Associates Inc.
Cambridge, MA
tel: (617)
david_izrael@abtassoc.com

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.


More information

How to Keep Multiple Formats in One Variable after Transpose Mindy Wang

How to Keep Multiple Formats in One Variable after Transpose Mindy Wang How to Keep Multiple Formats in One Variable after Transpose Mindy Wang Abstract In clinical trials and many other research fields, proc transpose are used very often. When many variables with their individual

More information

So Much Data, So Little Time: Splitting Datasets For More Efficient Run Times and Meeting FDA Submission Guidelines

So Much Data, So Little Time: Splitting Datasets For More Efficient Run Times and Meeting FDA Submission Guidelines Paper TT13 So Much Data, So Little Time: Splitting Datasets For More Efficient Run Times and Meeting FDA Submission Guidelines Anthony Harris, PPD, Wilmington, NC Robby Diseker, PPD, Wilmington, NC ABSTRACT

More information

The Power of PROC SQL Techniques and SAS Dictionary Tables in Handling Data

The Power of PROC SQL Techniques and SAS Dictionary Tables in Handling Data Paper PO31 The Power of PROC SQL Techniques and SAS Dictionary Tables in Handling Data MaryAnne DePesquo Hope, Health Services Advisory Group, Phoenix, Arizona Fen Fen Li, Health Services Advisory Group,

More information

Data Quality Review for Missing Values and Outliers

Data Quality Review for Missing Values and Outliers Paper number: PH03 Data Quality Review for Missing Values and Outliers Ying Guo, i3, Indianapolis, IN Bradford J. Danner, i3, Lincoln, NE ABSTRACT Before performing any analysis on a dataset, it is often

More information

A Format to Make the _TYPE_ Field of PROC MEANS Easier to Interpret Matt Pettis, Thomson West, Eagan, MN

A Format to Make the _TYPE_ Field of PROC MEANS Easier to Interpret Matt Pettis, Thomson West, Eagan, MN Paper 045-29 A Format to Make the _TYPE_ Field of PROC MEANS Easier to Interpret Matt Pettis, Thomson West, Eagan, MN ABSTRACT: PROC MEANS analyzes datasets according to the variables listed in its Class

More information

The NESTED Procedure (Chapter)

The NESTED Procedure (Chapter) SAS/STAT 9.3 User s Guide The NESTED Procedure (Chapter) SAS Documentation This document is an individual chapter from SAS/STAT 9.3 User s Guide. The correct bibliographic citation for the complete manual

More information

186 Statistics, Data Analysis and Modeling. Proceedings of MWSUG '95

186 Statistics, Data Analysis and Modeling. Proceedings of MWSUG '95 A Statistical Analysis Macro Library in SAS Carl R. Haske, Ph.D., STATPROBE, nc., Ann Arbor, M Vivienne Ward, M.S., STATPROBE, nc., Ann Arbor, M ABSTRACT Statistical analysis plays a major role in pharmaceutical

More information

Automating Preliminary Data Cleaning in SAS

Automating Preliminary Data Cleaning in SAS Paper PO63 Automating Preliminary Data Cleaning in SAS Alec Zhixiao Lin, Loan Depot, Foothill Ranch, CA ABSTRACT Preliminary data cleaning or scrubbing tries to delete the following types of variables

More information

Missing Pages Report. David Gray, PPD, Austin, TX Zhuo Chen, PPD, Austin, TX

Missing Pages Report. David Gray, PPD, Austin, TX Zhuo Chen, PPD, Austin, TX PharmaSUG2010 - Paper DM05 Missing Pages Report David Gray, PPD, Austin, TX Zhuo Chen, PPD, Austin, TX ABSTRACT In a clinical study it is important for data management teams to receive CRF pages from investigative

More information

SAS Programming Techniques for Manipulating Metadata on the Database Level Chris Speck, PAREXEL International, Durham, NC

SAS Programming Techniques for Manipulating Metadata on the Database Level Chris Speck, PAREXEL International, Durham, NC PharmaSUG2010 - Paper TT06 SAS Programming Techniques for Manipulating Metadata on the Database Level Chris Speck, PAREXEL International, Durham, NC ABSTRACT One great leap that beginning and intermediate

More information

/********************************************/ /* Evaluating the PS distribution!!! */ /********************************************/

/********************************************/ /* Evaluating the PS distribution!!! */ /********************************************/ SUPPLEMENTAL MATERIAL: Example SAS code /* This code demonstrates estimating a propensity score, calculating weights, */ /* evaluating the distribution of the propensity score by treatment group, and */

More information

Statistics and Data Analysis. Common Pitfalls in SAS Statistical Analysis Macros in a Mass Production Environment

Statistics and Data Analysis. Common Pitfalls in SAS Statistical Analysis Macros in a Mass Production Environment Common Pitfalls in SAS Statistical Analysis Macros in a Mass Production Environment Huei-Ling Chen, Merck & Co., Inc., Rahway, NJ Aiming Yang, Merck & Co., Inc., Rahway, NJ ABSTRACT Four pitfalls are commonly

More information

Tales from the Help Desk 6: Solutions to Common SAS Tasks

Tales from the Help Desk 6: Solutions to Common SAS Tasks SESUG 2015 ABSTRACT Paper BB-72 Tales from the Help Desk 6: Solutions to Common SAS Tasks Bruce Gilsen, Federal Reserve Board, Washington, DC In 30 years as a SAS consultant at the Federal Reserve Board,

More information

SAS/STAT 15.1 User s Guide The STDIZE Procedure

SAS/STAT 15.1 User s Guide The STDIZE Procedure SAS/STAT 15.1 User s Guide The STDIZE Procedure This document is an individual chapter from SAS/STAT 15.1 User s Guide. The correct bibliographic citation for this manual is as follows: SAS Institute Inc.

More information

%MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System

%MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System %MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System Rushi Patel, Creative Information Technology, Inc., Arlington, VA ABSTRACT It is common to find

More information

- 1 - Fig. A5.1 Missing value analysis dialog box

- 1 - Fig. A5.1 Missing value analysis dialog box WEB APPENDIX Sarstedt, M. & Mooi, E. (2019). A concise guide to market research. The process, data, and methods using SPSS (3 rd ed.). Heidelberg: Springer. Missing Value Analysis and Multiple Imputation

More information

2 = Disagree 3 = Neutral 4 = Agree 5 = Strongly Agree. Disagree

2 = Disagree 3 = Neutral 4 = Agree 5 = Strongly Agree. Disagree PharmaSUG 2012 - Paper HO01 Multiple Techniques for Scoring Quality of Life Questionnaires Brandon Welch, Rho, Inc., Chapel Hill, NC Seungshin Rhee, Rho, Inc., Chapel Hill, NC ABSTRACT In the clinical

More information

Validation Summary using SYSINFO

Validation Summary using SYSINFO Validation Summary using SYSINFO Srinivas Vanam Mahipal Vanam Shravani Vanam Percept Pharma Services, Bridgewater, NJ ABSTRACT This paper presents a macro that produces a Validation Summary using SYSINFO

More information

footnote1 height=8pt j=l "(Rev. &sysdate)" j=c "{\b\ Page}{\field{\*\fldinst {\b\i PAGE}}}";

footnote1 height=8pt j=l (Rev. &sysdate) j=c {\b\ Page}{\field{\*\fldinst {\b\i PAGE}}}; Producing an Automated Data Dictionary as an RTF File (or a Topic to Bring Up at a Party If You Want to Be Left Alone) Cyndi Williamson, SRI International, Menlo Park, CA ABSTRACT Data dictionaries are

More information

Tasks Menu Reference. Introduction. Data Management APPENDIX 1

Tasks Menu Reference. Introduction. Data Management APPENDIX 1 229 APPENDIX 1 Tasks Menu Reference Introduction 229 Data Management 229 Report Writing 231 High Resolution Graphics 232 Low Resolution Graphics 233 Data Analysis 233 Planning Tools 235 EIS 236 Remote

More information

SAS Example A10. Output Delivery System (ODS) Sample Data Set sales.txt. Examples of currently available ODS destinations: Mervyn Marasinghe

SAS Example A10. Output Delivery System (ODS) Sample Data Set sales.txt. Examples of currently available ODS destinations: Mervyn Marasinghe SAS Example A10 data sales infile U:\Documents\...\sales.txt input Region : $8. State $2. +1 Month monyy5. Headcnt Revenue Expenses format Month monyy5. Revenue dollar12.2 proc sort by Region State Month

More information

Analysis of Complex Survey Data with SAS

Analysis of Complex Survey Data with SAS ABSTRACT Analysis of Complex Survey Data with SAS Christine R. Wells, Ph.D., UCLA, Los Angeles, CA The differences between data collected via a complex sampling design and data collected via other methods

More information

Submitting SAS Code On The Side

Submitting SAS Code On The Side ABSTRACT PharmaSUG 2013 - Paper AD24-SAS Submitting SAS Code On The Side Rick Langston, SAS Institute Inc., Cary NC This paper explains the new DOSUBL function and how it can submit SAS code to run "on

More information

Macros to Report Missing Data: An HTML Data Collection Guide Patrick Thornton, University of California San Francisco, SF, California

Macros to Report Missing Data: An HTML Data Collection Guide Patrick Thornton, University of California San Francisco, SF, California Macros to Report Missing Data: An HTML Data Collection Guide Patrick Thornton, University of California San Francisco, SF, California ABSTRACT This paper presents SAS macro programs that calculate missing

More information

Quick and Efficient Way to Check the Transferred Data Divyaja Padamati, Eliassen Group Inc., North Carolina.

Quick and Efficient Way to Check the Transferred Data Divyaja Padamati, Eliassen Group Inc., North Carolina. ABSTRACT PharmaSUG 2016 - Paper QT03 Quick and Efficient Way to Check the Transferred Data Divyaja Padamati, Eliassen Group Inc., North Carolina. Consistency, quality and timelines are the three milestones

More information

Stat 5100 Handout #19 SAS: Influential Observations and Outliers

Stat 5100 Handout #19 SAS: Influential Observations and Outliers Stat 5100 Handout #19 SAS: Influential Observations and Outliers Example: Data collected on 50 countries relevant to a cross-sectional study of a lifecycle savings hypothesis, which states that the response

More information

Contents of SAS Programming Techniques

Contents of SAS Programming Techniques Contents of SAS Programming Techniques Chapter 1 About SAS 1.1 Introduction 1.1.1 SAS modules 1.1.2 SAS module classification 1.1.3 SAS features 1.1.4 Three levels of SAS techniques 1.1.5 Chapter goal

More information

TRANSFORMING MULTIPLE-RECORD DATA INTO SINGLE-RECORD FORMAT WHEN NUMBER OF VARIABLES IS LARGE.

TRANSFORMING MULTIPLE-RECORD DATA INTO SINGLE-RECORD FORMAT WHEN NUMBER OF VARIABLES IS LARGE. TRANSFORMING MULTIPLE-RECORD DATA INTO SINGLE-RECORD FORMAT WHEN NUMBER OF VARIABLES IS LARGE. David Izrael, Abt Associates Inc., Cambridge, MA David Russo, Independent Consultant ABSTRACT In one large

More information

Indenting with Style

Indenting with Style ABSTRACT Indenting with Style Bill Coar, Axio Research, Seattle, WA Within the pharmaceutical industry, many SAS programmers rely heavily on Proc Report. While it is used extensively for summary tables

More information

STEP 1 - /*******************************/ /* Manipulate the data files */ /*******************************/ <<SAS DATA statements>>

STEP 1 - /*******************************/ /* Manipulate the data files */ /*******************************/ <<SAS DATA statements>> Generalized Report Programming Techniques Using Data-Driven SAS Code Kathy Hardis Fraeman, A.K. Analytic Programming, L.L.C., Olney, MD Karen G. Malley, Malley Research Programming, Inc., Rockville, MD

More information

SAS/STAT 14.3 User s Guide The SURVEYFREQ Procedure

SAS/STAT 14.3 User s Guide The SURVEYFREQ Procedure SAS/STAT 14.3 User s Guide The SURVEYFREQ Procedure This document is an individual chapter from SAS/STAT 14.3 User s Guide. The correct bibliographic citation for this manual is as follows: SAS Institute

More information

Correcting for natural time lag bias in non-participants in pre-post intervention evaluation studies

Correcting for natural time lag bias in non-participants in pre-post intervention evaluation studies Correcting for natural time lag bias in non-participants in pre-post intervention evaluation studies Gandhi R Bhattarai PhD, OptumInsight, Rocky Hill, CT ABSTRACT Measuring the change in outcomes between

More information

PROC MEANS for Disaggregating Statistics in SAS : One Input Data Set and One Output Data Set with Everything You Need

PROC MEANS for Disaggregating Statistics in SAS : One Input Data Set and One Output Data Set with Everything You Need ABSTRACT Paper PO 133 PROC MEANS for Disaggregating Statistics in SAS : One Input Data Set and One Output Data Set with Everything You Need Imelda C. Go, South Carolina Department of Education, Columbia,

More information

Paper SDA-11. Logistic regression will be used for estimation of net error for the 2010 Census as outlined in Griffin (2005).

Paper SDA-11. Logistic regression will be used for estimation of net error for the 2010 Census as outlined in Griffin (2005). Paper SDA-11 Developing a Model for Person Estimation in Puerto Rico for the 2010 Census Coverage Measurement Program Colt S. Viehdorfer, U.S. Census Bureau, Washington, DC This report is released to inform

More information

Robust Linear Regression (Passing- Bablok Median-Slope)

Robust Linear Regression (Passing- Bablok Median-Slope) Chapter 314 Robust Linear Regression (Passing- Bablok Median-Slope) Introduction This procedure performs robust linear regression estimation using the Passing-Bablok (1988) median-slope algorithm. Their

More information

Essential ODS Techniques for Creating Reports in PDF Patrick Thornton, SRI International, Menlo Park, CA

Essential ODS Techniques for Creating Reports in PDF Patrick Thornton, SRI International, Menlo Park, CA Thornton, S. P. (2006). Essential ODS techniques for creating reports in PDF. Paper presented at the Fourteenth Annual Western Users of the SAS Software Conference, Irvine, CA. Essential ODS Techniques

More information

Spatial Patterns Point Pattern Analysis Geographic Patterns in Areal Data

Spatial Patterns Point Pattern Analysis Geographic Patterns in Areal Data Spatial Patterns We will examine methods that are used to analyze patterns in two sorts of spatial data: Point Pattern Analysis - These methods concern themselves with the location information associated

More information

An Easy Route to a Missing Data Report with ODS+PROC FREQ+A Data Step Mike Zdeb, FSL, University at Albany School of Public Health, Rensselaer, NY

An Easy Route to a Missing Data Report with ODS+PROC FREQ+A Data Step Mike Zdeb, FSL, University at Albany School of Public Health, Rensselaer, NY SESUG 2016 Paper BB-170 An Easy Route to a Missing Data Report with ODS+PROC FREQ+A Data Step Mike Zdeb, FSL, University at Albany School of Public Health, Rensselaer, NY ABSTRACT A first step in analyzing

More information

The G4GRID Procedure. Introduction APPENDIX 1

The G4GRID Procedure. Introduction APPENDIX 1 93 APPENDIX 1 The G4GRID Procedure Introduction 93 Data Considerations 94 Terminology 94 Using the Graphical Interface 94 Procedure Syntax 95 The PROC G4GRID Statement 95 The GRID Statement 97 The BY Statement

More information

Mapping Clinical Data to a Standard Structure: A Table Driven Approach

Mapping Clinical Data to a Standard Structure: A Table Driven Approach ABSTRACT Paper AD15 Mapping Clinical Data to a Standard Structure: A Table Driven Approach Nancy Brucken, i3 Statprobe, Ann Arbor, MI Paul Slagle, i3 Statprobe, Ann Arbor, MI Clinical Research Organizations

More information

Right-click on whatever it is you are trying to change Get help about the screen you are on Help Help Get help interpreting a table

Right-click on whatever it is you are trying to change Get help about the screen you are on Help Help Get help interpreting a table Q Cheat Sheets What to do when you cannot figure out how to use Q What to do when the data looks wrong Right-click on whatever it is you are trying to change Get help about the screen you are on Help Help

More information

WEB MATERIAL. eappendix 1: SAS code for simulation

WEB MATERIAL. eappendix 1: SAS code for simulation WEB MATERIAL eappendix 1: SAS code for simulation /* Create datasets with variable # of groups & variable # of individuals in a group */ %MACRO create_simulated_dataset(ngroups=, groupsize=); data simulation_parms;

More information

SAS Enterprise Miner : Tutorials and Examples

SAS Enterprise Miner : Tutorials and Examples SAS Enterprise Miner : Tutorials and Examples SAS Documentation February 13, 2018 The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2017. SAS Enterprise Miner : Tutorials

More information

Choosing the Right Procedure

Choosing the Right Procedure 3 CHAPTER 1 Choosing the Right Procedure Functional Categories of Base SAS Procedures 3 Report Writing 3 Statistics 3 Utilities 4 Report-Writing Procedures 4 Statistical Procedures 5 Efficiency Issues

More information

Generating Customized Analytical Reports from SAS Procedure Output Brinda Bhaskar and Kennan Murray, RTI International

Generating Customized Analytical Reports from SAS Procedure Output Brinda Bhaskar and Kennan Murray, RTI International Abstract Generating Customized Analytical Reports from SAS Procedure Output Brinda Bhaskar and Kennan Murray, RTI International SAS has many powerful features, including MACRO facilities, procedures such

More information

T.I.P.S. (Techniques and Information for Programming in SAS )

T.I.P.S. (Techniques and Information for Programming in SAS ) Paper PO-088 T.I.P.S. (Techniques and Information for Programming in SAS ) Kathy Harkins, Carolyn Maass, Mary Anne Rutkowski Merck Research Laboratories, Upper Gwynedd, PA ABSTRACT: This paper provides

More information

Effects of PROC EXPAND Data Interpolation on Time Series Modeling When the Data are Volatile or Complex

Effects of PROC EXPAND Data Interpolation on Time Series Modeling When the Data are Volatile or Complex Effects of PROC EXPAND Data Interpolation on Time Series Modeling When the Data are Volatile or Complex Keiko I. Powers, Ph.D., J. D. Power and Associates, Westlake Village, CA ABSTRACT Discrete time series

More information

TRANSFORMING MULTIPLE-RECORD DATA INTO SINGLE-RECORD FORMAT WHEN NUMBER OF VARIABLES IS LARGE.

TRANSFORMING MULTIPLE-RECORD DATA INTO SINGLE-RECORD FORMAT WHEN NUMBER OF VARIABLES IS LARGE. TRANSFORMING MULTIPLE-RECORD DATA INTO SINGLE-RECORD FORMAT WHEN NUMBER OF VARIABLES IS LARGE. David Izrael, Abt Associates Inc., Cambridge, MA David Russo, Independent Consultant ABSTRACT In one large

More information

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski Data Analysis and Solver Plugins for KSpread USER S MANUAL Tomasz Maliszewski tmaliszewski@wp.pl Table of Content CHAPTER 1: INTRODUCTION... 3 1.1. ABOUT DATA ANALYSIS PLUGIN... 3 1.3. ABOUT SOLVER PLUGIN...

More information

PROC REPORT Basics: Getting Started with the Primary Statements

PROC REPORT Basics: Getting Started with the Primary Statements Paper HOW07 PROC REPORT Basics: Getting Started with the Primary Statements Arthur L. Carpenter California Occidental Consultants, Oceanside, California ABSTRACT The presentation of data is an essential

More information

PharmaSUG China Paper 059

PharmaSUG China Paper 059 PharmaSUG China 2016 - Paper 059 Using SAS @ to Assemble Output Report Files into One PDF File with Bookmarks Sam Wang, Merrimack Pharmaceuticals, Inc., Cambridge, MA Kaniz Khalifa, Leaf Research Services,

More information

An Algorithm to Compute Exact Power of an Unordered RxC Contingency Table

An Algorithm to Compute Exact Power of an Unordered RxC Contingency Table NESUG 27 An Algorithm to Compute Eact Power of an Unordered RC Contingency Table Vivek Pradhan, Cytel Inc., Cambridge, MA Stian Lydersen, Department of Cancer Research and Molecular Medicine, Norwegian

More information

A Simple Framework for Sequentially Processing Hierarchical Data Sets for Large Surveys

A Simple Framework for Sequentially Processing Hierarchical Data Sets for Large Surveys A Simple Framework for Sequentially Processing Hierarchical Data Sets for Large Surveys Richard L. Downs, Jr. and Pura A. Peréz U.S. Bureau of the Census, Washington, D.C. ABSTRACT This paper explains

More information

Cleaning Duplicate Observations on a Chessboard of Missing Values Mayrita Vitvitska, ClinOps, LLC, San Francisco, CA

Cleaning Duplicate Observations on a Chessboard of Missing Values Mayrita Vitvitska, ClinOps, LLC, San Francisco, CA Cleaning Duplicate Observations on a Chessboard of Missing Values Mayrita Vitvitska, ClinOps, LLC, San Francisco, CA ABSTRACT Removing duplicate observations from a data set is not as easy as it might

More information

Base and Advance SAS

Base and Advance SAS Base and Advance SAS BASE SAS INTRODUCTION An Overview of the SAS System SAS Tasks Output produced by the SAS System SAS Tools (SAS Program - Data step and Proc step) A sample SAS program Exploring SAS

More information

An introduction to SPSS

An introduction to SPSS An introduction to SPSS To open the SPSS software using U of Iowa Virtual Desktop... Go to https://virtualdesktop.uiowa.edu and choose SPSS 24. Contents NOTE: Save data files in a drive that is accessible

More information