An Algorithm to Compute Exact Power of an Unordered RxC Contingency Table
NESUG 2007

Vivek Pradhan, Cytel Inc., Cambridge, MA
Stian Lydersen, Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Norway

ABSTRACT

Chi-square, Likelihood Ratio and Fisher-Freeman-Halton test statistics are used to test for association in an unordered r×c contingency table. Although asymptotically all these statistics follow a chi-square distribution, an exact conditional test based on the permutation distribution is recommended for small samples. The exact power computation of such tests involves a huge number of permutation tables, and as a result the computation becomes infeasible. The asymptotic power of these tests can be computed using a non-central chi-square distribution (see Agresti A, 2002: Categorical Data Analysis). However, to our knowledge, there is no algorithm or method available to compute the exact power. In this article we give an efficient algorithm and a SAS macro to compute exact power using the Chi-square, Likelihood Ratio and Fisher-Freeman-Halton test statistics. The SAS macro also reports exact power for the mid-p and the randomized versions of the tests.

INTRODUCTION

The Chi-square (CH), Likelihood Ratio (LR), and Fisher-Freeman-Halton (FI) statistics are commonly used to test for association in an unordered r×c table. For an unordered r×c table $x$ with elements $x_{ij}$, i-th row sum $n_{i+}$, j-th column sum $n_{+j}$, and total sum $N$, these statistics are defined as follows:

$$CH(x) = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{\left(x_{ij} - n_{i+} n_{+j}/N\right)^2}{n_{i+} n_{+j}/N}$$

$$LR(x) = 2 \sum_{i=1}^{r} \sum_{j=1}^{c} x_{ij} \log\left(\frac{x_{ij}}{n_{i+} n_{+j}/N}\right)$$

$$FI(x) = -2 \log\left(\gamma P(x)\right)$$

where $P(x)$ is the probability of table $x$ under the null hypothesis, conditional on the row and column sums, and

$$\gamma = (2\pi)^{(r-1)(c-1)/2} \, N^{-(rc-1)/2} \prod_{i=1}^{r} n_{i+}^{(c-1)/2} \prod_{j=1}^{c} n_{+j}^{(r-1)/2}$$

Under the null hypothesis the above three statistics asymptotically follow a chi-square distribution with (r-1)(c-1) degrees of freedom. The asymptotic inference may not be appropriate when the sample size N is small.
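The three statistics defined above can be sketched in a few lines. This is a minimal illustration in Python rather than SAS, chosen only so that it runs stand-alone; the function names are hypothetical and not part of the paper's macro, and the form of the constant γ follows the usual Fisher-Freeman-Halton convention, which is an assumption where the transcribed formula is incomplete.

```python
import math

def margins(table):
    """Return row sums, column sums and the grand total of a table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    return rows, cols, sum(rows)

def chi_square(table):
    """Pearson statistic CH(x); assumes every row and column sum is positive."""
    rows, cols, n = margins(table)
    return sum((x - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
               for i, r in enumerate(table) for j, x in enumerate(r))

def likelihood_ratio(table):
    """LR(x), using the convention 0 * log(0) = 0 for empty cells."""
    rows, cols, n = margins(table)
    return 2 * sum(x * math.log(x * n / (rows[i] * cols[j]))
                   for i, r in enumerate(table) for j, x in enumerate(r) if x > 0)

def log_table_probability(table):
    """log P(x): probability of x given both margins under H0 (hypergeometric)."""
    rows, cols, n = margins(table)
    log_p = sum(math.lgamma(m + 1) for m in rows + cols) - math.lgamma(n + 1)
    return log_p - sum(math.lgamma(x + 1) for r in table for x in r)

def fisher_freeman_halton(table):
    """FI(x) = -2 log(gamma * P(x)); the form of gamma is an assumption (see lead-in)."""
    rows, cols, n = margins(table)
    r, c = len(rows), len(cols)
    log_gamma = ((r - 1) * (c - 1) / 2 * math.log(2 * math.pi)
                 - (r * c - 1) / 2 * math.log(n)
                 + (c - 1) / 2 * sum(math.log(m) for m in rows)
                 + (r - 1) / 2 * sum(math.log(m) for m in cols))
    return -2 * (log_gamma + log_table_probability(table))

observed = [[3, 1], [1, 3]]
print(chi_square(observed), likelihood_ratio(observed), fisher_freeman_halton(observed))
```

All three functions grow as the table moves away from independence, which is the only property the exact tests rely on when ordering tables.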
In such situations the inference derived from the exact permutation distribution is appropriate (Mehta and Patel (1983)). The exact test derived from the permutation distribution is very conservative in size (and hence in its p-value). This can be improved by subtracting half the probability of the observed statistic from the exact p-value; the result is known as the mid-p value. Although the mid-p value reduces the conservatism of the exact test, it does not always preserve the type I error (Lydersen and Laake (2003), Lydersen et al. (2005), Lydersen et al. (2007)). Another form of inference can be obtained by randomizing the test. The randomized version of a test always preserves the type I error.
In a randomized test version, compute the next possible lower p-value than the one actually observed:

$$p_{next} = P\left(T(x) > T(n)\right) = P\left(T(x) \ge T(n)\right) - P\left(T(x) = T(n)\right)$$

Reject $H_0$ with probability

$$P(\text{reject } H_0 \mid n) = g\left(\frac{\alpha - p_{next}}{pvalue - p_{next}}\right), \qquad g(t) = \begin{cases} 0 & \text{if } t < 0 \\ t & \text{if } 0 \le t \le 1 \\ 1 & \text{if } t > 1 \end{cases}$$

where T(n) is the observed test statistic and T(x) is the test statistic of a generic table x.

The asymptotic power of a test for association in an r×c table can be computed using a non-central chi-square statistic (see Agresti (2002)). This method may be applied when the sample size is large. However, for an r×c table with a small sample size, inference using asymptotic methods should not be used, and the less conservative method using the mid-p value is recommended (see Lydersen et al. (2007)). In the following sections we give an efficient method to compute the exact power of an unordered r×c table using the Monte Carlo method.

METHOD TO COMPUTE EXACT POWER OF A RXC TABLE

Let $x_{ij}$, i=1,…,r, j=1,…,c be the observed cell frequencies of an r×c table $n$ and let

$$\pi = \begin{pmatrix} \pi_{11} & \pi_{12} & \cdots & \pi_{1c} \\ \pi_{21} & \pi_{22} & \cdots & \pi_{2c} \\ \vdots & & & \vdots \\ \pi_{r1} & \pi_{r2} & \cdots & \pi_{rc} \end{pmatrix}$$

be the multinomial row probabilities under $H_1$. Then the probability of such a table is given by:

$$P(n, \pi) = \prod_{i=1}^{r} \left( n_{i+}! \prod_{j=1}^{c} \frac{\pi_{ij}^{x_{ij}}}{x_{ij}!} \right)$$

Then power is defined as follows:

$$\beta(\pi) = P(\text{reject } H_0 \mid \pi) = \sum_{n} P(\text{reject } H_0 \mid n) \, P(n; \pi)$$

The exact power computation first requires generating all possible tables (outcomes) with total sample size N. For a 2×2 table with fixed row sums, this can be easily computed. However, if the number of rows or columns is greater than 2, then the number of possible outcomes explodes. For example, when the row sums are 20, the number of possible outcomes of a 2×2 table is 441, whereas for a 3×2 table this number is 9261, and for a 3×3 table it is larger still. As a result the following combination of exact and Monte Carlo methods is adopted:

i. Generate an r×c table by taking r multinomial samples with distribution $Mult(n_{i+}; \pi_{i1}, \ldots, \pi_{ic})$, one for each row, i=1,…,r.
ii. Compute β(n), the exact conditional inference of such a table.
iii. Repeat steps i and ii M times and thereby generate β(1), β(2), …, β(M).
iv. Finally, the exact power is the average of β(1), β(2), …, β(M).

IMPLEMENTATION OF THE PROPOSED METHOD

The core problem of this exact power computation is to reduce the number of tables. To do this, in SAS first create a dataset with M by-variable values, where M is the number of Monte Carlo samples. Each by-variable value therefore identifies one r×c table, to be used for exact inference. Not all of these by-groups represent a unique contingency table. We used a SAS data step to find the distinct tables and their frequencies of occurrence in the following way:

1. For each r×c table, compute the column sums (row sums) and then order the columns (rows) by these sums. In this way one can bring all the smaller cell values into the upper left corner of the contingency table. For example, the columns of a table are first reordered by their column sums, and then the same is done for the rows using the row sums.
2. Write all the cell values, from left to right, in a single row. A single r×c table is therefore represented by a single row.
3. Count the number of distinct rows (and therefore the distinct tables) using the following logic in SAS:

   data <dataset>;
     by by-variable;
     if first.by-variable then count=0;
     count+1;
     if last.by-variable then output;

Once all the distinct tables and the corresponding counts of occurrences are found, call SAS's PROC FREQ with the specified by-variable to produce exact inferences (mainly the exact p-value and the corresponding point probability) for each by-group. Finally, steps ii to iv are applied, weighting each distinct table by its count.

AN EXAMPLE

The following example is inspired by the Oral data given in the StatXact PROCs User Manual.
The dataset is a 3×9 table with the following cell counts:

x = [3×9 matrix of cell counts]

The above example is well known for its sparseness. The asymptotic inference using the chi-square distribution gives a very high p-value; however, the inference using the exact method is significant. Consider the following probabilities under the alternative hypothesis:

π = [3×9 matrix of row probabilities]

where $\sum_{j} \pi_{ij} = 1$ for each row i. The SAS program is run for M simulations. The implemented algorithm first reduces the M sampled tables to 3253 (approx.) distinct tables and then computes the powers. On a 2.66 GHz machine with 2 GB RAM, it takes less than 2 minutes and shows the following output:

              CH     LR     FI
Asymptotic    8%
Exact         98%    9%     96%
Exact mid-p   98%    95%    96%
Randomized    98%    96%    97%

Notice that all the powers using the exact methods are higher than that of the asymptotic method.

CONCLUSION

One may use full multinomial or Poisson sampling (Agresti (2002), Lydersen et al. (2007)) to do the Monte Carlo sampling. However, in this article we have used only product multinomial sampling. We feel that this kind of sampling is sufficient in this setting.

REFERENCES

Agresti A (2002). Introduction to Categorical Data Analysis. New York: Wiley.
Lancaster HO (1961). Significance tests in discrete distributions. Journal of the American Statistical Association 56.
Lydersen S, Laake P (2003). Power comparison of two-sided exact tests for association in 2×2 contingency tables using standard, mid p, and randomized test versions. Statistics in Medicine 22 (24).
Lydersen S, Pradhan V, Senchaudhuri P, Laake P (2005). Power comparison of two-sided exact tests for association in 2×2 contingency tables using standard, mid p, and randomized test versions. Journal of Statistical Computation and Simulation 75 (6).
Lydersen S, Pradhan V, Senchaudhuri P, Laake P (2007). Choice of test for association in small sample unordered r×c tables. Statistics in Medicine 26 (23).
Mehta CR, Patel NR (1983). A network algorithm for performing Fisher's exact test in r×c contingency tables. Journal of the American Statistical Association 78: 427-434.
StatXact 8 PROCs User Manual (2007). An exact nonparametric inference for categorical data for SAS users. Cytel Inc., Cambridge, MA 02139.
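The Monte Carlo steps i to iv and the table-merging idea from the implementation section can be sketched compactly. The version below is a Python illustration under stated assumptions, not the paper's SAS macro: all function names are hypothetical, only the chi-square statistic with the standard exact p-value is covered, the exact conditional p-value is obtained by brute-force enumeration (feasible only for very small tables), and the canonical form sorts columns and then rows by their sums with ties broken by cell values.

```python
import math
import random
from collections import Counter

def sample_table(row_sums, row_probs, rng):
    """Step i: one multinomial draw per row (product multinomial sampling)."""
    table = []
    for n, probs in zip(row_sums, row_probs):
        draws = Counter(rng.choices(range(len(probs)), weights=probs, k=n))
        table.append(tuple(draws.get(j, 0) for j in range(len(probs))))
    return tuple(table)

def canonical(table):
    """Reorder columns, then rows, so permuted copies of a table can be merged."""
    cols = sorted(zip(*table), key=lambda c: (sum(c), c))
    return tuple(sorted(zip(*cols), key=lambda r: (sum(r), r)))

def tables_with_margins(row_sums, col_sums):
    """Yield every non-negative integer table with the given margins."""
    if len(row_sums) == 1:
        yield (tuple(col_sums),)
        return
    def rows(n, caps):
        # All ways to place n counts into cells bounded by caps.
        if len(caps) == 1:
            if n <= caps[0]:
                yield (n,)
            return
        for x in range(min(n, caps[0]) + 1):
            for rest in rows(n - x, caps[1:]):
                yield (x,) + rest
    for first in rows(row_sums[0], col_sums):
        left = tuple(c - x for c, x in zip(col_sums, first))
        for rest in tables_with_margins(row_sums[1:], left):
            yield (first,) + rest

def chi_square(table):
    """Pearson statistic; cells with zero expectation contribute nothing."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i, r in enumerate(table):
        for j, x in enumerate(r):
            e = rows[i] * cols[j] / n
            if e > 0:
                stat += (x - e) ** 2 / e
    return stat

def log_prob(table):
    """Log probability of a table given both margins, under H0."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    lp = sum(math.lgamma(m + 1) for m in rows + cols) - math.lgamma(sum(rows) + 1)
    return lp - sum(math.lgamma(x + 1) for r in table for x in r)

def exact_p_value(table):
    """Step ii: exact conditional p-value by full enumeration."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    t_obs = chi_square(table)
    return sum(math.exp(log_prob(t)) for t in tables_with_margins(rows, cols)
               if chi_square(t) >= t_obs - 1e-9)

def exact_power(row_sums, row_probs, alpha=0.05, n_sim=1000, seed=1):
    """Steps iii and iv, testing each distinct table once and weighting by its count."""
    rng = random.Random(seed)
    counts = Counter(canonical(sample_table(row_sums, row_probs, rng))
                     for _ in range(n_sim))
    rejected = sum(k for t, k in counts.items() if exact_p_value(t) <= alpha)
    return rejected / n_sim, len(counts)

power, n_distinct = exact_power((6, 6), [(0.98, 0.01, 0.01), (0.01, 0.01, 0.98)], n_sim=300)
print(power, n_distinct)
```

Merging permuted tables is valid here because the statistic and its conditional reference set do not change when rows or columns of an unordered table are swapped, so each distinct table needs only one exact test.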
SAS Program:

%macro tabgeneration(dataname=, number=, rowtotal=, alpha=);
%global nrows ncols;

/* getting the number of rows and columns from the input */
%let dsid  = %sysfunc(open(&dataname));
%let nrows = %sysfunc(attrn(&dsid,nobs));
%let ncols = %sysfunc(attrn(&dsid,nvars));
%let rc    = %sysfunc(close(&dsid));

/* preparing probabilities for multinomial sampling */
data input_1; set &dataname; run;
%if &ncols > 2 %then %do;
  data input_1; set &dataname;
    %do vr=1 %to &ncols;
      %if &vr=1 %then %do;
        varsum&vr=0;
        var_&vr=var&vr;
      %end;
      %else %do;
        varsum&vr=varsum%eval(&vr-1)+var%eval(&vr-1);
        var_&vr=var&vr/(1-varsum&vr);
      %end;
    %end;
    keep var_1-var_&ncols;
  run;
  data input_1; set input_1;
    array a[*] var_1-var_&ncols;
    do i=1 to dim(a);
      if a[i]>1 then a[i]=1;
    end;
    drop i;
  run;
%end;
/* end of data preparation */

proc transpose data=input_1 out=transpose; run;
proc sql noprint;
  %do ii=1 %to %sysevalf(&nrows);
    select col&ii into :pi&ii separated by ' ' from transpose;
  %end;
quit;

/* creating a dataset with one by-group per sample, based on product multinomial sampling */
data test;
  ntables=%sysevalf(&number);
  do tabno=1 to ntables;
    nrows=%sysevalf(&nrows);
    ncols=%sysevalf(&ncols);
    do row=1 to nrows;
      %do iii=1 %to %sysevalf(&nrows);
        if row=%eval(&iii) then rowsum=%scan(&rowtotal,&iii);
      %end;
      col=1;
      do while (col<=ncols-1 and rowsum>0);
        %do ii=1 %to %sysevalf(&nrows);
          if row=%eval(&ii) then pi=scan("&&pi&ii",col,' ');
        %end;
        /* keep pi strictly inside (0,1) for RANBIN; the two clipping constants are assumed */
        if pi<=0 then pi=0.0001;
        else if pi>=1 then pi=0.9999;
        wgt=ranbin(0,rowsum,pi);
        output;
        rowsum=rowsum-wgt;
        col=col+1;
      end;
      if (rowsum>0) then do;
        wgt=rowsum;
        col=ncols;
        output;
      end;
    end;
  end;
  keep tabno row col wgt;
run;
/* end of data creation based on product multinomial sampling */

/********************************************************************************/
/* Sorting the tables by cell values, bringing all the 0's to the upper left corner */
/* finding column sums */
data test; set test; format wgt z2.; run;
proc sort data=test out=out2; by tabno col; run;
data out3; set out2; by tabno col;
  if first.col then colsum=0;
  colsum+wgt;
  if last.col then output;
run;
proc sort data=out3; by tabno colsum; run;
data out4; set out3; by tabno;
  if first.tabno then col1=0;
  col1+1;
  keep tabno col col1;
run;
proc sort data=out4; by tabno col; run;
proc sort data=test; by tabno col; run;
data mr; merge test out4; by tabno col; run;
proc sort data=mr; by tabno row col1; run;

/* arranging the rows */
proc transpose data=mr out=out_1; id col1; var wgt; by tabno row; run;
data out_1; set out_1;
  ord=catt(of _1-_%sysevalf(&ncols)); /* number of columns */
run;
proc sort data=out_1 out=out_2; by tabno ord; run;
data out_3; set out_2; by tabno;
  if first.tabno then row1=0;
  row1+1;
  drop row _name_ ord;
run;
proc transpose data=out_3 out=out_4; by tabno row1; run;
data out_5; set out_4;
  array a[*] col1;
  do i=1 to dim(a);
    if a[i]=. then a[i]=0;
  end;
  drop i;
run;

/******************************************************************************/
/* Finding distinct tables with their total frequencies */
data two; set out_5; by tabno;
  length yy $ 1000; /* assumed length; the original value is not recoverable */
  retain yy;
  if first.tabno then yy=put(col1,z2.);
  else yy=trim(yy) || ',' || put(col1,z2.);
  if last.tabno then output;
  keep tabno yy;
run;
proc sort data=two; by yy; run;
data thr; set two; by yy;
  retain count;
  array ids(%eval(&number));
  if first.yy then do;
    count=0;
    do i=1 to dim(ids);
      ids(i)=.;
    end;
  end;
  count+1;
  ids(count)=tabno;
  if last.yy then output;
  retain _all_;
  keep count ids1;
run;
proc sort data=thr; by ids1; run;
data abc_1; merge test thr(rename=(ids1=tabno)); by tabno;
  if count=. then delete;
run;

/* Computing the exact p-values and the corresponding point probabilities */
proc freq data=abc_1 noprint;
  by tabno;
  tables row*col / out=out1 outpct nowarn;
  weight wgt;
  exact pchi lrchi fisher / point;
  output out=output_1 chisq;
run;
data out31; merge out1(where=(pct_row=100)) output_1; by tabno;
  if first.tabno then output;
  keep tabno p_pchi xp2_fish xpt_fish xp_lrchi xpt_lrch xp_pchi xpt_pchi;
run;

/* taking care of the situation when the input table is reduced to a 1xn table */
data out31; set out31;
  array a[*] p_pchi xp2_fish xpt_fish xp_lrchi xpt_lrch xp_pchi xpt_pchi;
  do i=1 to dim(a);
    if a[i]=. then a[i]=1;
  end;
  drop i;
run;

data out31; merge out31 thr(rename=(ids1=tabno)); by tabno;
  pthalf_ch=0.5*xpt_pchi;
  pthalf_fi=0.5*xpt_fish;
  pthalf_lr=0.5*xpt_lrch;
  /* computing mid-p values */
  midpval_ch=xp_pchi-pthalf_ch;
  midpval_fi=xp2_fish-pthalf_fi;
  midpval_lr=xp_lrchi-pthalf_lr;
  /* calculating g(t) of the randomized test version */
  rndpval_ch=min(max(0,(&alpha-(xp_pchi-2*pthalf_ch))/(xp_pchi-(xp_pchi-2*pthalf_ch))),1);
  rndpval_fi=min(max(0,(&alpha-(xp2_fish-2*pthalf_fi))/(xp2_fish-(xp2_fish-2*pthalf_fi))),1);
  rndpval_lr=min(max(0,(&alpha-(xp_lrchi-2*pthalf_lr))/(xp_lrchi-(xp_lrchi-2*pthalf_lr))),1);
  /* computing flags for power computation */
  if p_pchi<=&alpha then as_ch=1; else as_ch=0;
  /* chi-square statistic */
  if xp_pchi<=&alpha then stdflag_ch=1; else stdflag_ch=0;
  if midpval_ch<=&alpha then midpflag_ch=1; else midpflag_ch=0;
  totas=count*as_ch;
  totstd_ch=count*stdflag_ch;
  totmidp_ch=count*midpflag_ch;
  totrnd_ch=count*rndpval_ch;
  /* Fisher statistic */
  if xp2_fish<=&alpha then stdflag_fi=1; else stdflag_fi=0;
  if midpval_fi<=&alpha then midpflag_fi=1; else midpflag_fi=0;
  totstd_fi=count*stdflag_fi;
  totmidp_fi=count*midpflag_fi;
  totrnd_fi=count*rndpval_fi;
  /* likelihood-ratio statistic */
  if xp_lrchi<=&alpha then stdflag_lr=1; else stdflag_lr=0;
  if midpval_lr<=&alpha then midpflag_lr=1; else midpflag_lr=0;
  totstd_lr=count*stdflag_lr;
  totmidp_lr=count*midpflag_lr;
  totrnd_lr=count*rndpval_lr;
run;

/* computing the powers */
proc sql noprint;
  create table power_1 as
  select sum(totas)/&number as ASCHPOW,
         sum(totstd_ch)/&number as CHIPOW_STD,
         sum(totmidp_ch)/&number as CHIPOW_MIDP,
         sum(totrnd_ch)/&number as CHIPOW_RND,
         sum(totstd_fi)/&number as FIPOW_STD,
         sum(totmidp_fi)/&number as FIPOW_MIDP,
         sum(totrnd_fi)/&number as FIPOW_RND,
         sum(totstd_lr)/&number as LRPOW_STD,
         sum(totmidp_lr)/&number as LRPOW_MIDP,
         sum(totrnd_lr)/&number as LRPOW_RND /* the transcription sums totrnd_ch here */
  from out31;
quit;

proc transpose data=power_1 out=final_1; run;
proc format;
  value $name 'ASCHPOW'='Asymptotic Chi-square'
              'CHIPOW_STD'='Chi-square with Exact p'
              'CHIPOW_MIDP'='Chi-square with Midp'
              'CHIPOW_RND'='Chi-square with Randomized'
              'FIPOW_STD'='Fisher with Exact p'
              'FIPOW_MIDP'='Fisher with Midp'
              'FIPOW_RND'='Fisher with Randomized'
              'LRPOW_STD'='Likelihood-ratio with Exact p'
              'LRPOW_MIDP'='Likelihood-ratio with Midp'
              'LRPOW_RND'='Likelihood-ratio with Randomized';
run;
proc print data=final_1(rename=(_name_=method COL1=Power)) noobs;
  title  "***************************************************************************************";
  title2 "* Exact Power of a rxc table using different methods *";
  title3 "***************************************************************************************";
  format Method $name.;
run;
%mend;

/* observed proportions from oral data */
data one;
  input var1-var9;
  cards;
/* [3 data lines of row probabilities, not recoverable from the transcription] */
;
run;
options nonotes;
%tabgeneration(dataname=one, number=, rowtotal= 7, alpha=0.05);

CONTACT INFORMATION

Please send your comments or further inquiries to:
Vivek Pradhan, Cytel Inc., Cambridge, MA 02139, USA
E-mail: vpradhan@cytel.com

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.
More informationDSCI 325: Handout 10 Summarizing Numerical and Categorical Data in SAS Spring 2017
DSCI 325: Handout 10 Summarizing Numerical and Categorical Data in SAS Spring 2017 USING PROC MEANS The routine PROC MEANS can be used to obtain limited summaries for numerical variables (e.g., the mean,
More informationPackage binmto. February 19, 2015
Type Package Package binmto February 19, 2015 Title Asymptotic simultaneous confidence intervals for many-to-one comparisons of proportions Version 0.0-6 Date 2013-09-30 Author Maintainer
More information- 1 - Fig. A5.1 Missing value analysis dialog box
WEB APPENDIX Sarstedt, M. & Mooi, E. (2019). A concise guide to market research. The process, data, and methods using SPSS (3 rd ed.). Heidelberg: Springer. Missing Value Analysis and Multiple Imputation
More informationnquery Sample Size & Power Calculation Software Validation Guidelines
nquery Sample Size & Power Calculation Software Validation Guidelines Every nquery sample size table, distribution function table, standard deviation table, and tablespecific side table has been tested
More informationIF there is a Better Way than IF-THEN
PharmaSUG 2018 - Paper QT-17 IF there is a Better Way than IF-THEN Bob Tian, Anni Weng, KMK Consulting Inc. ABSTRACT In this paper, the author compares different methods for implementing piecewise constant
More informationOpen Problem for SUAVe User Group Meeting, November 26, 2013 (UVic)
Open Problem for SUAVe User Group Meeting, November 26, 2013 (UVic) Background The data in a SAS dataset is organized into variables and observations, which equate to rows and columns. While the order
More informationJMP 10 Student Edition Quick Guide
JMP 10 Student Edition Quick Guide Instructions presume an open data table, default preference settings and appropriately typed, user-specified variables of interest. RMC = Click Right Mouse Button Graphing
More informationUsing PROC SQL to Calculate FIRSTOBS David C. Tabano, Kaiser Permanente, Denver, CO
Using PROC SQL to Calculate FIRSTOBS David C. Tabano, Kaiser Permanente, Denver, CO ABSTRACT The power of SAS programming can at times be greatly improved using PROC SQL statements for formatting and manipulating
More informationA Practical and Efficient Approach in Generating AE (Adverse Events) Tables within a Clinical Study Environment
A Practical and Efficient Approach in Generating AE (Adverse Events) Tables within a Clinical Study Environment Abstract Jiannan Hu Vertex Pharmaceuticals, Inc. When a clinical trial is at the stage of
More informationA Simple Framework for Sequentially Processing Hierarchical Data Sets for Large Surveys
A Simple Framework for Sequentially Processing Hierarchical Data Sets for Large Surveys Richard L. Downs, Jr. and Pura A. Peréz U.S. Bureau of the Census, Washington, D.C. ABSTRACT This paper explains
More informationA SAS Solution to Create a Weekly Format Susan Bakken, Aimia, Plymouth, MN
Paper S126-2012 A SAS Solution to Create a Weekly Format Susan Bakken, Aimia, Plymouth, MN ABSTRACT As programmers, we are frequently asked to report by periods that do not necessarily correspond to weeks
More informationThere s No Such Thing as Normal Clinical Trials Data, or Is There? Daphne Ewing, Octagon Research Solutions, Inc., Wayne, PA
Paper HW04 There s No Such Thing as Normal Clinical Trials Data, or Is There? Daphne Ewing, Octagon Research Solutions, Inc., Wayne, PA ABSTRACT Clinical Trials data comes in all shapes and sizes depending
More informationSAS/STAT 13.1 User s Guide. The SURVEYFREQ Procedure
SAS/STAT 13.1 User s Guide The SURVEYFREQ Procedure This document is an individual chapter from SAS/STAT 13.1 User s Guide. The correct bibliographic citation for the complete manual is as follows: SAS
More informationChoosing the Right Procedure
3 CHAPTER 1 Choosing the Right Procedure Functional Categories of Base SAS Procedures 3 Report Writing 3 Statistics 3 Utilities 4 Report-Writing Procedures 4 Statistical Procedures 5 Efficiency Issues
More informationA Lazy Programmer s Macro for Descriptive Statistics Tables
Paper SA19-2011 A Lazy Programmer s Macro for Descriptive Statistics Tables Matthew C. Fenchel, M.S., Cincinnati Children s Hospital Medical Center, Cincinnati, OH Gary L. McPhail, M.D., Cincinnati Children
More informationSTATISTICS (STAT) Statistics (STAT) 1
Statistics (STAT) 1 STATISTICS (STAT) STAT 2013 Elementary Statistics (A) Prerequisites: MATH 1483 or MATH 1513, each with a grade of "C" or better; or an acceptable placement score (see placement.okstate.edu).
More informationKeeping Track of Database Changes During Database Lock
Paper CC10 Keeping Track of Database Changes During Database Lock Sanjiv Ramalingam, Biogen Inc., Cambridge, USA ABSTRACT Higher frequency of data transfers combined with greater likelihood of changes
More informationCountdown of the Top 10 Ways to Merge Data David Franklin, Independent Consultant, Litchfield, NH
PharmaSUG2010 - Paper TU06 Countdown of the Top 10 Ways to Merge Data David Franklin, Independent Consultant, Litchfield, NH ABSTRACT Joining or merging data is one of the fundamental actions carried out
More informationSetting the Percentage in PROC TABULATE
SESUG Paper 193-2017 Setting the Percentage in PROC TABULATE David Franklin, QuintilesIMS, Cambridge, MA ABSTRACT PROC TABULATE is a very powerful procedure which can do statistics and frequency counts
More informationSTEP 1 - /*******************************/ /* Manipulate the data files */ /*******************************/ <<SAS DATA statements>>
Generalized Report Programming Techniques Using Data-Driven SAS Code Kathy Hardis Fraeman, A.K. Analytic Programming, L.L.C., Olney, MD Karen G. Malley, Malley Research Programming, Inc., Rockville, MD
More informationSAS Linear Model Demo. Overview
SAS Linear Model Demo Yupeng Wang, Ph.D, Data Scientist Overview SAS is a popular programming tool for biostatistics and clinical trials data analysis. Here I show an example of using SAS linear regression
More informationTop Coding Tips. Neil Merchant Technical Specialist - SAS
Top Coding Tips Neil Merchant Technical Specialist - SAS Bio Work in the ANSWERS team at SAS o Analytics as a Service and Visual Analytics Try before you buy SAS user for 12 years obase SAS and O/S integration
More informationDisplaying Multiple Graphs to Quickly Assess Patient Data Trends
Paper AD11 Displaying Multiple Graphs to Quickly Assess Patient Data Trends Hui Ping Chen and Eugene Johnson, Eli Lilly and Company, Indianapolis, IN ABSTRACT Populating multiple graphs, up to 15, on a
More informationAutomatic Indicators for Dummies: A macro for generating dummy indicators from category type variables
MWSUG 2018 - Paper AA-29 Automatic Indicators for Dummies: A macro for generating dummy indicators from category type variables Matthew Bates, Affusion Consulting, Columbus, OH ABSTRACT Dummy Indicators
More informationPH006 Audit Trails of SAS Data Set Changes An Overview Maria Y. Reiss, Wyeth Pharmaceuticals, Collegeville, PA
PH006 Audit Trails of SAS Data Set Changes An Overview Maria Y. Reiss, Wyeth, Collegeville, PA ABSTRACT SAS programmers often have to modify data in SAS data sets. When modifying data, it is desirable
More informationPharmaSUG China. Systematically Reordering Axis Major Tick Values in SAS Graph Brian Shen, PPDI, ShangHai
PharmaSUG China Systematically Reordering Axis Major Tick Values in SAS Graph Brian Shen, PPDI, ShangHai ABSTRACT Once generating SAS graphs, it is a headache to programmers to reorder the axis tick values
More informationTo conceptualize the process, the table below shows the highly correlated covariates in descending order of their R statistic.
Automating the process of choosing among highly correlated covariates for multivariable logistic regression Michael C. Doherty, i3drugsafety, Waltham, MA ABSTRACT In observational studies, there can be
More information%MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System
%MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System Rushi Patel, Creative Information Technology, Inc., Arlington, VA ABSTRACT It is common to find
More informationTitle. Description. Menu. Remarks and examples. stata.com. stata.com. PSS Control Panel
Title stata.com GUI Graphical user interface for power and sample-size analysis Description Menu Remarks and examples Also see Description This entry describes the graphical user interface (GUI) for the
More informationCMISS the SAS Function You May Have Been MISSING Mira Shapiro, Analytic Designers LLC, Bethesda, MD
ABSTRACT SESUG 2016 - RV-201 CMISS the SAS Function You May Have Been MISSING Mira Shapiro, Analytic Designers LLC, Bethesda, MD Those of us who have been using SAS for more than a few years often rely
More informationJMP Chong Ho
JMP Interface: ipod of statistical software Chong Ho Yu, Ph.D. (2012) cyu@apu.edu www.creative wisdom.com JMP is software package created by SAS Institute for data visualization and exploratory data analysis.
More information11. Chi Square. Calculate Chi Square for contingency tables. A Chi Square is used to analyze categorical data. It compares observed
11. Chi Square Objectives Calculate goodness of fit Chi Square Calculate Chi Square for contingency tables Calculate effect size Save data entry time by weighting cases A Chi Square is used to analyze
More informationA Macro for Systematic Treatment of Special Values in Weight of Evidence Variable Transformation Chaoxian Cai, Automated Financial Systems, Exton, PA
Paper RF10-2015 A Macro for Systematic Treatment of Special Values in Weight of Evidence Variable Transformation Chaoxian Cai, Automated Financial Systems, Exton, PA ABSTRACT Weight of evidence (WOE) recoding
More informationReading a Column into a Row to Count N-levels, Calculate Cardinality Ratio and Create Frequency and Summary Output In One Step
Paper RF-04-2015 Reading a Column into a Row to Count N-levels, Calculate Cardinality Ratio and Create Frequency and Summary Output In One Step Ronald J. Fehd, Stakana Analytics Abstract Description :
More informationSAS/STAT 14.3 User s Guide The SURVEYFREQ Procedure
SAS/STAT 14.3 User s Guide The SURVEYFREQ Procedure This document is an individual chapter from SAS/STAT 14.3 User s Guide. The correct bibliographic citation for this manual is as follows: SAS Institute
More informationABSTRACT INTRODUCTION TRICK 1: CHOOSE THE BEST METHOD TO CREATE MACRO VARIABLES
An Efficient Method to Create a Large and Comprehensive Codebook Wen Song, ICF International, Calverton, MD Kamya Khanna, ICF International, Calverton, MD Baibai Chen, ICF International, Calverton, MD
More informationA SAS Macro for Balancing a Weighted Sample
Paper 258-25 A SAS Macro for Balancing a Weighted Sample David Izrael, David C. Hoaglin, and Michael P. Battaglia Abt Associates Inc., Cambridge, Massachusetts Abstract It is often desirable to adjust
More informationA Side of Hash for You To Dig Into
A Side of Hash for You To Dig Into Shan Ali Rasul, Indigo Books & Music Inc, Toronto, Ontario, Canada. ABSTRACT Within the realm of Customer Relationship Management (CRM) there is always a need for segmenting
More informationPharmaSUG Paper AD06
PharmaSUG 2012 - Paper AD06 A SAS Tool to Allocate and Randomize Samples to Illumina Microarray Chips Huanying Qin, Baylor Institute of Immunology Research, Dallas, TX Greg Stanek, STEEEP Analytics, Baylor
More information