An Application of PROC NLP to Survey Sample Weighting

Talbot Michael Katz, Analytic Data Information Technologies, New York, NY

ABSTRACT

The classic weighting formula for survey respondents compensates for differences between each cell's proportion of respondents and its proportion of the target population. Such weighting can also be applied to cells based on variables of interest (beyond the experimental design). If even one cell has no responses, the entire weighting has to be reconsidered. An optimal reapportionment that attempts to preserve row / column marginals is proposed, with a PROC NLP implementation.

Keywords: PROC NLP, nonlinear programming, weight adjustment, nonresponse.

INTRODUCTION

SAS software provides several tools for the design of surveys and the analysis of survey data. But even well-planned and well-executed surveys can suffer from nonresponse. Both the PROC SURVEYMEANS and PROC SURVEYREG documentation for SAS/STAT software contain the following passage in their sections on missing values: "Once data collection is complete, you can use imputation to replace missing values with acceptable values, and you can use sampling weight adjustments to compensate for nonresponse. You should complete this data preparation and adjustment before you analyze your data with PROC SURVEY[REG/MEANS]." [1]

Several methods of weighting adjustment are already in use. One of the simplest multiplies each weight by the ratio of the sum of base weights over all units to the sum of base weights over responders [2]. Some of the more sophisticated methods use auxiliary data to build predictive models for the probability of (non)response [3]. The appropriate weighting method may depend on the data available and the goals of the analysis.
The method proposed here is useful for situations in which cells of interest are based upon levels of two or more variables, and there is a desire to keep the marginal weights of the respondents as close as possible to proportional to the marginal population sums.

PROPORTIONAL WEIGHTING

Suppose we start with a population of size P and take a sample of size S. Suppose that the population can be split into subgroups P_i, i = 1, ..., n, and the sample splits into corresponding subgroups S_i. In a perfect world, S_i / S = P_i / P for each i; then each sample subgroup has the correct proportion, and each individual in the sample can be given weight 1. In a still-sunny-but-slightly-less-than-perfect world, each S_i > 0; then each individual in group i can be assigned a weight of w_i = (P_i / P) * (S / S_i). These weights can be used in ANOVA or other modeling to extrapolate back to the original population. This is classic proportional weighting; the individual weights sum to the sample size S, and multiplying each weight by P / S makes them sum to the population size P.

Here is an easy example. Suppose the initial population P = 1000, P_1 = 400, P_2 = 300, P_3 = 200, P_4 = 100. Let S = 100, and S_1 = 20, S_2 = 10, S_3 = 20, S_4 = 50. Then w_1 = 2, w_2 = 3, w_3 = 1, w_4 = 0.2. Intuitively, groups 1 and 2 are low in the sample, so their individuals weigh more than 1; group 4 is high in the sample, so its members weigh less than 1; group 3 has the same proportion of the sample as it does of the general population, so its members have unit weight.

Proportional weighting breaks down if even a single cell is empty. Even if you decide to give the empty cell a weight of zero, the weights of the remaining individuals no longer add up to the intended total. The easiest way to save proportional weighting in the presence of empty cells is to combine the empty cells with non-empty cells, if practical. Here is another easy example, with the same original population as the first example, but in this case S_1 = 20, S_2 = 30, S_3 = 0, S_4 = 50.
If we can combine groups 3 and 4, then the weights are w_1 = 2, w_2 = 1, w_{3-4} = 0.6. It is not always practical to combine cells, especially when the cells are created by values of two or more underlying variables (such as in a multifactorial design).
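The two examples above can be reproduced with a few lines of Python. This is only an illustrative sketch of the formula w_i = (P_i / P) * (S / S_i); the function name proportional_weights is an assumption made for this example and is not part of the paper's SAS code.

```python
def proportional_weights(pop_counts, sample_counts):
    """Return w_i = (P_i / P) * (S / S_i) for each group i (hypothetical helper)."""
    if any(s <= 0 for s in sample_counts):
        raise ValueError("every group needs at least one respondent")
    pop_total = sum(pop_counts)
    sample_total = sum(sample_counts)
    return [(p / pop_total) * (sample_total / s)
            for p, s in zip(pop_counts, sample_counts)]


population = [400, 300, 200, 100]

# First example: all four groups are represented in the sample.
sample_1 = [20, 10, 20, 50]
w_1 = proportional_weights(population, sample_1)
print([round(w, 4) for w in w_1])  # [2.0, 3.0, 1.0, 0.2]

# With this formula, the weights summed over all sampled individuals equal S.
print(round(sum(w * s for w, s in zip(w_1, sample_1)), 4))  # 100.0

# Second example: group 3 is empty, so groups 3 and 4 are merged first.
w_2 = proportional_weights([400, 300, 200 + 100], [20, 30, 0 + 50])
print([round(w, 4) for w in w_2])  # [2.0, 1.0, 0.6]
```

Passing the unmerged second sample, [20, 30, 0, 50], raises an error instead of dividing by zero, which is the breakdown described above.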

EMPTY CELLS CREATED BY TWO OR MORE VARIABLES

Consider a population of 1000 workers who are classified in two ways, as employees (E) or contractors (C), and as SAS users (S) or the Unenlightened (U). These classifications can produce four subgroups, e.g., as follows:

            E      C   Total
S         400    300     700
U         200    100     300
Total     600    400    1000

Then a sample of size 60 would get the following perfect weighting:

            E      C   Total
S          24     18      42
U          12      6      18
Total      36     24      60

What if one of the actual sample quadrants is zero? Suppose the bottom right quadrant, CU, is zero. Then:

Combining CU and CS keeps the correct EC split, 36-24, but gives an SU split of 48-12.
Combining CU and EU keeps the correct SU split, 42-18, but gives an EC split of 42-18.
Combining CU and ES is hard to justify and gives an EC split of 42-18 and an SU split of 48-12.

If the cells are proportionally reweighted, each cell would be multiplied by 60 / 54, giving cell ES a weight of 26.67, EU a weight of 13.33, and CS a weight of 20. This gives an EC split of 40-20 and an SU split of 46.67-13.33, sort of a compromise between the first two combinations above.

Another possible resolution would be to try to reweight each non-empty cell as close as possible to its proportionate value. This could be done by solving a least squares minimization. In the example above, we would minimize:

(24 - ES)^2 + (18 - CS)^2 + (12 - EU)^2, subject to ES + CS + EU = 60.

Substituting ES = 24 + d_1, CS = 18 + d_2, EU = 12 + d_3, this transforms to minimizing:

(d_1)^2 + (d_2)^2 + (d_3)^2, subject to d_1 + d_2 + d_3 = 6.

The solution to this is d_1 = d_2 = d_3 = 2, an even spread of the missing cell's weight to the other cells. This would result in an EC split of 40-20 and an SU split of 46-14, a slightly better compromise than proportional reweighting in this case.
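As a quick arithmetic check, the sketch below recomputes the EC and SU splits for the reapportionment schemes discussed so far (combining CU with another cell, proportional reweighting, and even spread). It is written in Python purely for illustration; all function and variable names are assumptions made for this example.

```python
# Proportionate sample values of the three non-empty cells (CU is empty).
TARGETS = {"ES": 24.0, "CS": 18.0, "EU": 12.0}
SAMPLE_SIZE = 60.0
MISSING = SAMPLE_SIZE - sum(TARGETS.values())  # proportionate share of CU


def splits(cells):
    """Return the (E, C) and (S, U) marginal splits of the reweighted cells."""
    ec = (cells["ES"] + cells["EU"], cells["CS"])
    su = (cells["ES"] + cells["CS"], cells["EU"])
    return ec, su


def combine_cu_with(cell):
    """Give the whole share of the empty CU cell to one other cell."""
    cells = dict(TARGETS)
    cells[cell] += MISSING
    return cells


def proportional():
    """Scale every non-empty cell by the same factor, 60 / 54."""
    factor = SAMPLE_SIZE / sum(TARGETS.values())
    return {name: value * factor for name, value in TARGETS.items()}


def even_spread():
    """Least squares answer: split CU's share equally among the other cells."""
    share = MISSING / len(TARGETS)
    return {name: value + share for name, value in TARGETS.items()}


schemes = {
    "combine CU with CS": combine_cu_with("CS"),
    "combine CU with EU": combine_cu_with("EU"),
    "combine CU with ES": combine_cu_with("ES"),
    "proportional": proportional(),
    "even spread": even_spread(),
}
for label, cells in schemes.items():
    ec, su = splits(cells)
    print(f"{label:20s} EC = {ec[0]:.2f}-{ec[1]:.2f}   SU = {su[0]:.2f}-{su[1]:.2f}")
```

The perfect splits to compare against are EC = 36-24 and SU = 42-18.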
The cell-by-cell least squares reapportionment example above generalizes to any number of missing cells: the solution that gets the non-missing cells as close as possible to their proportionate values, in the least squares sense, is to spread the proportionate weight of the missing cells evenly among the remaining cells. This can be done in SAS without using anything as fancy as PROC NLP!

Proportional reweighting and even spread as practiced above always affect all the non-empty cells. A more targeted approach would be to leave unmodified the cells that share no values of the underlying variables with the empty cells, and only reweight the "guilty" cells that share variable values with the empty cells. In our example, the CS and EU cells are guilty and the ES cell is not. Then, guilty proportional reweighting would multiply the two guilty cells by 36 / 30, giving an EC split of 38.4-21.6 and an SU split of 45.6-14.4. Guilty even spread gives an EC split of 39-21 and an SU split of 45-15. The two guilty reapportionments are about equal to each other, and slightly better than the global reapportionments.

In this example, there actually is a reapportionment solution that perfectly maintains the marginals: ES = 18, CS = 24, EU = 18. However, it throws off the relative proportions of the individual cells more than the above solutions do. Also, in some cases, there is no perfect solution that maintains the marginals. If ES were the empty quadrant in the sample, instead of CU, then a perfect marginal solution would have to satisfy:

CS + CU + EU = 60, CS + CU = 24, CU + EU = 18; this would require CU = -18, which is impossible.

LEAST SQUARES MARGINALS MAINTENANCE

Least squares minimization can be applied to any combination of cells, not just the individual ones. In particular, least squares can be used to try to get the closest approximation to the individual variable marginal splits. For the ES = 0 problem in the previous example, minimize:

(42 - CS)^2 + (36 - EU)^2 + (18 - (EU + CU))^2 + (24 - (CS + CU))^2,

subject to EU + CU + CS = 60, with every cell value nonnegative. The solution to this is CS = 33, EU = 27, CU = 0. Not very comforting, but this is a pretty extreme situation.

When there is a missing cell in a 2x2 table, there usually will be a unique solution to the least squares problem for the splits on the two individual variables. For larger problems, there may not be a unique solution. For example, in the 2x2 case above with no missing cells, the least squares set-up for the SU, EC splits is to minimize:

(42 - (CS + ES))^2 + (36 - (EU + ES))^2 + (18 - (EU + CU))^2 + (24 - (CS + CU))^2,

subject to EU + ES + CU + CS = 60. The true values, ES = 24, CS = 18, EU = 12, CU = 6, solve this exactly, but so do ES = 20, CS = 22, EU = 16, CU = 2, and infinitely many other combinations. 2x2 cases can be done by hand, but what can handle more complex minimizations?

PROC NLP

SAS has PROC NLP, a nonlinear optimizer, in the SAS/OR software module. It was introduced as an experimental release with SAS 6.08, was placed into production with SAS 6.09, and has been included with each subsequent release of SAS/OR. The archetypal problem is least squares minimization with linear constraints (of which the sample reweighting problem is an example), but since release 6.11 nonlinear constraints are also allowed. Please note that SAS/IML software also has nonlinear programming capabilities, and PROC NLIN in SAS/STAT uses some of the same techniques.
SAS 9 has completely revamped the SAS/OR tool set for release 9.2; PROC NLP remains available, but the preferred method will be to use PROC NLPC or PROC OPTQP.

Here is PROC NLP syntax for the simple 2x2 example above where ES = 0:

    PROC NLP OUTEST = nlpout1 TECHNIQUE = CONGRA /* NOPRINT */;
        MIN objval;
        PARMS cs = 20, cu = 20, eu = 20;
        objval = (42 - cs)**2 + (36 - eu)**2 + (18 - (eu + cu))**2 + (24 - (cs + cu))**2;
        BOUNDS cs cu eu >= 0;
        LINCON 60 = cs + cu + eu;
    RUN;

PROC NLP Syntax Notes:

OUTEST: contains the optimization solution, including optimal parameter values, the objective function value, and the right-hand sides of constraints.
TECHNIQUE: several solution techniques are available, most (not all) requiring derivative information on the objective function (the user can supply this independently, as in PROC NLIN, but doesn't always need to). CONGRA is conjugate gradient, which converges relatively easily.
PARMS: initial parameter values for the search.
LINCON: linear constraints.
BOUNDS: parameters can have upper and lower bounds.

Here is an alternative syntax for the same problem:

    PROC NLP OUTEST = nlpout1 TECHNIQUE = CONGRA /* NOPRINT */;
        LSQ fc fe fs fu;
        PARMS cs = 20, cu = 20, eu = 20;
        fc = (24 - (cs + cu));
        fe = (36 - eu);
        fs = (42 - cs);
        fu = (18 - (eu + cu));
        BOUNDS cs cu eu >= 0;
        LINCON 60 = cs + cu + eu;
    RUN;

The key pieces of the PROC NLP set-up are the target values, the parameter variables, and the initial parameter variable values. The target values (42, 18, 36, 24 in the example above) are the proportional pieces of the sample for the groups of interest; in our case, there is a separate group for each individual trait variable level. There is one parameter variable for each non-empty Cartesian cell in the sample. The initial parameter values can be chosen in many ways; one way is to start with the actual sample counts in each cell.

To make this all useful, the task is to start with the initial data and go through the following steps:

1. For both the population and the sample, compute the total counts, the Cartesian cell counts, and the counts for each individual variable level.
2. Use the counts to determine the number of NLP parameters and the initial and target values, and generate the NLP step.
3. Translate the results of the NLP step into weights, and merge back with the initial data.
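The three steps above can be sketched outside SAS as a cross-check of the ES = 0 example. The Python below is an illustrative sketch, not the paper's implementation: it swaps in a plain projected-gradient iteration for PROC NLP's conjugate gradient technique, it handles only the lower bound of zero and the sum constraint, and the sample counts (20 per non-empty cell, echoing the PARMS starting values) as well as all function names are assumptions made for this example.

```python
def project_to_total(values, total):
    """Euclidean projection onto {x : x >= 0, sum(x) = total}."""
    running, shift = 0.0, 0.0
    for k, v in enumerate(sorted(values, reverse=True), start=1):
        running += v
        candidate = (running - total) / k
        if v - candidate > 0:
            shift = candidate
    return [max(v - shift, 0.0) for v in values]


def marginal_least_squares(pop_cells, sample_cells, iterations=5000):
    """Reweight non-empty sample cells so trait-level sums track their targets."""
    n = sum(sample_cells.values())
    pop_total = sum(pop_cells.values())
    # One parameter per non-empty Cartesian cell in the sample.
    cells = [c for c, count in sample_cells.items() if count > 0]
    n_traits = len(cells[0])

    # Step 1: target for each trait level = population share times sample size.
    targets = {}
    for cell, count in pop_cells.items():
        for trait, level in enumerate(cell):
            key = (trait, level)
            targets[key] = targets.get(key, 0.0) + count * n / pop_total

    # Which parameters contribute to each trait level (levels with no
    # sample cells contribute no term to the objective).
    members = {}
    for index, cell in enumerate(cells):
        for trait, level in enumerate(cell):
            members.setdefault((trait, level), []).append(index)

    # Step 2: minimize the sum of squared marginal errors, starting from the
    # actual sample counts. The step size is a safe bound of 1 / Lipschitz.
    step = 1.0 / (2.0 * n_traits * max(len(m) for m in members.values()))
    x = [float(sample_cells[c]) for c in cells]
    for _ in range(iterations):
        grad = [0.0] * len(x)
        for key, indices in members.items():
            residual = targets[key] - sum(x[i] for i in indices)
            for i in indices:
                grad[i] -= 2.0 * residual
        x = project_to_total([xi - step * gi for xi, gi in zip(x, grad)], n)
    return dict(zip(cells, x))


population = {("E", "S"): 400, ("C", "S"): 300, ("E", "U"): 200, ("C", "U"): 100}
sample = {("E", "S"): 0, ("C", "S"): 20, ("E", "U"): 20, ("C", "U"): 20}  # hypothetical counts

solution = marginal_least_squares(population, sample)
# Step 3: each respondent's weight is the cell value divided by the cell count.
weights = {cell: solution[cell] / sample[cell] for cell in solution}
for cell in solution:
    print(cell, round(solution[cell], 4), round(weights[cell], 4))
```

Under these assumptions the iteration should settle close to the solution quoted earlier, CS = 33, EU = 27, CU = 0. A projected-gradient loop was chosen only because it fits in a few lines of dependency-free code; for real problems a proper optimizer such as PROC NLP is the better tool.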
A SAMPLE MACRO FOR THE LEAST SQUARES MARGINAL MAINTENANCE REWEIGHTING

This macro has several input parameters, including:

inlibp              population input data set library
indsp               population input data set name
inlibs              sample input data set library
indss               sample input data set name
outlibs             output data set library
outdst              trait value data set name
outdss              match-weights-to-sample data set name
work                work library, or other library to save intermediate data sets
numtrait            number of traits (variables) determining cells
trait1, trait2, ... names of trait variables
letter1, letter2, ... short names of trait variables
numctr              number of character trait variables (list them first)
ncids               total sample count
wlb                 weight lower bound
wub                 weight upper bound
techneek            optimization technique for PROC NLP
wgtvar              weight variable name

    * FIND POPULATION TRAIT VALUE PERCENTAGES (TARGETS OF REWEIGHTING SCHEME);
    %LET maxnval = 0; %* largest number of individual trait values;
    %DO i = 1 %TO &numtrait.;
        PROC FREQ DATA = &inlibp..&indsp. NOPRINT;
            TABLES &&trait&i. / OUT = &work..ptr&i.;
        RUN;
        DATA _NULL_;
            SET &work..ptr&i. END = &last.;
            RETAIN mintgt &ncids.; * ncids is total sample count;
            CALL SYMPUT("tv&i._" || COMPRESS(_N_), COMPRESS(&&trait&i.)); * individual trait value;
            target = PERCENT * &ncids. / 100;
            IF target < mintgt THEN DO;
                mintgt = target;
            END;
            CALL SYMPUT("tp&i._" || COMPRESS(_N_), COMPRESS(target)); * target percentage;
            IF &last. THEN DO;
                CALL SYMPUT("nv&i.", COMPRESS(_N_)); * number of values of trait;
                CALL SYMPUT("mintgt", COMPRESS(mintgt)); * minimum target value;
            END;
        RUN;
        %IF &&nv&i. > &maxnval. %THEN %DO;
            %LET maxnval = &&nv&i.;
        %END;
    %END; %* trait i freq;
    %LET ntnv = %SYSEVALF(&numtrait. * &maxnval.); %* upper bound on number of NLP parameters;

    * FIND ALL CELLS REPRESENTED IN SAMPLE;
    PROC FREQ DATA = &inlibs..&indss. NOPRINT;
        TABLES &trait1.
            %DO i = 2 %TO &numtrait.;
                * &&trait&i.
            %END; %* trait i;
            / OUT = &work..smpcel1;
    RUN;

    * FIND WHICH VARIABLES GO WITH WHICH TRAIT VALUES (ONE VARIABLE PER UNIQUE CELL);
    DATA &outlibs..&outdst.;
        SET &work..smpcel1 END = &last.;
        ARRAY vc{1:&numtrait.,1:&maxnval.} vc1 - vc&ntnv.; * array to count number of variables which go with each trait value;
        RETAIN vc1 - vc&ntnv. 0;
        DROP i j vc1 - vc&ntnv. wlbc wubc;
        CALL SYMPUT("xi" || COMPRESS(_N_), COMPRESS(COUNT)); * use actual cell counts as initial variable values;
        wlbc = &wlb. * COUNT;
        wubc = &wub. * COUNT;
        CALL SYMPUT("wl" || COMPRESS(_N_), COMPRESS(wlbc)); * to get proper lower bound on weight, have lower bound on cell variable be weight lower bound times cell count;
        CALL SYMPUT("wu" || COMPRESS(_N_), COMPRESS(wubc)); * to get proper upper bound on weight, have upper bound on cell variable be weight upper bound times cell count;
        %DO i = 1 %TO &numtrait.;
            %LET li = &&letter&i.;
            SELECT (&&trait&i.);
                %DO j = 1 %TO &&nv&i.;
                    %* assume char variables listed first;
                    WHEN
                    %IF &i. LE &numctr. %THEN %DO;
                        ("&&&tv&i._&j.")
                    %END;
                    %ELSE %DO;
                        (&&&tv&i._&j.)
                    %END;
                    DO; %* create list of vars with level j for trait i;
                        vc{&i.,&j.} + 1;
                        CALL SYMPUT("x&li._&j._" || COMPRESS(vc{&i.,&j.}), COMPRESS(_N_));
                    END;
                %END; %* j 1 to nvi;
                OTHERWISE;
            END;
        %END; %* i 1 to numtrait;
        IF &last. THEN DO;
            CALL SYMPUT("numxvar", COMPRESS(_N_)); * number of variables;
            DO i = 1 TO &numtrait.;
                DO j = 1 TO &maxnval.;
                    CALL SYMPUT("vc" || COMPRESS(i) || "_" || COMPRESS(j), COMPRESS(vc{i,j}));
                END; * j;
            END; * i;
        END;
    RUN;

    * SET UP PROC NLP;
    PROC NLP OUTEST = &work..nlpout1 NOPRINT TECHNIQUE = &techneek.;
        MIN objval;
        PARMS x1 = &xi1.
            %DO i = 2 %TO &numxvar.;
                , x&i. = &&xi&i.
            %END; %* i 2 to numxvar;
        ;
        BOUNDS
            %LET numxvar1 = %SYSEVALF(&numxvar. - 1);
            %DO i = 1 %TO &numxvar1.;
                &&wl&i. <= x&i. <= &&wu&i.,
            %END; %* i 1 to numxvar1;
            &&wl&i. <= x&i. <= &&wu&i.;
        %LET notfirst = 0;
        LINCON &ncids. = x1
            %DO i = 2 %TO &numxvar.;
                + x&i.
            %END; %* i 2 to numxvar;
        ;
        objval =
            %DO i = 1 %TO &numtrait.;
                %LET li = &&letter&i.;
                %DO j = 1 %TO &&nv&i.;
                    %IF &&&vc&i._&j. %THEN %DO; %* term irrelevant if no sample cells exist for it;
                        %IF &notfirst. %THEN %DO; %* plus sign to add successive terms after first;
                            +
                        %END;
                        %ELSE %DO;
                            %LET notfirst = 1;
                        %END;
                        %LET wtij = &&trwt&i.;
                        &wtij. * (&&&tp&i._&j. - (x&&&x&li._&j._1.
                            %DO k = 2 %TO &&&vc&i._&j.;
                                + x&&&x&li._&j._&k.
                            %END; %* k 2 to vci_j;
                        ))**2
                    %END; %* vci_j > 0;
                %END; %* j 1 to nvi;
            %END; %* i 1 to numtrait;
        ;
    RUN;

    * EXTRACT SOLUTION FOR MATCHING WITH TRANSLATION SET;
    PROC TRANSPOSE DATA = &work..nlpout1 (DROP = _NAME_ WHERE = (_TYPE_ = "PARMS"))
                   OUT = &work..nlparms1;
        VAR x1 - x&numxvar.;
    RUN;

    * MATCH SOLUTION WITH TRANSLATION SET AND SOLVE FOR WEIGHTS;
    DATA &outlibs..&outdst.;
        MERGE &work..nlparms1 &outlibs..&outdst.; * merge had better be one to one;
        DROP COL1 wlb wlbc wub wubc;
        wlb = &wlb.;
        wlbc = &wlb. * COUNT;
        wub = &wub.;
        wubc = &wub. * COUNT;
        IF COL1 < wlbc THEN DO;
            &wgtvar. = &wlb.;
        END;
        ELSE IF COL1 > wubc THEN DO;
            &wgtvar. = &wub.;
        END;
        ELSE DO;
            &wgtvar. = COL1 / COUNT;
        END;
    RUN;

    * SORT SAMPLE TO MATCH WITH TRANSLATION SET AND APPLY WEIGHTS;
    PROC SORT DATA = &inlibs..&indss. OUT = &work..indsrt1;
        BY
            %DO i = 1 %TO &numtrait.;
                &&trait&i.
            %END; %* i 1 to numtrait;
        ;
    RUN;

    * MATCH WEIGHTS TO SAMPLE;
    DATA &outlibs..&outdss.;
        MERGE &work..indsrt1 (IN = ins)
              &outlibs..&outdst. (IN = ino KEEP = &wgtvar.
                  %DO i = 1 %TO &numtrait.;
                      &&trait&i.
                  %END; %* i 1 to numtrait;
              ) END = &last.;
        BY
            %DO i = 1 %TO &numtrait.;
                &&trait&i.
            %END; %* i 1 to numtrait;
        ;
        DROP ctm ctn cts cto;
        IF ins THEN DO;
            IF ino THEN DO;
                ctm + 1;
                OUTPUT;
            END;
            ELSE DO;
                cts + 1; * should be zero;
            END;
        END;
        ELSE IF ino THEN DO;
            cto + 1; * should be zero;
        END;
        ELSE DO;
            ctn + 1; * must be zero;
        END;
        IF &last. THEN DO;
            PUT "ctm = " ctm;
            PUT "cts = " cts;
            PUT "cto = " cto;
            PUT "ctn = " ctn;
        END;
    RUN;
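The last two stages of the macro (turning optimized cell values into weights within the lower and upper weight bounds, then matching the weights to sample records by trait values) can be illustrated with a small Python sketch. The function names, the record layout, the trait names worker_type and sas_user, and the bounds 0.1 and 5.0 are all assumptions made for this example; the cell values are the ES = 0 solution from earlier with a hypothetical count of 20 respondents per cell.

```python
def cell_weights(solution, counts, wlb, wub):
    """Weight = cell value / cell count, clamped to the range [wlb, wub]."""
    weights = {}
    for cell, value in solution.items():
        count = counts[cell]
        if value < wlb * count:
            weights[cell] = wlb
        elif value > wub * count:
            weights[cell] = wub
        else:
            weights[cell] = value / count
    return weights


def attach_weights(records, weights, traits, weight_name="weight"):
    """Match cell weights to sample records; count records with no weight."""
    matched, unmatched = [], 0
    for record in records:
        cell = tuple(record[t] for t in traits)
        if cell in weights:
            matched.append({**record, weight_name: weights[cell]})
        else:
            unmatched += 1  # like the macro's cts counter, this should stay zero
    return matched, unmatched


solution = {("C", "S"): 33.0, ("C", "U"): 0.0, ("E", "U"): 27.0}
counts = {("C", "S"): 20, ("C", "U"): 20, ("E", "U"): 20}   # hypothetical cell counts
weights = cell_weights(solution, counts, wlb=0.1, wub=5.0)  # hypothetical bounds

records = [  # hypothetical sample records
    {"id": 1, "worker_type": "C", "sas_user": "S"},
    {"id": 2, "worker_type": "C", "sas_user": "U"},
    {"id": 3, "worker_type": "E", "sas_user": "U"},
]
weighted, unmatched = attach_weights(records, weights, ["worker_type", "sas_user"])
for row in weighted:
    print(row)
print("unmatched records:", unmatched)
```

In this sketch the CU cell value of 0 falls below the lower bound times the cell count, so its respondents receive the lower bound weight rather than zero.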

CONCLUSIONS

We have seen that reweighting to handle empty cells confronts us with many possible choices; different ones may be desirable depending upon circumstances. Many of the reweighting schemes are easy to apply. We showed that under many conditions, it is possible for a more sophisticated reweighting scheme to preserve the marginal distribution. This involves minimizing a quadratic objective function, and may best be accomplished with the assistance of nonlinear optimization software, such as PROC NLP in SAS/OR.

REFERENCES

[1] SAS/STAT software documentation for PROC SURVEYMEANS and PROC SURVEYREG, sections on missing values.
[2] Department of Energy, 1995 Commercial Buildings Energy Consumption Survey.
[3] S.L. Vartivarian and R. Little, "Weighting Adjustments for Unit Nonresponse with Multiple Outcome Variables," University of Michigan Department of Biostatistics Working Paper Series, 2003.

ACKNOWLEDGMENTS

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Talbot Michael Katz
Analytic Data Information Technologies
229 East 21st Street, #2
New York NY
Phone:
Fax:
topkatz@msn.com


More information

Using SAS Macros to Extract P-values from PROC FREQ

Using SAS Macros to Extract P-values from PROC FREQ SESUG 2016 ABSTRACT Paper CC-232 Using SAS Macros to Extract P-values from PROC FREQ Rachel Straney, University of Central Florida This paper shows how to leverage the SAS Macro Facility with PROC FREQ

More information

Greenspace: A Macro to Improve a SAS Data Set Footprint

Greenspace: A Macro to Improve a SAS Data Set Footprint Paper AD-150 Greenspace: A Macro to Improve a SAS Data Set Footprint Brian Varney, Experis Business Intelligence and Analytics Practice ABSTRACT SAS programs can be very I/O intensive. SAS data sets with

More information

Using SAS/SCL to Create Flexible Programs... A Super-Sized Macro Ellen Michaliszyn, College of American Pathologists, Northfield, IL

Using SAS/SCL to Create Flexible Programs... A Super-Sized Macro Ellen Michaliszyn, College of American Pathologists, Northfield, IL Using SAS/SCL to Create Flexible Programs... A Super-Sized Macro Ellen Michaliszyn, College of American Pathologists, Northfield, IL ABSTRACT SAS is a powerful programming language. When you find yourself

More information

There s No Such Thing as Normal Clinical Trials Data, or Is There? Daphne Ewing, Octagon Research Solutions, Inc., Wayne, PA

There s No Such Thing as Normal Clinical Trials Data, or Is There? Daphne Ewing, Octagon Research Solutions, Inc., Wayne, PA Paper HW04 There s No Such Thing as Normal Clinical Trials Data, or Is There? Daphne Ewing, Octagon Research Solutions, Inc., Wayne, PA ABSTRACT Clinical Trials data comes in all shapes and sizes depending

More information

PhUse Practical Uses of the DOW Loop in Pharmaceutical Programming Richard Read Allen, Peak Statistical Services, Evergreen, CO, USA

PhUse Practical Uses of the DOW Loop in Pharmaceutical Programming Richard Read Allen, Peak Statistical Services, Evergreen, CO, USA PhUse 2009 Paper Tu01 Practical Uses of the DOW Loop in Pharmaceutical Programming Richard Read Allen, Peak Statistical Services, Evergreen, CO, USA ABSTRACT The DOW-Loop was originally developed by Don

More information

The Power of PROC SQL Techniques and SAS Dictionary Tables in Handling Data

The Power of PROC SQL Techniques and SAS Dictionary Tables in Handling Data Paper PO31 The Power of PROC SQL Techniques and SAS Dictionary Tables in Handling Data MaryAnne DePesquo Hope, Health Services Advisory Group, Phoenix, Arizona Fen Fen Li, Health Services Advisory Group,

More information

BY S NOTSORTED OPTION Karuna Samudral, Octagon Research Solutions, Inc., Wayne, PA Gregory M. Giddings, Centocor R&D Inc.

BY S NOTSORTED OPTION Karuna Samudral, Octagon Research Solutions, Inc., Wayne, PA Gregory M. Giddings, Centocor R&D Inc. ABSTRACT BY S NOTSORTED OPTION Karuna Samudral, Octagon Research Solutions, Inc., Wayne, PA Gregory M. Giddings, Centocor R&D Inc., Malvern, PA What if the usual sort and usual group processing would eliminate

More information

Cleaning Duplicate Observations on a Chessboard of Missing Values Mayrita Vitvitska, ClinOps, LLC, San Francisco, CA

Cleaning Duplicate Observations on a Chessboard of Missing Values Mayrita Vitvitska, ClinOps, LLC, San Francisco, CA Cleaning Duplicate Observations on a Chessboard of Missing Values Mayrita Vitvitska, ClinOps, LLC, San Francisco, CA ABSTRACT Removing duplicate observations from a data set is not as easy as it might

More information

An Algorithm to Compute Exact Power of an Unordered RxC Contingency Table

An Algorithm to Compute Exact Power of an Unordered RxC Contingency Table NESUG 27 An Algorithm to Compute Eact Power of an Unordered RC Contingency Table Vivek Pradhan, Cytel Inc., Cambridge, MA Stian Lydersen, Department of Cancer Research and Molecular Medicine, Norwegian

More information

(Refer Slide Time 04:53)

(Refer Slide Time 04:53) Programming and Data Structure Dr.P.P.Chakraborty Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture 26 Algorithm Design -1 Having made a preliminary study

More information

footnote1 height=8pt j=l "(Rev. &sysdate)" j=c "{\b\ Page}{\field{\*\fldinst {\b\i PAGE}}}";

footnote1 height=8pt j=l (Rev. &sysdate) j=c {\b\ Page}{\field{\*\fldinst {\b\i PAGE}}}; Producing an Automated Data Dictionary as an RTF File (or a Topic to Bring Up at a Party If You Want to Be Left Alone) Cyndi Williamson, SRI International, Menlo Park, CA ABSTRACT Data dictionaries are

More information

What s New in SAS Studio?

What s New in SAS Studio? ABSTRACT Paper SAS1832-2015 What s New in SAS Studio? Mike Porter, Amy Peters, and Michael Monaco, SAS Institute Inc., Cary, NC If you have not had a chance to explore SAS Studio yet, or if you re anxious

More information

SAS/STAT 14.2 User s Guide. The SURVEYIMPUTE Procedure

SAS/STAT 14.2 User s Guide. The SURVEYIMPUTE Procedure SAS/STAT 14.2 User s Guide The SURVEYIMPUTE Procedure This document is an individual chapter from SAS/STAT 14.2 User s Guide. The correct bibliographic citation for this manual is as follows: SAS Institute

More information

round decimals to the nearest decimal place and order negative numbers in context

round decimals to the nearest decimal place and order negative numbers in context 6 Numbers and the number system understand and use proportionality use the equivalence of fractions, decimals and percentages to compare proportions use understanding of place value to multiply and divide

More information

The new SAS 9.2 FCMP Procedure, what functions are in your future? John H. Adams, Boehringer Ingelheim Pharmaceutical, Inc.

The new SAS 9.2 FCMP Procedure, what functions are in your future? John H. Adams, Boehringer Ingelheim Pharmaceutical, Inc. PharmaSUG2010 - Paper AD02 The new SAS 9.2 FCMP Procedure, what functions are in your future? John H. Adams, Boehringer Ingelheim Pharmaceutical, Inc., Ridgefield, CT ABSTRACT Our company recently decided

More information

Statistics Case Study 2000 M. J. Clancy and M. C. Linn

Statistics Case Study 2000 M. J. Clancy and M. C. Linn Statistics Case Study 2000 M. J. Clancy and M. C. Linn Problem Write and test functions to compute the following statistics for a nonempty list of numeric values: The mean, or average value, is computed

More information

Simulation of Imputation Effects Under Different Assumptions. Danny Rithy

Simulation of Imputation Effects Under Different Assumptions. Danny Rithy Simulation of Imputation Effects Under Different Assumptions Danny Rithy ABSTRACT Missing data is something that we cannot always prevent. Data can be missing due to subjects' refusing to answer a sensitive

More information

number Understand the equivalence between recurring decimals and fractions

number Understand the equivalence between recurring decimals and fractions number Understand the equivalence between recurring decimals and fractions Using and Applying Algebra Calculating Shape, Space and Measure Handling Data Use fractions or percentages to solve problems involving

More information

Checking for Duplicates Wendi L. Wright

Checking for Duplicates Wendi L. Wright Checking for Duplicates Wendi L. Wright ABSTRACT This introductory level paper demonstrates a quick way to find duplicates in a dataset (with both simple and complex keys). It discusses what to do when

More information

The Piecewise Regression Model as a Response Modeling Tool

The Piecewise Regression Model as a Response Modeling Tool NESUG 7 The Piecewise Regression Model as a Response Modeling Tool Eugene Brusilovskiy University of Pennsylvania Philadelphia, PA Abstract The general problem in response modeling is to identify a response

More information

Paper PS05_05 Using SAS to Process Repeated Measures Data Terry Fain, RAND Corporation Cyndie Gareleck, RAND Corporation

Paper PS05_05 Using SAS to Process Repeated Measures Data Terry Fain, RAND Corporation Cyndie Gareleck, RAND Corporation Paper PS05_05 Using SAS to Process Repeated Measures Data Terry Fain, RAND Corporation Cyndie Gareleck, RAND Corporation ABSTRACT Data that contain multiple observations per case are called repeated measures

More information

Interleaving a Dataset with Itself: How and Why

Interleaving a Dataset with Itself: How and Why cc002 Interleaving a Dataset with Itself: How and Why Howard Schreier, U.S. Dept. of Commerce, Washington DC ABSTRACT When two or more SAS datasets are combined by means of a SET statement and an accompanying

More information

Feature Extractors. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. The Perceptron Update Rule.

Feature Extractors. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. The Perceptron Update Rule. CS 188: Artificial Intelligence Fall 2007 Lecture 26: Kernels 11/29/2007 Dan Klein UC Berkeley Feature Extractors A feature extractor maps inputs to feature vectors Dear Sir. First, I must solicit your

More information

Macros for Two-Sample Hypothesis Tests Jinson J. Erinjeri, D.K. Shifflet and Associates Ltd., McLean, VA

Macros for Two-Sample Hypothesis Tests Jinson J. Erinjeri, D.K. Shifflet and Associates Ltd., McLean, VA Paper CC-20 Macros for Two-Sample Hypothesis Tests Jinson J. Erinjeri, D.K. Shifflet and Associates Ltd., McLean, VA ABSTRACT Statistical Hypothesis Testing is performed to determine whether enough statistical

More information

How to Go From SAS Data Sets to DATA NULL or WordPerfect Tables Anne Horney, Cooperative Studies Program Coordinating Center, Perry Point, Maryland

How to Go From SAS Data Sets to DATA NULL or WordPerfect Tables Anne Horney, Cooperative Studies Program Coordinating Center, Perry Point, Maryland How to Go From SAS Data Sets to DATA NULL or WordPerfect Tables Anne Horney, Cooperative Studies Program Coordinating Center, Perry Point, Maryland ABSTRACT Clinical trials data reports often contain many

More information

%MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System

%MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System %MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System Rushi Patel, Creative Information Technology, Inc., Arlington, VA ABSTRACT It is common to find

More information

Data Quality Control for Big Data: Preventing Information Loss With High Performance Binning

Data Quality Control for Big Data: Preventing Information Loss With High Performance Binning Data Quality Control for Big Data: Preventing Information Loss With High Performance Binning ABSTRACT Deanna Naomi Schreiber-Gregory, Henry M Jackson Foundation, Bethesda, MD It is a well-known fact that

More information

... ) city (city, cntyid, area, pop,.. )

... ) city (city, cntyid, area, pop,.. ) PaperP829 PROC SQl - Is it a Required Tool for Good SAS Programming? Ian Whitlock, Westat Abstract No one SAS tool can be the answer to all problems. However, it should be hard to consider a SAS programmer

More information

Producing Summary Tables in SAS Enterprise Guide

Producing Summary Tables in SAS Enterprise Guide Producing Summary Tables in SAS Enterprise Guide Lora D. Delwiche, University of California, Davis, CA Susan J. Slaughter, Avocet Solutions, Davis, CA ABSTRACT This paper shows, step-by-step, how to use

More information

The SAS/OR s OPTMODEL Procedure :

The SAS/OR s OPTMODEL Procedure : The SAS/OR s OPTMODEL Procedure : A Powerful Modeling Environment for Building, Solving, and Maintaining Mathematical Optimization Models Maurice Djona OASUS - Wednesday, November 19 th, 2008 Agenda Context:

More information

The REPORT Procedure: A Primer for the Compute Block

The REPORT Procedure: A Primer for the Compute Block Paper TT15-SAS The REPORT Procedure: A Primer for the Compute Block Jane Eslinger, SAS Institute Inc. ABSTRACT It is well-known in the world of SAS programming that the REPORT procedure is one of the best

More information

Large Margin Classification Using the Perceptron Algorithm

Large Margin Classification Using the Perceptron Algorithm Large Margin Classification Using the Perceptron Algorithm Yoav Freund Robert E. Schapire Presented by Amit Bose March 23, 2006 Goals of the Paper Enhance Rosenblatt s Perceptron algorithm so that it can

More information

Paper DB2 table. For a simple read of a table, SQL and DATA step operate with similar efficiency.

Paper DB2 table. For a simple read of a table, SQL and DATA step operate with similar efficiency. Paper 76-28 Comparative Efficiency of SQL and Base Code When Reading from Database Tables and Existing Data Sets Steven Feder, Federal Reserve Board, Washington, D.C. ABSTRACT In this paper we compare

More information

Let's Play a Game: A SAS Program for Creating a Word Search Matrix Robert S. Matthews, University of Alabama at Birmingham, Birmingham, AL

Let's Play a Game: A SAS Program for Creating a Word Search Matrix Robert S. Matthews, University of Alabama at Birmingham, Birmingham, AL SESUG 2012 Paper CT-18 Let's Play a Game: A SAS Program for Creating a Word Search Matrix Robert S. Matthews, University of Alabama at Birmingham, Birmingham, AL ABSTRACT This paper describes a process

More information

Math Lab- Geometry Pacing Guide Quarter 3. Unit 1: Rational and Irrational Numbers, Exponents and Roots

Math Lab- Geometry Pacing Guide Quarter 3. Unit 1: Rational and Irrational Numbers, Exponents and Roots 1 Jan. 3-6 (4 days) 2 Jan. 9-13 Unit 1: Rational and Irrational Numbers, Exponents and Roots ISTEP+ ISTEP Framework Focus: Unit 1 Number Sense, Expressions, and Computation 8.NS.1: Give examples of rational

More information

Which of the following toolbar buttons would you use to find the sum of a group of selected cells?

Which of the following toolbar buttons would you use to find the sum of a group of selected cells? Which of the following toolbar buttons would you use to find the sum of a group of selected cells? Selecting a group of cells and clicking on Set Print Area as shown in the figure below has what effect?

More information

Tweaking your tables: Suppressing superfluous subtotals in PROC TABULATE

Tweaking your tables: Suppressing superfluous subtotals in PROC TABULATE ABSTRACT Tweaking your tables: Suppressing superfluous subtotals in PROC TABULATE Steve Cavill, NSW Bureau of Crime Statistics and Research, Sydney, Australia PROC TABULATE is a great tool for generating

More information

CHAPTER 4: MICROSOFT OFFICE: EXCEL 2010

CHAPTER 4: MICROSOFT OFFICE: EXCEL 2010 CHAPTER 4: MICROSOFT OFFICE: EXCEL 2010 Quick Summary A workbook an Excel document that stores data contains one or more pages called a worksheet. A worksheet or spreadsheet is stored in a workbook, and

More information

Unlock SAS Code Automation with the Power of Macros

Unlock SAS Code Automation with the Power of Macros SESUG 2015 ABSTRACT Paper AD-87 Unlock SAS Code Automation with the Power of Macros William Gui Zupko II, Federal Law Enforcement Training Centers SAS code, like any computer programming code, seems to

More information

PharmaSUG Paper AD06

PharmaSUG Paper AD06 PharmaSUG 2012 - Paper AD06 A SAS Tool to Allocate and Randomize Samples to Illumina Microarray Chips Huanying Qin, Baylor Institute of Immunology Research, Dallas, TX Greg Stanek, STEEEP Analytics, Baylor

More information

Lab #9: ANOVA and TUKEY tests

Lab #9: ANOVA and TUKEY tests Lab #9: ANOVA and TUKEY tests Objectives: 1. Column manipulation in SAS 2. Analysis of variance 3. Tukey test 4. Least Significant Difference test 5. Analysis of variance with PROC GLM 6. Levene test for

More information

Monte Carlo Integration

Monte Carlo Integration Lab 18 Monte Carlo Integration Lab Objective: Implement Monte Carlo integration to estimate integrals. Use Monte Carlo Integration to calculate the integral of the joint normal distribution. Some multivariable

More information

Helping You C What You Can Do with SAS

Helping You C What You Can Do with SAS ABSTRACT Paper SAS1747-2015 Helping You C What You Can Do with SAS Andrew Henrick, Donald Erdman, and Karen Croft, SAS Institute Inc., Cary, NC SAS users are already familiar with the FCMP procedure and

More information

2015 Vanderbilt University

2015 Vanderbilt University Excel Supplement 2015 Vanderbilt University Introduction This guide describes how to perform some basic data manipulation tasks in Microsoft Excel. Excel is spreadsheet software that is used to store information

More information

SAS/STAT 13.1 User s Guide. The Power and Sample Size Application

SAS/STAT 13.1 User s Guide. The Power and Sample Size Application SAS/STAT 13.1 User s Guide The Power and Sample Size Application This document is an individual chapter from SAS/STAT 13.1 User s Guide. The correct bibliographic citation for the complete manual is as

More information

Comparison of different ways using table lookups on huge tables

Comparison of different ways using table lookups on huge tables PhUSE 007 Paper CS0 Comparison of different ways using table lookups on huge tables Ralf Minkenberg, Boehringer Ingelheim Pharma GmbH & Co. KG, Ingelheim, Germany ABSTRACT In many application areas the

More information

ABSTRACT INTRODUCTION TRICK 1: CHOOSE THE BEST METHOD TO CREATE MACRO VARIABLES

ABSTRACT INTRODUCTION TRICK 1: CHOOSE THE BEST METHOD TO CREATE MACRO VARIABLES An Efficient Method to Create a Large and Comprehensive Codebook Wen Song, ICF International, Calverton, MD Kamya Khanna, ICF International, Calverton, MD Baibai Chen, ICF International, Calverton, MD

More information

Format-o-matic: Using Formats To Merge Data From Multiple Sources

Format-o-matic: Using Formats To Merge Data From Multiple Sources SESUG Paper 134-2017 Format-o-matic: Using Formats To Merge Data From Multiple Sources Marcus Maher, Ipsos Public Affairs; Joe Matise, NORC at the University of Chicago ABSTRACT User-defined formats are

More information

SAS IT Resource Management Forecasting. Setup Specification Document. A SAS White Paper

SAS IT Resource Management Forecasting. Setup Specification Document. A SAS White Paper SAS IT Resource Management Forecasting Setup Specification Document A SAS White Paper Table of Contents Introduction to SAS IT Resource Management Forecasting... 1 Getting Started with the SAS Enterprise

More information

Weighting and estimation for the EU-SILC rotational design

Weighting and estimation for the EU-SILC rotational design Weighting and estimation for the EUSILC rotational design JeanMarc Museux 1 (Provisional version) 1. THE EUSILC INSTRUMENT 1.1. Introduction In order to meet both the crosssectional and longitudinal requirements,

More information

2 = Disagree 3 = Neutral 4 = Agree 5 = Strongly Agree. Disagree

2 = Disagree 3 = Neutral 4 = Agree 5 = Strongly Agree. Disagree PharmaSUG 2012 - Paper HO01 Multiple Techniques for Scoring Quality of Life Questionnaires Brandon Welch, Rho, Inc., Chapel Hill, NC Seungshin Rhee, Rho, Inc., Chapel Hill, NC ABSTRACT In the clinical

More information

BI-09 Using Enterprise Guide Effectively Tom Miron, Systems Seminar Consultants, Madison, WI

BI-09 Using Enterprise Guide Effectively Tom Miron, Systems Seminar Consultants, Madison, WI Paper BI09-2012 BI-09 Using Enterprise Guide Effectively Tom Miron, Systems Seminar Consultants, Madison, WI ABSTRACT Enterprise Guide is not just a fancy program editor! EG offers a whole new window onto

More information

Journey to the center of the earth Deep understanding of SAS language processing mechanism Di Chen, SAS Beijing R&D, Beijing, China

Journey to the center of the earth Deep understanding of SAS language processing mechanism Di Chen, SAS Beijing R&D, Beijing, China Journey to the center of the earth Deep understanding of SAS language processing Di Chen, SAS Beijing R&D, Beijing, China ABSTRACT SAS is a highly flexible and extensible programming language, and a rich

More information

Non-trivial extraction of implicit, previously unknown and potentially useful information from data

Non-trivial extraction of implicit, previously unknown and potentially useful information from data CS 795/895 Applied Visual Analytics Spring 2013 Data Mining Dr. Michele C. Weigle http://www.cs.odu.edu/~mweigle/cs795-s13/ What is Data Mining? Many Definitions Non-trivial extraction of implicit, previously

More information

SAS/STAT 13.1 User s Guide. The SURVEYFREQ Procedure

SAS/STAT 13.1 User s Guide. The SURVEYFREQ Procedure SAS/STAT 13.1 User s Guide The SURVEYFREQ Procedure This document is an individual chapter from SAS/STAT 13.1 User s Guide. The correct bibliographic citation for the complete manual is as follows: SAS

More information