A SAS Solution to Create a Weekly Format Susan Bakken, Aimia, Plymouth, MN

Similar documents
The Dataset Diet How to transform short and fat into long and thin

PROC MEANS for Disaggregating Statistics in SAS : One Input Data Set and One Output Data Set with Everything You Need

An Efficient Method to Create Titles for Multiple Clinical Reports Using Proc Format within A Do Loop Youying Yu, PharmaNet/i3, West Chester, Ohio

A Format to Make the _TYPE_ Field of PROC MEANS Easier to Interpret Matt Pettis, Thomson West, Eagan, MN

Create a Format from a SAS Data Set Ruth Marisol Rivera, i3 Statprobe, Mexico City, Mexico

Gary L. Katsanis, Blue Cross and Blue Shield of the Rochester Area, Rochester, NY

A Macro that can Search and Replace String in your SAS Programs

Sorting big datasets. Do we really need it? Daniil Shliakhov, Experis Clinical, Kharkiv, Ukraine

Paper CT-16 Manage Hierarchical or Associated Data with the RETAIN Statement Alan R. Mann, Independent Consultant, Harpers Ferry, WV

Journey to the center of the earth Deep understanding of SAS language processing mechanism Di Chen, SAS Beijing R&D, Beijing, China

Application Interface for executing a batch of SAS Programs and Checking Logs Sneha Sarmukadam, inventiv Health Clinical, Pune, India

Paper PS05_05 Using SAS to Process Repeated Measures Data Terry Fain, RAND Corporation Cyndie Gareleck, RAND Corporation

footnote1 height=8pt j=l "(Rev. &sysdate)" j=c "{\b\ Page}{\field{\*\fldinst {\b\i PAGE}}}";

Patricia Guldin, Merck & Co., Inc., Kenilworth, NJ USA

ABSTRACT INTRODUCTION WORK FLOW AND PROGRAM SETUP

PharmaSUG Paper PO10

Paper DB2 table. For a simple read of a table, SQL and DATA step operate with similar efficiency.

Table of Contents. The RETAIN Statement. The LAG and DIF Functions. FIRST. and LAST. Temporary Variables. List of Programs.

PharmaSUG Paper TT11

Tweaking your tables: Suppressing superfluous subtotals in PROC TABULATE

Let SAS Help You Easily Find and Access Your Folders and Files

PharmaSUG 2013 CC26 Automating the Labeling of X- Axis Sanjiv Ramalingam, Vertex Pharmaceuticals, Inc., Cambridge, MA

Applications Development. Paper 38-28

A Tool to Compare Different Data Transfers Jun Wang, FMD K&L, Inc., Nanjing, China

EXERCISE ACROSS ABBOTT How to Create an Event

Unlock SAS Code Automation with the Power of Macros

Quality Control of Clinical Data Listings with Proc Compare

PhUse Practical Uses of the DOW Loop in Pharmaceutical Programming Richard Read Allen, Peak Statistical Services, Evergreen, CO, USA

Using SAS/SCL to Create Flexible Programs... A Super-Sized Macro Ellen Michaliszyn, College of American Pathologists, Northfield, IL

CMISS the SAS Function You May Have Been MISSING Mira Shapiro, Analytic Designers LLC, Bethesda, MD

SAS coding for those who like to be control

One Project, Two Teams: The Unblind Leading the Blind

Producing Summary Tables in SAS Enterprise Guide

Checking for Duplicates Wendi L. Wright

Using Templates Created by the SAS/STAT Procedures

Making the most of SAS Jobs in LSAF

ABSTRACT INTRODUCTION TRICK 1: CHOOSE THE BEST METHOD TO CREATE MACRO VARIABLES

SESUG 2014 IT-82 SAS-Enterprise Guide for Institutional Research and Other Data Scientists Claudia W. McCann, East Carolina University.

Not Just Merge - Complex Derivation Made Easy by Hash Object

Moodle Tutorial. Tak Auyeung. April 5, 2006

Practical Uses of the DOW Loop Richard Read Allen, Peak Statistical Services, Evergreen, CO

Choosing the Right Technique to Merge Large Data Sets Efficiently Qingfeng Liang, Community Care Behavioral Health Organization, Pittsburgh, PA

Out of Control! A SAS Macro to Recalculate QC Statistics

Give me EVERYTHING! A macro to combine the CONTENTS procedure output and formats. Lynn Mullins, PPD, Cincinnati, Ohio

Useful Tips When Deploying SAS Code in a Production Environment

Let SAS Write and Execute Your Data-Driven SAS Code

Square Peg, Square Hole Getting Tables to Fit on Slides in the ODS Destination for PowerPoint

A Quick and Gentle Introduction to PROC SQL

Edit the Editor: Creating Keyboard Macros in SAS Enterprise Guide

An Application of PROC NLP to Survey Sample Weighting

BI-09 Using Enterprise Guide Effectively Tom Miron, Systems Seminar Consultants, Madison, WI

To conceptualize the process, the table below shows the highly correlated covariates in descending order of their R statistic.

Remodeling Your Office A New Look for the SAS Add-In for Microsoft Office

Run your reports through that last loop to standardize the presentation attributes

Reading in Data Directly from Microsoft Word Questionnaire Forms

ABSTRACT INTRODUCTION MACRO. Paper RF

Virtual Accessing of a SAS Data Set Using OPEN, FETCH, and CLOSE Functions with %SYSFUNC and %DO Loops

Ranking Between the Lines

A SAS Macro Utility to Modify and Validate RTF Outputs for Regional Analyses Jagan Mohan Achi, PPD, Austin, TX Joshua N. Winters, PPD, Rochester, NY

A Breeze through SAS options to Enter a Zero-filled row Kajal Tahiliani, ICON Clinical Research, Warrington, PA

The Ins and Outs of %IF

One SAS To Rule Them All

%MAKE_IT_COUNT: An Example Macro for Dynamic Table Programming Britney Gilbert, Juniper Tree Consulting, Porter, Oklahoma

A Side of Hash for You To Dig Into

OLAP Drill-through Table Considerations

The Essential Meaning of PROC MEANS: A Beginner's Guide to Summarizing Data Using SAS Software

A Practical and Efficient Approach in Generating AE (Adverse Events) Tables within a Clinical Study Environment

Multiple Facts about Multilabel Formats

DSCI 325: Handout 9 Sorting and Options for Printing Data in SAS Spring 2017

Taming a Spreadsheet Importation Monster

Chapter 6: Modifying and Combining Data Sets

Benchmark Macro %COMPARE Sreekanth Reddy Middela, MaxisIT Inc., Edison, NJ Venkata Sekhar Bhamidipati, Merck & Co., Inc.

%MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System

Reproducibly Random Values William Garner, Gilead Sciences, Inc., Foster City, CA Ting Bai, Gilead Sciences, Inc., Foster City, CA

The GEOCODE Procedure and SAS Visual Analytics

Format-o-matic: Using Formats To Merge Data From Multiple Sources

The Proc Transpose Cookbook

SAS 101. Based on Learning SAS by Example: A Programmer s Guide Chapter 21, 22, & 23. By Tasha Chapman, Oregon Health Authority

Data Edit-checks Integration using ODS Tagset Niraj J. Pandya, Element Technologies Inc., NJ Vinodh Paida, Impressive Systems Inc.

Extending the Scope of Custom Transformations

SAS Studio: A New Way to Program in SAS

Best Practice for Creation and Maintenance of a SAS Infrastructure

SAS Visual Analytics Environment Stood Up? Check! Data Automatically Loaded and Refreshed? Not Quite

It s Proc Tabulate Jim, but not as we know it!

ABSTRACT INTRODUCTION PROBLEM: TOO MUCH INFORMATION? math nrt scr. ID School Grade Gender Ethnicity read nrt scr

PharmaSUG Paper IB11

Greenspace: A Macro to Improve a SAS Data Set Footprint

While You Were Sleeping, SAS Was Hard At Work Andrea Wainwright-Zimmerman, Capital One Financial, Inc., Richmond, VA

PharmaSUG China. Systematically Reordering Axis Major Tick Values in SAS Graph Brian Shen, PPDI, ShangHai

SAS Macro Technique for Embedding and Using Metadata in Web Pages. DataCeutics, Inc., Pottstown, PA

SAS Macros of Performing Look-Ahead and Look-Back Reads

Dynamic Projects in SAS Enterprise Guide How to Create and Use Parameters

Poster Frequencies of a Multiple Mention Question

Paper HOW-06. Tricia Aanderud, And Data Inc, Raleigh, NC

4010/5010 Transition: FAQs from Ask the Experts Webinars (Updated 2/15/12)

How Managers and Executives Can Leverage SAS Enterprise Guide

Interactive Programming Using Task in SAS Studio

Customized Flowcharts Using SAS Annotation Abhinav Srivastva, PaxVax Inc., Redwood City, CA

A SAS Macro for Generating Informative Cumulative/Point-wise Bar Charts

Applications Development

Transcription:

Paper S126-2012 A SAS Solution to Create a Weekly Format Susan Bakken, Aimia, Plymouth, MN ABSTRACT As programmers, we are frequently asked to report by periods that do not necessarily correspond to weeks on the calendar for example, a client may want to see sales by week since the day a product was released. This paper presents simple, easy-to-use, self-contained SAS code that utilizes date functions, macro variables, a DO/UNTIL loop, and the FORMAT procedure to create formatted values for any weekly periods needed, based on dates specified by the user. It also presents a second example of code that can be used with different start dates within the same data set. INTRODUCTION No matter the industry, SAS programmers generally do a lot of programming with dates. SAS has an extremely robust set of date formats and functions that can be used in many cases, but at times we need to be a bit clever to set up groupings that don t fit a standard week (or other period of time). The two SAS programs presented in this paper can be used to bucket your dates as needed, in order to quantify into groupings based on a specific starting date or dates. CODE FOR A SINGLE STARTING POINT This code can be used if there is a single starting point for all observations in your data set. For example, the number of visits to a website after it went live. The only requirement to run the code is to let SAS know the start and end dates of the period of interest by populating them into macro variables: %LET start='01may2012'd; %LET end = date(); Now that the start and end dates of the period of interest have been set, we need to create a data set that will be used by PROC FORMAT. The following DATA step figures out the start and end dates of each 7-day period between the start and end dates set above. Note that the variables FMTNAME, START, END, and LABEL will all be used by PROC FORMAT to set the parameters of your format. DATA SET_WEEKS; RETAIN fmtname 'weeks'; start=&start; end=start+6; label="week: " PUT(start,mmddyy10.) " - " PUT(end,mmddyy10.); DO UNTIL (end > &end); start=start+7; end=end+7; label="week: " PUT(start,mmddyy10.) " - " PUT(end,mmddyy10.); KEEP fmtname start end label; FORMAT start end mmddyy10.; 1

Figure 1. Screenshot of data set set_weeks We can now use PROC FORMAT to create our date format: PROC FORMAT CNTLIN=SET_WEEKS; We now have a format that we can use to group our dates by the periods we need. The following code shows an example of how the format can be used: DATA test; DO date=&start TO &end; KEEP date; PROC FREQ data=test; TABLES date; FORMAT date weeks.; Figure 2. Screenshot of output CODE FOR MULTIPLE STARTING POINTS What if we have different starting dates for different key variables? In the following code, we want to identify sales by week once after a member enrolls in a loyalty program. Since each member could have a different enrollment date, we need to be able to calculate the sales by week for each individual based on when he/she enrolled. 2

The first step is to identify the period of interest, give it a name, and indicate how many periods we want to track: %LET DaysInPeriod = 7; %LET PeriodName = Week; %LET TotalPeriods = 26; The next step now uses those inputs to create a data set that will be used by PROC FORMAT. This time, instead of using a date range for the format parameters, we will be setting a range for each period based on the number of days from the starting period: DATA DriveFormat; LENGTH label $50; RETAIN fmtname 'Period'; i = 1; start = 0; end = &DaysInPeriod - 1; label = compbl("&periodname " put(i,best.)); DO i = 2 to &TotalPeriods; start = end + 1; end = start + &DaysInPeriod - 1; label = compbl("&periodname " put(i,best.)); Figure 4. Screenshot of data set DriveFormat We can use the FREQUENCY procedure to make sure the code above set the periods of interest correctly: PROC FREQ DATA=DriveFormat; TABLES start * end * label / LIST MISSING; Figure 5. Screenshot of output 3

We can now use PROC FORMAT to create our date format: PROC FORMAT CNTLIN=DriveFormat; Here is an example of using the format. The data set SampleData contains 3 fields: ID, Enrollment Date, and Sales Date. The first thing we need to do is to calculate the number of days between each member s enrollment date and each sale. We will delete any records for sales that took place prior to enrollment, as we are only interested in behavior post-enrolment DATA CalculateDays; SET SampleData; days = SaleDate - EnrolLDate; if days < 0 then delete; if days > (&DaysInPeriod * &TotalPeriods) - 1 then delete; period = PUT(days, period.); Figure 6. Screenshot of data set CalculateDays Now we use PROC FREQ to get the number of sales for each member in each period: PROC FREQ DATA=CalculateDays noprint; TABLES id * period / LIST MISSING OUT=freqs; Figure 7. Screenshot of data set FREQS 4

In this example, we want each weekly period available as a variable, so we will use the TRANSPOSE procedure to create our weekly variables. PROC TRANSPOSE DATA=freqs OUT=Transposed(DROP=_name label_); BY id; VAR count; ID period; Figure 8. Screenshot of data set Transposed As shown in the example above, the weekly periods do not necessarily get created in the order we might want to have them stored in our data set. The following section of code shows how to re-order your variables and to set any missing weekly variables to zero: DATA Ordered; LENGTH id &PeriodName._1 - &PeriodName._&TotalPeriods 8; SET transposed; ARRAY xvar{*} &PeriodName._1 - &PeriodName._&TotalPeriods; DO i = 1 TO dim(xvar); IF xvar{i} =. THEN xvar{i} = 0; DROP i; Figure 9. Screenshot of data set Ordered Now we can do anything we want to with our new variables: PROC MEANS DATA=ordered SUM; VAR &PeriodName._1 - &PeriodName._&TotalPeriods; 5

Figure 10. Screenshot of output CAVEATS The first code will group dates into formatted values that are each 7 days long, with the exception of the last formatted value. If the time period specified is not evenly divisible by 7, the final formatted value will contain all fewer than 7 days. This is important to remember to either clarify results or adjust time periods as necessary. When using the second bit of code, be aware of any members in your dataset who may have a master date that is not far enough in the past to allow for each period s activity. For example, if you are looking for 8 weeks of sales post-enrollment, and one of your members enrolled 4 weeks ago, that person will show no sales for weeks 5-8, when in actuality weeks 4-8 have simply not happened yet for that individual. You may wish to account for this when setting up your dataset or add logic to this code to remove everyone that does not yet have the full period available. CONCLUSION In conclusion, some basic SAS statements, functions, formats, and procedures can allow you to bucket information into the date groupings you need. These pieces of code can be easily modified to adjust to many different unique situations. REFERENCE FIRST CODE IN ITS ENTIRITY FOR EASY COPY/PASTE Here is the entire first program for anyone who wants do copy, paste, and use! ** start date will be used as day 1 of week 1 - all weeks will be 7-day weeks **; ** that start on the same day of the week as this date **; %LET start='01may2012'd; %LET end = date(); ** create a dataset that will drive proc format - note, this will stop once it **; ** reaches a date greater than the end date specified **; DATA SET_WEEKS; RETAIN fmtname 'weeks'; start=&start; 6

end=start+6; label="week: " PUT(start,mmddyy10.) " - " PUT(end,mmddyy10.); DO UNTIL (end > &end); start=start+7; end=end+7; label="week: " PUT(start,mmddyy10.) " - " PUT(end,mmddyy10.); KEEP fmtname start end label; FORMAT start end mmddyy10.; ** use proc format to create the format to the work folder - the format's name **; ** is weeks. **; PROC FORMAT CNTLIN=SET_WEEKS; ** this section just a sample showing how the format can now be used **; DATA test; DO date=&start TO &end; KEEP date; PROC FREQ data=test; TABLES date; FORMAT date weeks.; ** end of program **; REFERENCE SECOND CODE IN ITS ENTIRITY FOR EASY COPY/PASTE Here is the entire first program for anyone who wants do copy, paste, modify, and use! ** set macro variables **; %LET DaysInPeriod = 7; %LET PeriodName = Week; %LET TotalPeriods = 26; ** this step creates the dataset that will drive PROC format **; DATA DriveFormat; LENGTH label $50; RETAIN fmtname 'Period'; i = 1; start = 0; end = &DaysInPeriod - 1; label = compbl("&periodname " put(i,best.)); 7

DO i = 2 to &TotalPeriods; start = end + 1; end = start + &DaysInPeriod - 1; label = compbl("&periodname " put(i,best.)); ** frequency to check our periods and make sure they calculated properly **; PROC FREQ DATA=DriveFormat; TABLES start * end * label / LIST MISSING; ** create the format **; PROC FORMAT CNTLIN=DriveFormat; ** the dataset SAMPLEDATA contains 3 fields - ID, enrollment date, sale date **; ** - calculate the days between enrollment date and each sale date, and **; ** remove any sales that took place before enrollment **; DATA CalculateDays; SET SampleData; days = SaleDate - EnrolLDate; if days < 0 then delete; if days > (&DaysInPeriod * &TotalPeriods) - 1 then delete; period = PUT(days, period.); ** count the number of sales for each member, grouping by our formatted value **; PROC FREQ DATA=CalculateDays noprint; TABLES id * period / LIST MISSING OUT=freqs; ** now we can use PROC transpose to have a separate variable for each period **; PROC TRANSPOSE DATA=freqs OUT=Transposed(DROP=_name label_); BY id; VAR count; ID period; ** this step shows how to use our macro variables to order the new variables **; DATA Ordered; LENGTH id &PeriodName._1 - &PeriodName._&TotalPeriods 8; SET transposed; ARRAY xvar{*} &PeriodName._1 - &PeriodName._&TotalPeriods; DO i = 1 TO dim(xvar); IF xvar{i} =. THEN xvar{i} = 0; DROP i; 8

** now we can use all of our weekly variables however we want **; PROC MEANS DATA=ordered SUM; VAR &PeriodName._1 - &PeriodName._&TotalPeriods; ********************************************************************************; ** end of program **; ********************************************************************************; CONTACT INFORMATION Your comments and questions are valued and encouraged. Contact the author at: Name: Susan Bakken Enterprise: Aimia Address: 1405 Xenium Lane City, State ZIP: Plymouth, MN 55441 Work Phone: 763.445.3619 E-mail: susan.bakken@aimia.com SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. indicates USA registration. Other brand and product names are trademarks of their respective companies. 9