A SAS Macro for Producing Benchmarks for Interpreting School Effect Sizes

Similar documents
Using SAS to Analyze CYP-C Data: Introduction to Procedures. Overview

Base and Advance SAS

A SAS Macro Utility to Modify and Validate RTF Outputs for Regional Analyses Jagan Mohan Achi, PPD, Austin, TX Joshua N. Winters, PPD, Rochester, NY

Biostatistics & SAS programming. Kevin Zhang

SAS CURRICULUM. BASE SAS Introduction

SAS Online Training: Course contents: Agenda:

Validation Summary using SYSINFO

Stat 302 Statistical Software and Its Applications SAS: Data I/O

%MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System

Reading data in SAS and Descriptive Statistics

AURA ACADEMY SAS TRAINING. Opposite Hanuman Temple, Srinivasa Nagar East, Ameerpet,Hyderabad Page 1

Final Stat 302, March 17, 2014

work.test temp.test sasuser.test test

3. Almost always use system options options compress =yes nocenter; /* mostly use */ options ps=9999 ls=200;

Creating Macro Calls using Proc Freq

Contents of SAS Programming Techniques

Lab #1: Introduction to Basic SAS Operations

EXAMPLE 3: MATCHING DATA FROM RESPONDENTS AT 2 OR MORE WAVES (LONG FORMAT)

Data Quality Review for Missing Values and Outliers

SD10 A SAS MACRO FOR PERFORMING BACKWARD SELECTION IN PROC SURVEYREG

Introduction to DATA Step Programming: SAS Basics II. Susan J. Slaughter, Avocet Solutions

EXAMPLE 3: MATCHING DATA FROM RESPONDENTS AT 2 OR MORE WAVES (LONG FORMAT)

A Cross-national Comparison Using Stacked Data

Correcting for natural time lag bias in non-participants in pre-post intervention evaluation studies

Using SAS Macro to Include Statistics Output in Clinical Trial Summary Table

An Introduction to SAS University Edition

Introduction to the SAS Macro Facility

A SAS macro (ordalpha) to compute ordinal Coefficient Alpha Karen L. Spritzer and Ron D. Hays (December 14, 2015)

INTRODUCTION TO SAS HOW SAS WORKS READING RAW DATA INTO SAS

Seminar Series: CTSI Presents

ABSTRACT INTRODUCTION MACRO. Paper RF

How to use UNIX commands in SAS code to read SAS logs

Paper DB2 table. For a simple read of a table, SQL and DATA step operate with similar efficiency.

So Much Data, So Little Time: Splitting Datasets For More Efficient Run Times and Meeting FDA Submission Guidelines

Checking for Duplicates Wendi L. Wright

To conceptualize the process, the table below shows the highly correlated covariates in descending order of their R statistic.

Program Validation: Logging the Log

DSCI 325: Handout 15 Introduction to SAS Macro Programming Spring 2017

Stat 302 Statistical Software and Its Applications SAS: Data I/O & Descriptive Statistics

Introduction to DATA Step Programming SAS Basics II. Susan J. Slaughter, Avocet Solutions

Top Coding Tips. Neil Merchant Technical Specialist - SAS

EXST SAS Lab Lab #6: More DATA STEP tasks

A Simple Framework for Sequentially Processing Hierarchical Data Sets for Large Surveys

Paper S Data Presentation 101: An Analyst s Perspective

1 Files to download. 3 Macro to list the highest and lowest N data values. 2 Reading in the example data file

USING DATA TO SET MACRO PARAMETERS

Introductory Guide to SAS:

STAT:5400 Computing in Statistics

SAS Programming Basics

2. Don t forget semicolons and RUN statements The two most common programming errors.

Missing Pages Report. David Gray, PPD, Austin, TX Zhuo Chen, PPD, Austin, TX

Tracking Dataset Dependencies in Clinical Trials Reporting

Two useful macros to nudge SAS to serve you

SAS Programs SAS Lecture 4 Procedures. Aidan McDermott, April 18, Outline. Internal SAS formats. SAS Formats

Stat 302 Statistical Software and Its Applications SAS: Working with Data

Taming a Spreadsheet Importation Monster

Chapter 3: Data Description - Part 3. Homework: Exercises 1-21 odd, odd, odd, 107, 109, 118, 119, 120, odd

Macros are a block of code that can be executed/called on demand

Using an ICPSR set-up file to create a SAS dataset

Leave Your Bad Code Behind: 50 Ways to Make Your SAS Code Execute More Efficiently.

STAT:5400 Computing in Statistics. Other software packages. Microsoft Excel spreadsheet very convenient for entering data in flatfile

STATION

Purchase this book at

SAS Institute Exam A SAS Advanced Programming Version: 6.0 [ Total Questions: 184 ]

A Macro to Create Program Inventory for Analysis Data Reviewer s Guide Xianhua (Allen) Zeng, PAREXEL International, Shanghai, China

Want to Do a Better Job? - Select Appropriate Statistical Analysis in Healthcare Research

A Macro that can Search and Replace String in your SAS Programs

A Mass Symphony: Directing the Program Logs, Lists, and Outputs

Electricity Forecasting Full Circle

Ranking Between the Lines

Introduction to DATA Step Programming SAS Basics II. Susan J. Slaughter, Avocet Solutions

T.I.P.S. (Techniques and Information for Programming in SAS )

CREATING A SUMMARY TABLE OF NORMALIZED (Z) SCORES

Cleaning Duplicate Observations on a Chessboard of Missing Values Mayrita Vitvitska, ClinOps, LLC, San Francisco, CA

Know Thy Data : Techniques for Data Exploration

Index. Purchase this book at

Basic Commands. Consider the data set: {15, 22, 32, 31, 52, 41, 11}

DSCI 325: Handout 2 Getting Data into SAS Spring 2017

Day 4 Percentiles and Box and Whisker.notebook. April 20, 2018

0.1 Stata Program 50 /********-*********-*********-*********-*********-*********-*********/ 31 /* Obtain Data - Populate Source Folder */

Introduction to SAS. I. Understanding the basics In this section, we introduce a few basic but very helpful commands.

A Format to Make the _TYPE_ Field of PROC MEANS Easier to Interpret Matt Pettis, Thomson West, Eagan, MN

SAS Application Development Using Windows RAD Software for Front End

BASICS BEFORE STARTING SAS DATAWAREHOSING Concepts What is ETL ETL Concepts What is OLAP SAS. What is SAS History of SAS Modules available SAS

Identifying Duplicate Variables in a SAS Data Set

Macros for Two-Sample Hypothesis Tests Jinson J. Erinjeri, D.K. Shifflet and Associates Ltd., McLean, VA

Tales from the Help Desk 6: Solutions to Common SAS Tasks

Macro Magic III. SNUG Presentation (30mins)

Integrating Large Datasets from Multiple Sources Calgary SAS Users Group (CSUG)

Adjusting for daylight saving times. PhUSE Frankfurt, 06Nov2018, Paper CT14 Guido Wendland

ABC Macro and Performance Chart with Benchmarks Annotation

SAS is the most widely installed analytical tool on mainframes. I don t know the situation for midrange and PCs. My Focus for SAS Tools Here

EXAMPLE 2: INTRODUCTION TO SAS AND SOME NOTES ON HOUSEKEEPING PART II - MATCHING DATA FROM RESPONDENTS AT 2 WAVES INTO WIDE FORMAT

Hidden in plain sight: my top ten underpublicized enhancements in SAS Versions 9.2 and 9.3

Why & How To Use SAS Macro Language: Easy Ways To Get More Value & Power from Your SAS Software Tools

PLS205 Lab 1 January 9, Laboratory Topics 1 & 2

STAT 503 Fall Introduction to SAS

LIBNAME CCC "H:\Papers\TradeCycle _replication"; /*This is the folder where I put all the three data sets.*/ RUN;

Statements with the Same Function in Multiple Procedures

Matt Downs and Heidi Christ-Schmidt Statistics Collaborative, Inc., Washington, D.C.

Transcription:

A SAS Macro for Producing Benchmarks for Interpreting School Effect Sizes Brian E. Lawton Curriculum Research & Development Group University of Hawaii at Manoa Honolulu, HI December 2012 Copyright 2012 Curriculum Research & Development Group, University of Hawaii at Manoa Please cite as: Lawton, B. E. (2012). A SAS macro for producing benchmarks for interpreting school effect sizes [Computer software code]. Retrieved from http://goo.gl/op3q4 /******************************DESCRIPTION*********************************/ /*This code calculates the sizes of the differences between all pairs of schools within a year and grade. It calculates the absolute difference between two schools and divides this number by the pooled standard deviation of those two schools, performing this for every school with every other school. For each grade within a year, it calculates and reports the 1st, 2nd (median), and 3rd quartile of these effect sizes. It also calculates the average 1st, 2nd, and 3rd quartile across all years within one grade. /***DIRECTIONS****/ /* 1. Format the dataset you are analyzing. a. This code requires that your SAS dataset have the following three variables (named as such): Year, Grade, and School. If you do not have multiple years, or multiple grades, you should create a dummy value for all observations (e.g., "1111" for year; "11" for grade). b. This code assumes you are analyzing the effects on outcome variable (math test scores, for example). c. Name this outcome variable using 4 digits or less (too many digits might result in truncated names in the code). 2. Specify the following in the section where it says DATA SPECIFICATION : a. After the "%let test= ", type name of the outcome variable that is in your dataset (e.g., "Math" or "Read"). The length should be four digits or fewer in length. b. After "%let OrigDataSet =", enter name of your dataset. c. After "libname Effects", type the path to the directory where your dataset resides. Type within the quotes. d. After "%let Folder =", type the path to the directory where you can save datasets from running this code. e. After the "%let CSV_Destination =", type the path to the directory where you want the CSV output of the results saved (can be the same directory as either one of the above). f. After the "%let MinNumStudnts =", type the minimum number of students (or scores on the test) per school to include in your comparison.

g. After "%let InvalidScore =", type a code (e.g., "9" or "100") that is used in the dataset for indicating invalid scores on the outcome variable. If none, type a period[.]). 3. At the end of the section of the code entitled "EFFECT CALCULATION", specify all combinations of years and grades. a. For example, if you are examining Grades 5 and 8 in the years 2002 and 2003, your code would be this: %effect (year = 2002, grade = 5) %effect (year = 2002, grade = 8) %effect (year = 2003, grade = 5) %effect (year = 2003, grade = 8) */ /****************************DATA SPECIFICATION**************************/ %let test = math; /*<----The length should be four or fewer digits long (e.g., if your test is called "Reading" create a new variable called "Read").*/ %let OrigDataSet = Your_data_set_name; /*<----Specify the name of your dataset after = */ libname Effects "C:\...your data set directory"; /*<----Specify the location of your dataset.*/ %let Folder = "C:\...location_of_saved_data"; /*<----Specify a location for saving datasets in this code.*/ %let CSV_Destination = "C:\...location_of_CSV_outputs"; /*<----Specify a location for saving CSV outputs.*/ %let MinNumStudnts = 25; /*<----Specify the minimum number of students (or scores on the test) per school to include in your comparison.*/ %let InvalidScore = 100; /*<----Specify the code used for invalid scores. If none is used, type a period [.]*/ libname &test._ds "&Folder"; /*There is no need to change this*/ options mprint spool; /*If the name of your test is longer than four digits (e.g., if it is "Reading"), create a new variable (e.g., "Read") that is four digits long. Here's an example: /*****EXAMPLE ONLY********/ /*data Effects.&OrigDataSet.; set Effects.YourPreviousDataSet.; rename Read = reading; */ /****SUBSET SELECTION****/ /*This section of the code subsets the data based on the outcome variable you specified, and removes invalid scores.*/

proc sort data = Effects.&OrigDataSet.; by year grade school; data temp1; set Effects.&OrigDataSet.; if &test. NE. and &test. NE &InvalidScore.; /*The PROC SUMMARY statement calculates the N, mean, and the standard deviation of scores within each school. This code also eliminates schools that do not have at least N scores per school (in our example, we are keeping schools that have at least 25 or more students per school.)*/ proc summary data = temp1; by year grade school; var &test.; output out = temp2 n = n&test. mean = mean&test. std = std&test.; data &test._ds.&test._ge_&minnumstudnts.; set temp2; if n&test. ge &MinNumStudnts.; if grade ne 31; /*The second PROC SUMMARY statement calculates the number of schools in each grade in a year. This will be used in the do loop in the next section.*/ proc summary data = &test._ds.&test._ge_&minnumstudnts.; by year grade; var school; output out = temp3 n = nschools; /***EFFECT CALCULATION***/ /*This section of the code is the macro. Each iteration is invoked by the macro calls at the end of the section. For example, the call "%effect year=2002, grade=3" tells the code to include only the data within the grade and within the year.*/ /*The first data step creates a subset composed of only schools' scores within the grade within the year. It outputs this to a flat file stored in your folder location (which you specified above).*/ %macro effect (year=, grade=); data _NULL_; set &test._ds.&test._ge_&minnumstudnts.; if year = &year; if grade = &grade; file "&Folder.\&Test.gr&grade._&Year."; put @1 school @10 n&test @20 mean&test @35 std&test; /*The PROC SQL statement uses the count of schools within the grade within the year and creates a macro variable (nschools) which will be used in the do loop*/ proc sql noprint; select (nschools)

into: NumSchls from temp3 (where=(year=&year. and grade=&grade.)) ; quit; %let nschls = %trim(%left(&numschls)); /*The next data step uses the flat file data and do loops to calculate the effect size of every possible pair comparison.*/ data gr&grade._&year.; infile "&Folder.\&Test.gr&grade._&Year."; array school{&nschls}; array size{&nschls}; array mean{&nschls}; array std{&nschls}; n=&nschls; do i=1 to n; input school{i} $ size{i} mean{i} std{i}; end; do i=1 to n; do j=i+1 to n; dmean =abs(mean{i}-mean{j}); stdweight=(((std{i}*std{i})*size{i})+((std{j}*std{j})*size{j})); stdpool=sqrt(stdweight/(size{i}+size{j})); d=dmean/stdpool; output; end; end; /*PROC UNIVARIATE calculates the descriptive statistics (n, mean, std, min, Q1, Q2, Q3, max) of the many effect sizes within this grade and year.*/ proc univariate noprint; var d; output out = temp n = n mean = mean std = std max = max min = min q1 = Q1 median = median q3 = Q3; data Avg_gr&grade._&year.; set temp; year = &year; grade = &grade; /*PROC APPEND appends the PROC UNIVARIATE output to a dataset which will contain all subsets' descriptive statistics. This prepares the results for subsequent print and means procedures.*/ proc append base=appended&test._results data=avg_gr&grade._&year.; /*PROC SORT with the NODUP option eliminates duplicates. This is useful because without it, PROC APPEND would continue to append duplicating data in the event that the user runs the program more than one time in a session. */ proc sort NODUP data=appended&test._results; by year grade; %mend; options nomprint; %effect (year = 2002, grade = 3);

%effect (year = 2002, grade = 5) %effect (year = 2002, grade = 8) %effect (year = 2002, grade = 10) %effect (year = 2003, grade = 3) %effect (year = 2003, grade = 5) %effect (year = 2003, grade = 8) %effect (year = 2003, grade = 10) %effect (year = 2004, grade = 3) %effect (year = 2004, grade = 5) %effect (year = 2004, grade = 8) %effect (year = 2004, grade = 10) %effect (year = 2005, grade = 3) %effect (year = 2005, grade = 4) %effect (year = 2005, grade = 5) %effect (year = 2005, grade = 6) %effect (year = 2005, grade = 7) %effect (year = 2005, grade = 8) %effect (year = 2005, grade = 10) %effect (year = 2006, grade = 3) %effect (year = 2006, grade = 4) %effect (year = 2006, grade = 5) %effect (year = 2006, grade = 6) %effect (year = 2006, grade = 7) %effect (year = 2006, grade = 8) %effect (year = 2006, grade = 10) %effect (year = 2007, grade = 3) %effect (year = 2007, grade = 4) %effect (year = 2007, grade = 5) %effect (year = 2007, grade = 6) %effect (year = 2007, grade = 7) %effect (year = 2007, grade = 8) %effect (year = 2007, grade = 10) %effect (year = 2008, grade = 3) %effect (year = 2008, grade = 4) %effect (year = 2008, grade = 5) %effect (year = 2008, grade = 6) %effect (year = 2008, grade = 7) %effect (year = 2008, grade = 8) %effect (year = 2008, grade = 10) %effect (year = 2009, grade = 3) %effect (year = 2009, grade = 4) %effect (year = 2009, grade = 5) %effect (year = 2009, grade = 6) %effect (year = 2009, grade = 7) %effect (year = 2009, grade = 8) %effect (year = 2009, grade = 10) ; /***PRINTING***/ /*This section of the code prints the descriptive statistics of the effect size results.*/ /*The datastep rearranges the order of the variables as they will print out. The ODS lines output the printout to a CSV file, which can be opened with Excel.*/

data all_&test._effect_results; retain grade year n mean std min Q1 median Q3 max; set Appended&test._results; proc sort data = all_&test._effect_results; by grade; ODS noresults; ODS CSV FILE = "&CSV_Destination.\all_&test._effect_results.CSV"; proc print data=all_&test._effect_results noobs; title "All &test. Effect Size Datasets"; title; ODS CSV Close; ODS Results; /***MEAN EFFECTS PER GRADE***/ /*This section of the code calculates and reports the average 1st, 2nd, and 3rd quartile within one grade across all years.*/ proc means n mean std stderr maxdec=3; by grade; var min Q1 median Q3 max; title "&test. Effect Size Statistics"; title; quit;