Descriptive Statistics: Using Indicator Variables


Juliana M. Ma, NC Highway Safety Research Center
Tamara R. Fischell, NC Highway Safety Research Center

Introduction

Indicator variables have proved useful in certain categorical studies of highway safety data when counts and percentages are computed and displayed. Alternative methods of producing these statistics can be based on the special properties of indicator variables. Examples presented in this paper demonstrate the variety of ways indicator variables can be used. We present a brief description of highway safety studies where these techniques are effective, and use several examples to illustrate common applications. An intermediate-level knowledge of SAS® basics (the DATA step; PROC SUMMARY, FREQ, CHART, and MEANS; the WEIGHT statement) is assumed.

Definition

An indicator variable is a very simple variable with special properties. It has two specific non-missing values: one and zero. The value is defined as one for an observation that satisfies some criterion of interest, and zero for one that does not. The criterion may be simple (the driver was using a seat belt) or complex (the car overturned and the driver was not belted). A variable defined as an indicator variable has two useful properties:

1. The sum is a count of the number of observations satisfying the criterion.
2. The mean, which is the sum divided by the total number of observations, is the proportion satisfying the criterion (multiply by 100 for the percentage).

The five observations below illustrate the properties of an indicator variable BELT (0 = not belted, 1 = belted) and its relationship to BELTUSE, a standard categorical variable with one code for unbelted and one for belted. Three people, sixty percent (3/5), were belted.

[Table: five observations with columns OBS, BELTUSE, and BELT. SUM of BELT = 3; MEAN of BELT = 0.6.]

These properties hold whether the statistics are calculated for an entire file or for subgroups. Subgroups may be specified using BY variables (PROC MEANS), CLASS variables (PROC SUMMARY), or GROUP variables (PROC CHART).
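The two properties in the Definition can be checked on a small dataset. The following is a minimal sketch, not the authors' code: the dataset name BELTS, the character codes 'B' and 'U' for BELTUSE, and the order of the five observations are all assumptions made for illustration. Only the totals (three belted out of five) come from the paper.

```sas
/* Hypothetical five-row dataset: codes and row order are invented. */
data belts;
   input beltuse $ @@;
   belt = (beltuse = 'B');   /* a comparison yields 1 when true, 0 when false */
   datalines;
B U B B U
;
run;

/* SUM of BELT is the count of belted drivers; MEAN is the proportion belted. */
proc means data=belts n sum mean;
   var belt;
run;
```

With three 'B' rows out of five, PROC MEANS should report a sum of 3 and a mean of 0.6 for BELT, matching the counts stated in the text.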
Applications

We use indicator variables for categorical studies that focus on one characteristic while controlling for other variables. One standard analysis compares different types of cars with regard to rare events, such as overturning or post-crash fires. Another example is the consideration of changes in seat belt usage over time, controlling for variables like sex and race.

Our vehicle studies of rare events involve large files, where four or five variables are sufficient to define the subgroups of interest. The analysis may be based on 100,000 to 1,000,000 observations. We use indicator variables to count the rare events so that more classification variables can be used to define subgroups. The statistics are computed by PROC SUMMARY as discussed by Ma and Leininger (1984).

Example 1

This example shows how PROC SUMMARY and PROC FREQ can be used to produce tables based on the sum of an indicator variable. The difficulty is that PROC FREQ tables based on the sum cannot be produced directly from summary data. The analysis compares the frequency of post-crash fire in cars, where the different car types are defined by four classification variables. An extra DATA step is required to change the summary data into appropriate input for PROC FREQ.

Two counts for each combination of the CLASS variables exist in the summary dataset, SUMSTAT, created in Figure 1. The overall count is the value of _FREQ_, while FIRE is the count of the number of post-crash fires. The TABSTAT dataset has two observations for each subgroup combination, each with an appropriate value for the categorical variable FIRETYPE. The first observation is weighted by the number of non-fire cases and the second is weighted by the number of fire cases.

Figure 1

   PROC SUMMARY DATA=CARS NWAY;              /* CARS is a large file     */
      CLASS CARGROUP YEAR BODYTYPE IMPACT;   /* defines subgroups        */
      VAR FIRE;                              /* indicator variable (0,1) */
      OUTPUT OUT=ddname.SUMSTAT SUM=;        /* SUM=count of fires       */
   RUN;

   DATA TABSTAT;
      SET ddname.SUMSTAT;
      FIRETYPE='NO ';  COUNT = _FREQ_ - FIRE;  OUTPUT;   /* No fire */
      FIRETYPE='YES';  COUNT = FIRE;           OUTPUT;   /* Fire    */
   RUN;

   PROC FREQ DATA=TABSTAT;
      WEIGHT COUNT;
      TABLES (CARGROUP BODYTYPE IMPACT) * FIRETYPE;
   RUN;

Tables could be produced for any combination of the classification variables using 'PROC FREQ DATA=TABSTAT;' and 'WEIGHT COUNT;'.

Example 2

Indicator variables are useful when the possibility exists that a special class of observations will eventually be omitted from a large file analysis. An easy and inexpensive method of producing tables with or without the "special" observations requires planning ahead. Both analyses can be based on summary data that includes the sum of an indicator variable.

Tables for two populations are produced from one summary dataset in the example in Figure 2. Initially, all cars were included, but we expected that the final analysis would not include overturn cases. The large file does not need to be re-processed for tables of only non-overturn cases, since the summary data includes the sum of an indicator variable (OVERTURN). Using a CLASS variable to separate the overturn cases was not possible for this analysis because of the number of other classification variables required. NEWCOUNT is the appropriate weight variable for non-overturn tables, since it is defined as the difference between _FREQ_, the total count of cars in the subgroup, and OVERTURN, the number of overturned cars in that subgroup.
Figure 2

   PROC SUMMARY DATA=CARS NWAY;              /* CARS is a large file        */
      CLASS CARGROUP YEAR BELT SEX INJURY;   /* defines subgroups           */
      VAR OVERTURN;                          /* indicator variable (0,1)    */
      OUTPUT OUT=STAT SUM=;                  /* SUM=count of overturn cases */
   RUN;

   DATA ddname.STAT;
      SET STAT;
      NEWCOUNT = _FREQ_ - OVERTURN;
      LABEL NEWCOUNT = 'NON-OVERTURN COUNT';
   RUN;

   PROC FREQ DATA=ddname.STAT;
      WEIGHT _FREQ_;                         /* all cars */
      TABLES CARGROUP * BELT * INJURY;
      TITLE 'ALL CARS';
   RUN;

   PROC FREQ DATA=ddname.STAT;
      WEIGHT NEWCOUNT;                       /* non-overturn only */
      TABLES CARGROUP * BELT * INJURY;
      TITLE 'NON-OVERTURN CARS ONLY';
   RUN;

Example 3

The purpose of this example is to compare two methods of displaying percentages. PROC CHART is used with an indicator variable and with a standard categorical variable. The advantage of displaying percentages as the mean of an indicator variable is that the charts do not include the "other" category. The charts are based on a dataset containing information about drivers (SEX, BELTUSE), with time periods defined by MONTH and an indicator variable named BELT. Figures 3a and 3b show the difference between using the mean of BELT (0,1) and using the standard categorical variable BELTUSE. The trend for belt usage is clearer using the indicator variable.

Example 4

Computing percentages for subgroups is simple using an indicator variable. For small datasets, either PROC CHART or a combination of PROC MEANS and PROC PLOT can be used. Using PROC SUMMARY to compute the statistics is more efficient for large files. Suppose the difference between males and females was of interest in the data from Example 3. Figures 4a and 4b show two ways of computing and displaying the same statistics. Less coding and processing are required with PROC CHART, while computing the statistics using PROC MEANS requires sorting the dataset.

Advantages and Disadvantages

Some of the advantages of using indicator variables, when the study type is appropriate, are:

1) Increased flexibility in choosing classification variables to define subgroups when PROC SUMMARY is used to compute statistics. Using an indicator variable may allow an additional CLASS variable.
2) Increased potential for generating a variety of tables from summary data of a large file. For example, we could count both the number of overturn and post-crash fire cases for the same car subgroups.
3) Graphic displays of percentages can be based on the mean of an indicator variable. PROC CHART may be used directly, or means may be generated as input for PROC PLOT. Both statistics and a chart can be shown on one page using the HBAR statement of PROC CHART.
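The second advantage, counting more than one rare event for the same subgroups, might be sketched as follows. This is an illustration under assumptions rather than code from the paper: it supposes that CARS carries both 0/1 indicators (FIRE and OVERTURN) at once, and the output names SUMSTAT, COUNTS, NOFIRE, and NOOVER are hypothetical.

```sas
/* Assumes CARS holds two 0/1 indicator variables, FIRE and OVERTURN. */
proc summary data=cars nway;
   class cargroup year bodytype;
   var fire overturn;
   output out=sumstat sum=;   /* SUM= with no names reuses the VAR names */
run;

/* Derive the complementary counts from _FREQ_, the subgroup total. */
data counts;
   set sumstat;
   nofire = _freq_ - fire;       /* cars without a post-crash fire */
   noover = _freq_ - overturn;   /* cars that did not overturn     */
run;
```

One pass over the large file then supports tables for either event, since each subgroup row carries the total alongside both event counts.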
The only disadvantage we have found is the knowledge required to recognize and implement analyses based on statistics calculated from indicator variables. Additional DATA steps may be required before procedures like FREQ or PLOT can produce the desired results. A good method for understanding the processing stages is to include PROC PRINT after every step (with the OBS= option) during program testing. Using indicator variables effectively takes planning and practice!

Conclusion

Indicator variables are useful for many of our categorical studies. Efficient analyses of large files are possible when they are used with PROC SUMMARY. The ideas presented here are meant as an introduction to the possibilities created by indicator variables. We hope others will find applications for these techniques when they consider how to analyze their data.

References

Hamilton, Elizabeth, Single Variable Tabulations for North Carolina Accidents. NC Highway Safety Research Center, Chapel Hill, NC.

Ma, J.M. and Leininger, C., "PROC SUMMARY as the Basis for Efficient Analyses of Large Files." Proc. of the Ninth Annual SUGI Conference. SAS Institute Inc., Cary, NC, 1984.

SAS is the registered trademark of SAS Institute Inc., Cary, NC.

For more information, contact:
Juliana M. Ma
NC Highway Safety Research Center
CTP 197A
Chapel Hill, NC

Figure 3a. Bar chart of means: BELT MEAN by MONTH.

   PROC CHART DATA=DRIVERS;
      VBAR MONTH / TYPE=MEAN SUMVAR=BELT;

Figure 3b. Percentage bar chart: PERCENTAGE by BELTUSE within MONTH.

   PROC CHART DATA=DRIVERS;
      VBAR BELTUSE / GROUP=MONTH TYPE=PERCENT DISCRETE G100;

Figure 4a. Bar chart of means: BELT MEAN by SEX within MONTH (columns MONTH, SEX, FREQ, BELT MEAN).

   PROC CHART DATA=DRIVERS;
      HBAR SEX / TYPE=MEAN SUMVAR=BELT GROUP=MONTH DISCRETE;

Figure 4b. Plot of BELTMEAN*MONTH; symbol is value of SEX.

   PROC SORT DATA=DRIVERS;
      BY MONTH SEX;
   PROC MEANS DATA=DRIVERS;
      BY MONTH SEX;                    /* one mean per MONTH and SEX; needs the sort above */
      VAR BELT;
      OUTPUT OUT=STAT MEAN=BELTMEAN;
   PROC PLOT DATA=STAT;
      PLOT BELTMEAN * MONTH = SEX;

Setting the Percentage in PROC TABULATE

Setting the Percentage in PROC TABULATE SESUG Paper 193-2017 Setting the Percentage in PROC TABULATE David Franklin, QuintilesIMS, Cambridge, MA ABSTRACT PROC TABULATE is a very powerful procedure which can do statistics and frequency counts

More information

PROC MEANS for Disaggregating Statistics in SAS : One Input Data Set and One Output Data Set with Everything You Need

PROC MEANS for Disaggregating Statistics in SAS : One Input Data Set and One Output Data Set with Everything You Need ABSTRACT Paper PO 133 PROC MEANS for Disaggregating Statistics in SAS : One Input Data Set and One Output Data Set with Everything You Need Imelda C. Go, South Carolina Department of Education, Columbia,

More information

It s Proc Tabulate Jim, but not as we know it!

It s Proc Tabulate Jim, but not as we know it! Paper SS02 It s Proc Tabulate Jim, but not as we know it! Robert Walls, PPD, Bellshill, UK ABSTRACT PROC TABULATE has received a very bad press in the last few years. Most SAS Users have come to look on

More information

How to extract suicide statistics by country from the. WHO Mortality Database Online Tool

How to extract suicide statistics by country from the. WHO Mortality Database Online Tool Instructions for users How to extract suicide statistics by country from the WHO Mortality Database Online Tool This guide explains how to access suicide statistics and make graphs and tables, or export

More information

DATA REDUCTION AND SUMMARY STATISTICS IN THE DATA AND PROCEDURE STEPS EFFICIENCY CONSIDERATIONS

DATA REDUCTION AND SUMMARY STATISTICS IN THE DATA AND PROCEDURE STEPS EFFICIENCY CONSIDERATIONS DATA REDUCTON AND SUMMARY STATSTCS N THE DATA AND PROCEDURE STEPS EFFCENCY CONSDERATONS Michael Hein, ARC Professional Services Group Judith Mopsik, ARC Professional Services Group NTRODUCTON One of the

More information

Chapter 6 Creating Reports. Chapter Table of Contents

Chapter 6 Creating Reports. Chapter Table of Contents Chapter 6 Creating Reports Chapter Table of Contents Introduction...115 Listing Data...115 ListDataOptions...116 List Data Titles...118 ListDataVariables...118 Example:CreateaListingReport...119 Creating

More information

ITSMR RESEARCH NOTE EFFECTS OF CELL PHONE USE AND OTHER DRIVER DISTRACTIONS ON HIGHWAY SAFETY: 2006 UPDATE. Introduction SUMMARY

ITSMR RESEARCH NOTE EFFECTS OF CELL PHONE USE AND OTHER DRIVER DISTRACTIONS ON HIGHWAY SAFETY: 2006 UPDATE. Introduction SUMMARY September 2006 ITSMR RESEARCH NOTE EFFECTS OF CELL PHONE USE AND OTHER DRIVER DISTRACTIONS ON HIGHWAY SAFETY: 2006 UPDATE Introduction The use of cell phones and other driver distractions continue to be

More information

DSCI 325: Handout 10 Summarizing Numerical and Categorical Data in SAS Spring 2017

DSCI 325: Handout 10 Summarizing Numerical and Categorical Data in SAS Spring 2017 DSCI 325: Handout 10 Summarizing Numerical and Categorical Data in SAS Spring 2017 USING PROC MEANS The routine PROC MEANS can be used to obtain limited summaries for numerical variables (e.g., the mean,

More information

Lab #3. Viewing Data in SAS. Tables in SAS. 171:161: Introduction to Biostatistics Breheny

Lab #3. Viewing Data in SAS. Tables in SAS. 171:161: Introduction to Biostatistics Breheny 171:161: Introduction to Biostatistics Breheny Lab #3 The focus of this lab will be on using SAS and R to provide you with summary statistics of different variables with a data set. We will look at both

More information

Epidemiology Principles of Biostatistics Chapter 3. Introduction to SAS. John Koval

Epidemiology Principles of Biostatistics Chapter 3. Introduction to SAS. John Koval Epidemiology 9509 Principles of Biostatistics Chapter 3 John Koval Department of Epidemiology and Biostatistics University of Western Ontario What we will do today We will learn to use use SAS to 1. read

More information

Week 6, Week 7 and Week 8 Analyses of Variance

Week 6, Week 7 and Week 8 Analyses of Variance Week 6, Week 7 and Week 8 Analyses of Variance Robyn Crook - 2008 In the next few weeks we will look at analyses of variance. This is an information-heavy handout so take your time reading it, and don

More information

PROC SUMMARY AND PROC FORMAT: A WINNING COMBINATION

PROC SUMMARY AND PROC FORMAT: A WINNING COMBINATION PROC SUMMARY AND PROC FORMAT: A WINNING COMBINATION Alan Dickson - Consultant Introduction: Is this scenario at all familiar to you? Your users want a report with multiple levels of subtotals and grand

More information

Getting it Done with PROC TABULATE

Getting it Done with PROC TABULATE ABSTRACT Getting it Done with PROC TABULATE Michael J. Williams, ICON Clinical Research, San Francisco, CA The task of displaying statistical summaries of different types of variables in a single table

More information

An Introduction to Compressing Data Sets J. Meimei Ma, Quintiles

An Introduction to Compressing Data Sets J. Meimei Ma, Quintiles An Introduction to Compressing Data Sets J. Meimei Ma, Quintiles r:, INTRODUCTION This tutorial introduces compressed data sets. The SAS system compression algorithm is described along with basic syntax.

More information

I AlB 1 C 1 D ~~~ I I ; -j-----; ;--i--;--j- ;- j--; AlB

I AlB 1 C 1 D ~~~ I I ; -j-----; ;--i--;--j- ;- j--; AlB PROC TABULATE: CONTROLLNG TABLE APPEARANCE August V. Treff Baltimore City Public Schools Office of Research and Evaluation ABSTRACT Proc Tabulate provides one, two, and three dimensional tables. Tables

More information

Quality Control of Clinical Data Listings with Proc Compare

Quality Control of Clinical Data Listings with Proc Compare ABSTRACT Quality Control of Clinical Data Listings with Proc Compare Robert Bikwemu, Pharmapace, Inc., San Diego, CA Nicole Wallstedt, Pharmapace, Inc., San Diego, CA Checking clinical data listings with

More information

SAS Training Spring 2006

SAS Training Spring 2006 SAS Training Spring 2006 Coxe/Maner/Aiken Introduction to SAS: This is what SAS looks like when you first open it: There is a Log window on top; this will let you know what SAS is doing and if SAS encountered

More information

Choosing the Right Procedure

Choosing the Right Procedure 3 CHAPTER 1 Choosing the Right Procedure Functional Categories of Base SAS Procedures 3 Report Writing 3 Statistics 3 Utilities 4 Report-Writing Procedures 4 Statistical Procedures 5 Efficiency Issues

More information

Introducing a Colorful Proc Tabulate Ben Cochran, The Bedford Group, Raleigh, NC

Introducing a Colorful Proc Tabulate Ben Cochran, The Bedford Group, Raleigh, NC Paper S1-09-2013 Introducing a Colorful Proc Tabulate Ben Cochran, The Bedford Group, Raleigh, NC ABSTRACT Several years ago, one of my clients was in the business of selling reports to hospitals. He used

More information

Statistics, Data Analysis & Econometrics

Statistics, Data Analysis & Econometrics ST009 PROC MI as the Basis for a Macro for the Study of Patterns of Missing Data Carl E. Pierchala, National Highway Traffic Safety Administration, Washington ABSTRACT The study of missing data patterns

More information

SAS Macros for Grouping Count and Its Application to Enhance Your Reports

SAS Macros for Grouping Count and Its Application to Enhance Your Reports SAS Macros for Grouping Count and Its Application to Enhance Your Reports Shi-Tao Yeh, EDP Contract Services, Bala Cynwyd, PA ABSTRACT This paper provides two SAS macros, one for one grouping variable,

More information

Frequency Tables. Chapter 500. Introduction. Frequency Tables. Types of Categorical Variables. Data Structure. Missing Values

Frequency Tables. Chapter 500. Introduction. Frequency Tables. Types of Categorical Variables. Data Structure. Missing Values Chapter 500 Introduction This procedure produces tables of frequency counts and percentages for categorical and continuous variables. This procedure serves as a summary reporting tool and is often used

More information

Multiple Facts about Multilabel Formats

Multiple Facts about Multilabel Formats Multiple Facts about Multilabel Formats Gwen D. Babcock, ew York State Department of Health, Troy, Y ABSTRACT PROC FORMAT is a powerful procedure which allows the viewing and summarizing of data in various

More information

Lab #9: ANOVA and TUKEY tests

Lab #9: ANOVA and TUKEY tests Lab #9: ANOVA and TUKEY tests Objectives: 1. Column manipulation in SAS 2. Analysis of variance 3. Tukey test 4. Least Significant Difference test 5. Analysis of variance with PROC GLM 6. Levene test for

More information

ITSMR Research Note KEY FINDINGS. Crash Analyses: Ticket Analyses:

ITSMR Research Note KEY FINDINGS. Crash Analyses: Ticket Analyses: December 2018 KEY FINDINGS Crash Analyses: 2013-2017 Less than 1% of police-reported fatal and personal injury (F & PI) crashes involved the use of a cell phone over the five years, 2013-2017. 15 persons

More information

%MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System

%MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System %MISSING: A SAS Macro to Report Missing Value Percentages for a Multi-Year Multi-File Information System Rushi Patel, Creative Information Technology, Inc., Arlington, VA ABSTRACT It is common to find

More information

SAS is the most widely installed analytical tool on mainframes. I don t know the situation for midrange and PCs. My Focus for SAS Tools Here

SAS is the most widely installed analytical tool on mainframes. I don t know the situation for midrange and PCs. My Focus for SAS Tools Here Explore, Analyze, and Summarize Your Data with SAS Software: Selecting the Best Power Tool from a Rich Portfolio PhD SAS is the most widely installed analytical tool on mainframes. I don t know the situation

More information

SAS/STAT 13.1 User s Guide. The NESTED Procedure

SAS/STAT 13.1 User s Guide. The NESTED Procedure SAS/STAT 13.1 User s Guide The NESTED Procedure This document is an individual chapter from SAS/STAT 13.1 User s Guide. The correct bibliographic citation for the complete manual is as follows: SAS Institute

More information

Get into the Groove with %SYSFUNC: Generalizing SAS Macros with Conditionally Executed Code

Get into the Groove with %SYSFUNC: Generalizing SAS Macros with Conditionally Executed Code Get into the Groove with %SYSFUNC: Generalizing SAS Macros with Conditionally Executed Code Kathy Hardis Fraeman, United BioSource Corporation, Bethesda, MD ABSTRACT %SYSFUNC was originally developed in

More information

%MAKE_IT_COUNT: An Example Macro for Dynamic Table Programming Britney Gilbert, Juniper Tree Consulting, Porter, Oklahoma

%MAKE_IT_COUNT: An Example Macro for Dynamic Table Programming Britney Gilbert, Juniper Tree Consulting, Porter, Oklahoma Britney Gilbert, Juniper Tree Consulting, Porter, Oklahoma ABSTRACT Today there is more pressure on programmers to deliver summary outputs faster without sacrificing quality. By using just a few programming

More information

ERROR: ERROR: ERROR:

ERROR: ERROR: ERROR: ERROR: ERROR: ERROR: Formatting Variables: Back and forth between character and numeric Why should you care? DATA name1; SET name; if var = Three then delete; if var = 3 the en delete; if var = 3 then

More information

Introduction to STATA

Introduction to STATA Introduction to STATA Duah Dwomoh, MPhil School of Public Health, University of Ghana, Accra July 2016 International Workshop on Impact Evaluation of Population, Health and Nutrition Programs Learning

More information

Example1D.1.sas. * Procedures : ; * 1. print to show the dataset. ;

Example1D.1.sas. * Procedures : ; * 1. print to show the dataset. ; Example1D.1.sas * SAS example program 1D.1 ; * 1. Create a dataset called prob from the following data: ; * age prob lb ub ; * 24.25.20.31 ; * 36.26.21.32 ; * 48.28.24.33 ; * 60.31.28.36 ; * 72.35.32.39

More information

Programming Gems that are worth learning SQL for! Pamela L. Reading, Rho, Inc., Chapel Hill, NC

Programming Gems that are worth learning SQL for! Pamela L. Reading, Rho, Inc., Chapel Hill, NC Paper CC-05 Programming Gems that are worth learning SQL for! Pamela L. Reading, Rho, Inc., Chapel Hill, NC ABSTRACT For many SAS users, learning SQL syntax appears to be a significant effort with a low

More information

Turn In: A copy of the first 50 lines or so of the converted text file.

Turn In: A copy of the first 50 lines or so of the converted text file. STAT 325: Final, Take home Spring 2012 Points: 100 pts Name: We will begin by working with the Fools Five dataset. The Fools Five is a large event held each year in Lewiston, MN. The main event is the

More information

Introduction to SAS: General

Introduction to SAS: General Spring 2019 CJ Anderson Introduction to SAS: General Go to course web-site and click on hsb-datasas There are 5 main working environments (windows) in SAS: Explorer window: Lets you view data in SAS data

More information

The NESTED Procedure (Chapter)

The NESTED Procedure (Chapter) SAS/STAT 9.3 User s Guide The NESTED Procedure (Chapter) SAS Documentation This document is an individual chapter from SAS/STAT 9.3 User s Guide. The correct bibliographic citation for the complete manual

More information

Creating Forest Plots Using SAS/GRAPH and the Annotate Facility

Creating Forest Plots Using SAS/GRAPH and the Annotate Facility PharmaSUG2011 Paper TT12 Creating Forest Plots Using SAS/GRAPH and the Annotate Facility Amanda Tweed, Millennium: The Takeda Oncology Company, Cambridge, MA ABSTRACT Forest plots have become common in

More information

Statistics: Normal Distribution, Sampling, Function Fitting & Regression Analysis (Grade 12) *

Statistics: Normal Distribution, Sampling, Function Fitting & Regression Analysis (Grade 12) * OpenStax-CNX module: m39305 1 Statistics: Normal Distribution, Sampling, Function Fitting & Regression Analysis (Grade 12) * Free High School Science Texts Project This work is produced by OpenStax-CNX

More information

STAT 3304/5304 Introduction to Statistical Computing. Introduction to SAS

STAT 3304/5304 Introduction to Statistical Computing. Introduction to SAS STAT 3304/5304 Introduction to Statistical Computing Introduction to SAS What is SAS? SAS (originally an acronym for Statistical Analysis System, now it is not an acronym for anything) is a program designed

More information

Analysis of Complex Survey Data with SAS

Analysis of Complex Survey Data with SAS ABSTRACT Analysis of Complex Survey Data with SAS Christine R. Wells, Ph.D., UCLA, Los Angeles, CA The differences between data collected via a complex sampling design and data collected via other methods

More information

ABSTRACT INTRODUCTION WHERE TO START? 1. DATA CHECK FOR CONSISTENCIES

ABSTRACT INTRODUCTION WHERE TO START? 1. DATA CHECK FOR CONSISTENCIES Developing Integrated Summary of Safety Database using CDISC Standards Rajkumar Sharma, Genentech Inc., A member of the Roche Group, South San Francisco, CA ABSTRACT Most individual trials are not powered

More information

A Quick and Gentle Introduction to PROC SQL

A Quick and Gentle Introduction to PROC SQL ABSTRACT Paper B2B 9 A Quick and Gentle Introduction to PROC SQL Shane Rosanbalm, Rho, Inc. Sam Gillett, Rho, Inc. If you are afraid of SQL, it is most likely because you haven t been properly introduced.

More information

WORKING WITH PIVOT TABLES

WORKING WITH PIVOT TABLES WORKING WITH PIVOT TABLES Introduction Perhaps the most powerful analytical tool that Excel provides is the PivotTable command, with which one can cross-tabulate data stored in Excel lists. A cross-tabulation

More information

%ANYTL: A Versatile Table/Listing Macro

%ANYTL: A Versatile Table/Listing Macro Paper AD09-2009 %ANYTL: A Versatile Table/Listing Macro Yang Chen, Forest Research Institute, Jersey City, NJ ABSTRACT Unlike traditional table macros, %ANTL has only 3 macro parameters which correspond

More information

A SAS Macro for Balancing a Weighted Sample

A SAS Macro for Balancing a Weighted Sample Paper 258-25 A SAS Macro for Balancing a Weighted Sample David Izrael, David C. Hoaglin, and Michael P. Battaglia Abt Associates Inc., Cambridge, Massachusetts Abstract It is often desirable to adjust

More information

Using SYSTEM 2000 Data in SAS Programs

Using SYSTEM 2000 Data in SAS Programs 23 CHAPTER 4 Using SYSTEM 2000 Data in SAS Programs Introduction 23 Reviewing Variables 24 Printing Data 25 Charting Data 26 Calculating Statistics 27 Using the FREQ Procedure 27 Using the MEANS Procedure

More information

Checking for Duplicates Wendi L. Wright

Checking for Duplicates Wendi L. Wright Checking for Duplicates Wendi L. Wright ABSTRACT This introductory level paper demonstrates a quick way to find duplicates in a dataset (with both simple and complex keys). It discusses what to do when

More information

A Lazy Programmer s Macro for Descriptive Statistics Tables

A Lazy Programmer s Macro for Descriptive Statistics Tables Paper SA19-2011 A Lazy Programmer s Macro for Descriptive Statistics Tables Matthew C. Fenchel, M.S., Cincinnati Children s Hospital Medical Center, Cincinnati, OH Gary L. McPhail, M.D., Cincinnati Children

More information

1. Basic Steps for Data Analysis Data Editor. 2.4.To create a new SPSS file

1. Basic Steps for Data Analysis Data Editor. 2.4.To create a new SPSS file 1 SPSS Guide 2009 Content 1. Basic Steps for Data Analysis. 3 2. Data Editor. 2.4.To create a new SPSS file 3 4 3. Data Analysis/ Frequencies. 5 4. Recoding the variable into classes.. 5 5. Data Analysis/

More information

Statistical Package for the Social Sciences INTRODUCTION TO SPSS SPSS for Windows Version 16.0: Its first version in 1968 In 1975.

Statistical Package for the Social Sciences INTRODUCTION TO SPSS SPSS for Windows Version 16.0: Its first version in 1968 In 1975. Statistical Package for the Social Sciences INTRODUCTION TO SPSS SPSS for Windows Version 16.0: Its first version in 1968 In 1975. SPSS Statistics were designed INTRODUCTION TO SPSS Objective About the

More information

Pruning the SASLOG Digging into the Roots of NOTEs, WARNINGs, and ERRORs

Pruning the SASLOG Digging into the Roots of NOTEs, WARNINGs, and ERRORs Pruning the SASLOG Digging into the Roots of NOTEs, WARNINGs, and ERRORs Andrew T. Kuligowski Nielsen Media Research ABSTRACT "Look at your SASLOG." You hear it from instructors of courses related to the

More information

WORKSHOP: Using the Health Survey for England, 2014

WORKSHOP: Using the Health Survey for England, 2014 WORKSHOP: Using the Health Survey for England, 2014 There are three sections to this workshop, each with a separate worksheet. The worksheets are designed to be accessible to those who have no prior experience

More information

SAS seminar. The little SAS book Chapters 3 & 4. April 15, Åsa Klint. By LD Delwiche and SJ Slaughter. 3.1 Creating and Redefining variables

SAS seminar. The little SAS book Chapters 3 & 4. April 15, Åsa Klint. By LD Delwiche and SJ Slaughter. 3.1 Creating and Redefining variables SAS seminar April 15, 2003 Åsa Klint The little SAS book Chapters 3 & 4 By LD Delwiche and SJ Slaughter Data step - read and modify data - create a new dataset - performs actions on rows Proc step - use

More information

Creating Population Tree Charts (Using SAS/GRAPH Software) Robert E. Allison, Jr. and Dr. Moon W. Suh College of Textiles, N. C.

Creating Population Tree Charts (Using SAS/GRAPH Software) Robert E. Allison, Jr. and Dr. Moon W. Suh College of Textiles, N. C. SESUG 1994 Creating Population Tree Charts (Using SAS/GRAPH Software) Robert E. Allison, Jr. and Dr. Moon W. Suh College of Textiles, N. C. State University ABSTRACT This paper describes a SAS program

More information

Standard Safety Visualization Set-up Using Spotfire
