Introduction to SAS. Cristina Murray-Krezan Research Assistant Professor of Internal Medicine Biostatistician, CTSC

Introduction to SAS Cristina Murray-Krezan Research Assistant Professor of Internal Medicine Biostatistician, CTSC cmurray-krezan@salud.unm.edu 20 August 2018

What is SAS?
- Statistical Analysis System, created in 1976 at NC State for agricultural data analysis.
- A consortium of eight universities with major research funding from the USDA realized the importance of such software. They obtained a grant from NIH to further develop the software, and SAS was born.
- Widely used in many disciplines, including statistics, health sciences, business, and economics.

SAS vs. Other Software
- Command-driven vs. menu-driven.
- Flexibility comes from using the SAS language to write programs.
- Other software you may use: SPSS, Stata, Minitab, Matlab.

Components of SAS Programs
- DATA steps. Here you can read in data and manipulate data.
- PROC steps. Here you can analyze the data and create tables of output.

The SAS Environment
Five windows:
1. Editor: where you write your program (commands).
2. Log: a log of the success of the submitted commands.
3. Output: a display of your statistical results.
4. Explorer: a directory for your libraries.
5. Results: a listing of all submitted PROC steps.

Where Your Data Will Live
Library: a library is created to refer to permanent data sets (such as your Excel file or other permanent data set). You specify the directory, and then SAS knows where to get the data or where to put permanent data sets. Use the libname statement to name your library and specify the directory.

Types of Data Sets
- Temporary data sets: stored in the Work library. Created while running your program. Cease to exist when you close SAS.
- Permanent data sets: stored in a library that you define. Continue to exist after SAS is closed.
- A data set that you are reading into SAS: can be pretty much any file type.
- A data set that you export out of SAS: can be exported into pretty much any file type.

The LIBNAME Statement
Example syntax:
    libname sasdata 'C:\cristina\Pharm547';
Notes:
- Your library name (called a libref in SAS syntax) can be at most 8 characters long.
- All SAS statements must end with a semicolon (;).
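A minimal sketch of assigning the library and then checking what it contains. The directory is the one from the example above; replace it with a folder that exists on your machine.

```sas
/* Assign the libref sasdata to a directory (path from the example above). */
libname sasdata 'C:\cristina\Pharm547';

/* List the data sets stored in the library, without per-variable details. */
proc contents data=sasdata._all_ nods;
run;
```

After submitting, the Log window should contain a note confirming that the libref was assigned.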

Ways to Read Your Data into SAS
Import Wizard from the drop-down menu:
1. Go to File > Import Data.
2. Select your data file type.
3. Select your data set.
4. Give your temporary data set a name.
SAS can generate the code used to perform the import. Just select a directory where the code should be output. PROC IMPORT is the procedure used in the code. I recommend doing this.

Ways to Read Your Data into SAS (continued)
For a very small data set, or test data, you can input the data in the DATA step using the datalines statement (a.k.a. cards).
Example:
    data mydata;
        input patid $ age gender $;
        datalines;
    A1001 27 F
    A1002 32 M
    A1003 29 M
    A1004 29 F
    ;
    run;

Ways to Read Your Data into SAS (continued)
External data sets: in practice, you will most likely be reading Excel or Access files into SAS. Use the Import Wizard, PROC IMPORT, or the infile and input statements.
Example:
    data mydata;
        infile 'C:\cristina\Pharm547\dataset.csv' dlm=',';
        input patid $ age gender $;
    run;
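For comparison, here is a sketch of reading the same CSV file with PROC IMPORT instead of infile and input. It assumes, unlike the example above, that the first row of the file holds the variable names; if your file has no header row, use getnames=no.

```sas
/* Import the CSV file from the example above into a temporary data set. */
proc import datafile='C:\cristina\Pharm547\dataset.csv'
            out=work.mydata   /* temporary data set to create */
            dbms=csv          /* type of the external file */
            replace;          /* overwrite work.mydata if it already exists */
    getnames=yes;             /* assumption: first row contains variable names */
run;
```

PROC IMPORT guesses each variable's type from the data, whereas the input statement lets you state the types yourself.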

Ways to Read Your Data into SAS (continued)
Large data sets obtained from national databases or registries very often come with programs you can use to read the data into SAS.
BRFSS: you can use the following to create a permanent SAS data set:
- SASOUT11_LLCP.SAS (this program converts the data from ASCII to SAS7BDAT)
- LLCP2011.ASC (this is the actual data in ASCII format)
- Formas11.sas (this formats the data and can put labels over the variable names)

From a Temporary Data Set to a Permanent Data Set
All of the previous examples, except BRFSS, created temporary data sets (which will not exist after closing SAS).
Create a permanent data set for BP_Example, which will be stored in the directory you assigned to your library (in this case, sasdata):
    data sasdata.bp_example;
        set bp_example;
    run;

Vice Versa: From a Permanent Data Set to a Temporary Data Set
Create a temporary (or "working") data set for the BRFSS data, which will now exist in the sasdata library as well as in the Work library.
    data brfss;
        set sasdata.brfss2001;
    run;

Accessing Your Data in SAS
To access temporary data sets, use the DATA step, but omit the library name in front. SAS stores temporary data sets in the library Work. You can refer to the data set as Work.dataset, but SAS assumes the "Work." part by default unless you specify otherwise, so you don't have to add it.

Accessing Your Data in SAS (continued)
Example:
    data brfss2;
        set brfss;
    run;
SAS is thinking of it like:
    data work.brfss2;
        set work.brfss;
    run;

The DATA Step
DATA steps that start from an existing data set use the following syntax:
    data <new dataset name>;
        set <dataset name>;
    run;
NOTE: Every statement ends with a semicolon (;). Every step ends with run;.

Things You Can Do with the DATA Step
- Create new variables.
- Change the variable type, e.g., from numeric to character or vice versa.
- Drop, keep, or rename variables.
- Output to a new temporary or permanent data set.
- Format the data.
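The tasks listed above might look like this in a single DATA step. This is only a sketch: it rebuilds the small test data set from the datalines example, and the new names (mydata2, age_months, age_char, age_num, sex) are made up for illustration.

```sas
/* Small test data set, entered with datalines. */
data mydata;
    input patid $ age gender $;
    datalines;
A1001 27 F
A1002 32 M
A1003 29 M
A1004 29 F
;
run;

data mydata2;                        /* output to a new temporary data set */
    set mydata;                      /* read the existing data set */
    age_months = age * 12;           /* create a new variable */
    age_char = put(age, 3.);         /* numeric to character */
    age_num = input(age_char, 3.);   /* character back to numeric */
    rename gender = sex;             /* rename a variable */
    drop age_num;                    /* drop a variable from the output */
    format age_months comma8.;       /* attach a display format */
run;
```

Writing data sasdata.mydata2; instead of data mydata2; would store the result permanently, provided the sasdata library has been assigned.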

The PROC Step
DATA steps are used to read and modify data, whereas PROC steps are used to analyze data. All PROC steps use the following syntax:
    proc <procname> data=<dataset>;
    run;

Commonly Used PROC Steps
- CONTENTS: lists the contents of your data set, such as all the variables, whether they are character or numeric, their assigned formats, etc.
- SORT: sorts your data by the variable(s) that you specify.
- SUMMARY: provides basic summary statistics for your data, such as n, means, standard deviations, etc.
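A short sketch of these three procedures. It assumes the sample data set sashelp.class that ships with SAS (variables include sex, age, height, and weight) and copies it into the Work library first so the original is left alone.

```sas
/* Copy a sample data set into the Work library. */
data class;
    set sashelp.class;
run;

/* List the variables, their types, and their formats. */
proc contents data=class;
run;

/* Sort by sex, then by age within sex. */
proc sort data=class;
    by sex age;
run;

/* Basic summary statistics; PROC SUMMARY prints nothing unless PRINT is given. */
proc summary data=class print;
    var height weight;
run;
```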

Commonly Used PROC Steps (continued)
- FREQ: creates counts of categorical variables and contingency tables (2x2 or greater). Also calculates associated statistics (e.g., chi-square).
- MEANS: calculates means, CIs, etc. of continuous variables and associated statistics.
- TTEST: conducts a two-sample t-test comparing a continuous variable between two groups.
- REG: performs simple or multiple linear regression.
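The following sketch runs each of these procedures once. It assumes the sample data set sashelp.heart that ships with SAS and uses its variables sex, status, cholesterol, systolic, ageatstart, and weight; the choice of variables is only for illustration.

```sas
/* 2x2 contingency table with a chi-square test. */
proc freq data=sashelp.heart;
    tables sex*status / chisq;
run;

/* n, mean, standard deviation, and confidence limits for the mean. */
proc means data=sashelp.heart n mean std clm;
    var cholesterol systolic;
run;

/* Two-sample t-test: compare mean cholesterol between the two sexes. */
proc ttest data=sashelp.heart;
    class sex;
    var cholesterol;
run;

/* Multiple linear regression of systolic blood pressure on age and weight. */
proc reg data=sashelp.heart;
    model systolic = ageatstart weight;
run;
quit;
```

PROC REG is an interactive procedure, so it keeps running after run; and needs quit; to end it.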

Many PROCs Use the Following Statements
- by: performs commands by certain groups, such as calculating the mean age by gender.
- class: lets SAS know to treat a variable in the class statement as a categorical variable.
- var: tells SAS on which variables to perform the requested calculations.
- output: can output the working data set that SAS creates in the background, which may contain calculations of interest.
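As a sketch of how these statements fit together, the example below calculates the mean age by gender in two ways. It assumes the sample data set sashelp.class, where the gender variable is named sex; class_sorted, age_by_sex, and mean_age are made-up names.

```sas
/* A BY statement expects the data to be sorted by the BY variable first. */
proc sort data=sashelp.class out=class_sorted;
    by sex;
run;

/* Mean age by gender, using BY: one table per group. */
proc means data=class_sorted mean;
    by sex;
    var age;
run;

/* Same calculation using CLASS, with the result saved to a data set. */
proc means data=sashelp.class noprint nway;
    class sex;
    var age;
    output out=age_by_sex mean=mean_age;
run;
```

The nway option keeps only the rows for each gender in age_by_sex; without it, the output data set would also contain an overall row.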

More about the PROC Steps
There are many specific statements for each PROC step. They are not all the same, nor are they always consistent. Don't forget that each statement must end with a semicolon.

Outputting Permanent Data Sets
You may want to create a new permanent data set from your original. For example, you may want a subset of variables from the BRFSS data set for your project. You can use the DATA step:
    data sasdata.mydata;
        set mydata;
    run;
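To keep only a subset of variables when writing the permanent data set, a keep= data set option can be added. This sketch assumes the sasdata library has been assigned and that mydata contains the variables patid and age from the earlier datalines example.

```sas
/* Assumes the library was assigned earlier, e.g.:
   libname sasdata 'C:\cristina\Pharm547';          */
data sasdata.mydata;
    set mydata (keep=patid age);   /* read only these two variables */
run;
```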

Outputting Permanent Data Sets (continued)
- Use PROC EXPORT (similar to PROC IMPORT).
- Use the Export Wizard in the drop-down menu under File.
NOTE: The DATA step only outputs a SAS data set (in the way I've shown you). PROC EXPORT or the Export Wizard can output to just about any file type.
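A minimal PROC EXPORT sketch that writes a temporary data set to a CSV file. The output file name mydata_out.csv is hypothetical, and the directory is the one used for the library earlier.

```sas
/* Write the temporary data set mydata to a CSV file. */
proc export data=mydata
            outfile='C:\cristina\Pharm547\mydata_out.csv'   /* hypothetical file name */
            dbms=csv      /* type of file to create */
            replace;      /* overwrite the file if it already exists */
run;
```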

A Few More Things about SAS before You Jump In
SAS Help is your friend! Access it by clicking either the book with the question mark or the Help link, and then selecting SAS Help and Documentation.

A Few More Things about SAS before You Jump In (continued)
The documentation contains almost everything (and often more) that you may want to know, such as all the statements and syntax particular to a given PROC. It also provides detailed discussions about the statistical procedures it uses and how they are implemented. It is a plethora of information, though it may be a bit terse for some.

Good SAS Resources
- UCLA's Statistical Computing website: https://stats.idre.ucla.edu/
- Delwiche & Slaughter, The Little SAS Book: A Primer, 5th Ed. (2012).
- Cody & Smith, Applied Statistics and the SAS Programming Language, 5th Ed. (2005).
- The internet!

Now you are ready to program!