SAS Training Spring 2006

Coxe/Maner/Aiken

Introduction to SAS: This is what SAS looks like when you first open it. There is a Log window on top; this will let you know what SAS is doing and whether SAS encountered any errors or problems with your syntax. (In this class, you will be asked to print out the Log file as well as the Output for your homework if you use SAS.) The lower window is called the Editor; this is where you will edit your syntax. You can see that there is also a bar at the bottom for a window called Output, which will contain (surprise!) the output of any analyses that you run. Right now it doesn't contain anything because we haven't run any analyses.

Opening, editing, and saving syntax: As in SPSS, you can open saved syntax files (which have the extension .sas) or type syntax directly into the Editor window. To open a saved .sas file, select File then Open, find your file on the computer and select it, and click Open. Similarly, to save a syntax file you have written or edited, select File then Save As.

Running syntax: The syntax file looks like this when it has been opened. To run the syntax, select it and press the little running man icon on the right side of the toolbar, or select Run then Submit from the pull-down menus.

After running the syntax: Now you can see the output (SAS has selected the Output window to be in the front). This will contain all of your output except graphs (which are in a new window called Graph1 that you can select from the window bar at the bottom of SAS, see arrow). You can scroll through the output using the scroll bar at the right, or using the Page Up and Page Down keys on the keyboard. You can cycle through the graphs (there are 2 of them in this example) the same way.

Printing SAS output and graphs: When you print the Output, all pages of the output will be printed at once. They are often formatted oddly and use up a lot of pages. It is usually easier to select all of the output and copy and paste it into a word processing program (like Microsoft Word), so you can adjust the margins and page orientation to make the output fit better on the pages (and save a few trees). When you print from the graphs window, only the visible graph will be printed. This is very important! If there are several graphs from your analysis, you will have to scroll through, selecting and printing each one. Here is what it looks like when you first select the Graph window. This is the first graph, a histogram. From here, you can select File then Print to print this graph. Remember that even though there are several graphs in this analysis, SAS will only print the one you are looking at.

To print the second graph, use the Page Down button or the scroll bar to show the second graph, which is a scatterplot. Select File then Print to print the second graph.

General Syntax and Other Rules

SAS is not case-sensitive. It does not differentiate between upper and lowercase in keywords, variable names, or dataset names (text inside quotation marks, such as a title, is kept exactly as typed).

Always include a RUN statement after each block of code. This enables you to run particular blocks of code within your program, as opposed to running the whole program each time. Remember, SAS will not execute a block until it reaches a RUN statement (or the start of the next DATA or PROC step).

Always include a semicolon after each statement. When debugging your program, always check first to make sure that you did not omit a semicolon. Four times out of five, that will be the problem.

For each procedure (proc), include the name of the dataset on which you want the procedure run (data=yourdata). This is important when you have multiple datasets open in SAS.

Use comments frequently. They will help you greatly when you return to a program after spending time away from it. There are 2 ways to include a comment in your program. The first is simply to place an asterisk (*) at the beginning of the line. When it sees an asterisk, SAS will ignore everything until it reaches a semicolon. Avoid apostrophes and other unmatched quotation marks inside this kind of comment, because they can confuse SAS. The other way is to place a forward slash and asterisk (/*) at the beginning of the line. When SAS sees this, it will ignore everything (including semicolons) until it reaches an asterisk and slash (*/). This second method of commenting is useful if you want to make SAS ignore a large block of code that includes semicolons. See examples of both methods in the program at the end of the handout.

Use titles; they will help you organize your output. Remember that titles are always enclosed within quotation marks (single quotation marks are used in this handout). You may use several titles and layer them (title, title2, title3, etc.). Each subsequent title will appear under the one before it. To replace a title, simply use a new title line with the new title you want; it will automatically overwrite the old one. To get rid of a title line, simply put the title line with no title (e.g., title2;).

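The rules above can be sketched in a few lines. This is a minimal illustration rather than part of the original handout: the dataset name demo, its variables, and its three rows are made up so that the example runs on its own.

```sas
* A statement-style comment: SAS ignores everything up to the semicolon;

/* A block comment: SAS ignores everything, including semicolons;
   until it reaches the closing star-slash. */

TITLE 'First title line';
TITLE2 'Second title line, printed under the first';

/* Hypothetical three-row dataset, entered inline with DATALINES */
DATA demo;
  INPUT id score;
  DATALINES;
1 10
2 12
3 15
;
RUN;

/* Name the dataset explicitly with DATA= and end the block with RUN */
PROC PRINT DATA=demo;
RUN;

/* A TITLE2 statement with no text clears the second title */
TITLE2;

/* Case does not matter: this refers to the same dataset and variable */
proc means data=DEMO;
  var SCORE;
run;
```

Because each block ends with its own RUN statement, you can highlight just one block in the Editor and submit it without rerunning the rest.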
Data-related Syntax

The datastep is the portion of your program in which you read in data from an external file. It consists of a DATA statement, which names your working dataset within SAS; an INFILE statement, which tells SAS where your external file is and what it is named; and an INPUT statement, which tells SAS what variables to read in and what columns each variable is located in. Note that variable names and dataset names could not exceed 8 characters in older versions of SAS; Version 7 and later allow up to 32 characters. The names used in this handout all fit within 8 characters. You may read in alphanumeric data (containing both letters and numbers) by designating a variable as a string variable. You do this by including a dollar sign ($) after the variable name and before the column designation (e.g., SEX $ 1-4).

There are many ways to manage and manipulate your data in SAS. You can sort your data (using PROC SORT), merge different datasets (using the MERGE statement), create sub-datasets (using the DATA and SET statements), and print your data in the output (using PROC PRINT). Examples and explanations are included within the sample program.
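As a small sketch of column input with a string variable, the step below reads three made-up records. The dataset name people, the column layout, and the values are assumptions for illustration; DATALINES stands in for the INFILE statement so that no external file is needed.

```sas
/* Hypothetical layout: id in columns 1-2, sex in columns 4-7, score in columns 9-10 */
DATA people;
  INPUT id 1-2 sex $ 4-7 score 9-10;
  DATALINES;
01 male 85
02 fem  92
03 male 78
;
RUN;

/* Sort by the string variable, then print to confirm the data were read correctly */
PROC SORT DATA=people;
  BY sex;
RUN;

PROC PRINT DATA=people;
RUN;

/* A sub-dataset that keeps only the rows where sex is 'male' */
DATA males;
  SET people;
  IF sex = 'male';
RUN;
```

With a real data file, you would replace the DATALINES block with an INFILE statement naming the file, as in the sample program.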

TITLE 'FALL 93 GRADES, 230 -- SAS PSY531(ex1.sas)';
TITLE2 'Regression Class Example 1';
OPTIONS NOCENTER LINESIZE=80 PAGESIZE=44;

** Here comes the datastep. Notice that I am using comments to annotate this program;
DATA grades;
  INFILE 'a:\grade230.txt';
  INPUT id 2-3 sex 8 T1 31-32 PS1 34 PS2 36 PS3 38 PS4 40 PS5 42 PS6 44 PS7 46
        T2 48-49 PS8 51 PS9 53 T3 55-57 PS10 59 PS11 61 PS12 63 PS13 65 T4 66-69;
  LABEL t1='test 1' t2='test 2' t3='test 3' t4='test 4';
RUN;
** Notice that I just put a RUN after the datastep - include a run after each block of code;

/* If the variable sex contained letters (male, fem instead of 1,2) we would need to
   designate it as a string variable. See the following code:
   INPUT id 2-3 sex $ 8-12 ;
   Notice also that I've used the second method for commenting here. SAS ignores
   everything it just saw until it sees the star slash */

* Always use PROC PRINT to look at your data, making sure it has been read in properly;
PROC PRINT DATA=grades;
RUN;

* Here we create value labels with PROC FORMAT (which does not take a DATA= option);
PROC FORMAT;
  VALUE gender 1='Male' 2='Female';
RUN;

* Here we use PROC FREQ to get frequencies for the different values of sex;
* Note that PROC FREQ requires the use of TABLES to designate the variables you want frequencies for;
* Note also that we use the FORMAT statement to include value labels in the output;
PROC FREQ DATA=grades;
  TABLES sex;
  FORMAT sex gender.;
RUN;

/* PROC FREQ can also give you cross tabulations. Imagine that you had a variable called
   COND (condition), and you wanted to know how many males and females were in each
   condition. You could use the following code:
   PROC FREQ DATA=grades; TABLES cond*sex; */

* PROC UNIVARIATE supplies univariate statistics. Use the VAR command to tell SAS what
  variables you want stats for. Most procedures use the VAR command;

PROC UNIVARIATE DATA=grades;
  VAR t3 t4;
RUN;

* Using PROC FREQ to get frequency distributions for tests 3 and 4;
PROC FREQ DATA=grades;
  TABLES t3 t4;
RUN;

* Using proc gchart and gplot to make charts and scatterplots;
* Note the use of title3 - SAS will place this under your first two titles;
PROC GCHART DATA=grades;
  VBAR t3;
  TITLE3 'Histogram of test 3';
RUN;

PROC GPLOT DATA=grades;
  PLOT t4 * t3;
  TITLE3 'Scatterplot of test 4 against test 3';
RUN;

* Now turn off the title3;
TITLE3;

* PROC CORR generates a correlation matrix of the variables you specify;
* it also supplies means and standard deviations for these variables;
PROC CORR DATA=grades;
  VARIABLES T1 T2 T3 T4;
RUN;

* Note the NOMISS keyword in the next correlation procedure - this requests
  casewise/listwise deletion of missing data;
PROC CORR DATA=GRADES NOMISS;
  VARIABLES t1 t2 t3 t4;
RUN;

* Using the DATA and SET commands to add or recode variables, and to create subdatasets;
* First create a new dataset called GRADES2, and recode females from a 2 to a 0;
DATA grades2;
  SET grades;
  IF sex=2 THEN sex=0;
RUN;

* Create 2 subdatasets - one for males and one for females;
DATA female;
  SET grades2;
  IF sex=0;
RUN;

DATA male;
  SET grades2;
  IF sex=1;
RUN;

* Create a new dataset, grades3, that contains only the sex and test scores of each person - use KEEP= ;
* Note that KEEP immediately follows SET, no semi-colon in between;
DATA grades3;
  SET grades2 (KEEP = sex t1 t2 t3 t4);
RUN;

* PROC SORT is very useful - it allows you to run separate analysis for subsets of your
  data without creating separate datasets. You must first sort by the variable, the
  values for which you want separate analysis. In the following example I sort by sex,
  then request means (and SDs) for the 4 test grades, for males and females separately.
  This procedure generalizes to many procedures (e.g., running separate regression
  analyses for males and females);
PROC SORT DATA=grades2;
  BY sex;
RUN;

* The BY statement needs the sorted dataset, so PROC MEANS is run on grades2;
PROC MEANS DATA=grades2;
  VAR t1 t2 t3 t4;
  BY sex;
RUN;

* Alternatively, I could have requested means for the sex-specific datasets I created earlier;
PROC MEANS DATA=male;
  VAR t1 t2 t3 t4;
RUN;

PROC MEANS DATA=female;
  VAR t1 t2 t3 t4;
RUN;

* Merging 2 datasets - Imagine that when you entered your data, you had 2 research
  assistants entering 2 different questionnaires (from the same set of subjects) into
  separate data files. You could read them in separately, and then merge them into a
  single dataset in SAS using the MERGE statement. You must have a common case
  identification variable in both datasets - you will use that variable to identify
  subjects. You must sort both datasets by the case id variable. You then create a new
  dataset that combines the originals. Be careful, if you have variables that are named
  the same in each dataset, variables will overwrite each other. The following is an example;
DATA ques1;
  INFILE 'a:\questionnaire1.txt';
  INPUT id 1-3 q1 4 q2 5 q3 6 q4 7 q5 8;
RUN;

DATA ques2;
  INFILE 'a:\questionnaire2.txt';
  INPUT id 1-3 q6 4 q7 5 q8 6 q9 7 q10 8;
RUN;

PROC SORT DATA=ques1;
  BY id;
RUN;

PROC SORT DATA=ques2;
  BY id;
RUN;

DATA combined;
  MERGE ques1 ques2;
  BY ID;
RUN;

* Writing out an ascii dataset from SAS with the PUT statement. If you want to write out
  a dataset (or part of one), you can use the following code. The following code writes
  out a file called NEWDATA with all 10 questionnaire items. The output file will have
  the suffix .dat. Once you have run the code, check the LOG to see where SAS put the
  file (usually in the SAS folder in PROGRAM FILES on your C: drive);
DATA outfile;
  SET combined;
  FILE newdata;
  PUT @1 id 1-3 q1 4 q2 5 q3 6 q4 7 q5 8 q6 9 q7 10 q8 11 q9 12 q10 13;
RUN;
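The merge and write-out steps can also be tried without any external input files. The sketch below is an illustration under stated assumptions, not the handout's own code: the two tiny questionnaires and their values are made up, the output path c:\temp\newdata.dat is hypothetical, and the FILENAME statement and DATA _NULL_ are swapped in so that you choose where the file goes and no extra dataset is created.

```sas
/* Hypothetical data: two small questionnaires keyed by the same id (columns 1-3) */
DATA ques1;
  INPUT id 1-3 q1 4 q2 5;
  DATALINES;
00212
00134
;
RUN;

DATA ques2;
  INPUT id 1-3 q6 4 q7 5;
  DATALINES;
00155
00221
;
RUN;

/* Both datasets must be sorted by the id variable before merging */
PROC SORT DATA=ques1; BY id; RUN;
PROC SORT DATA=ques2; BY id; RUN;

DATA combined;
  MERGE ques1 ques2;
  BY id;
RUN;

/* Assumed path: change it to a folder you can write to */
FILENAME outtxt 'c:\temp\newdata.dat';

/* DATA _NULL_ runs the step without creating a new SAS dataset */
DATA _NULL_;
  SET combined;
  FILE outtxt;
  PUT id 1-3 q1 4 q2 5 q6 6 q7 7;
RUN;
```

Naming the file with FILENAME means you do not have to search the Log to find out where SAS wrote it, although the Log will still report the location and the number of records written.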