EXAMPLE 2: PART I - INTRODUCTION TO SAS AND SOME NOTES ON HOUSEKEEPING; PART II - MATCHING DATA FROM RESPONDENTS AT 2 WAVES INTO WIDE FORMAT


USING THESE WORKSHEETS

For each of the worksheets you have a corresponding SAS program containing all the SAS code for the example. In some cases you will have to write the code into a SAS program file, or run the code from the program files we have provided. This is the case when we use loops to repeat commands for multiple waves or multiple variables. Although these commands can be used interactively, it is usually easier to run them from a program file. In general, it is good practice to write your commands into program files, and to include comments, so that you can easily reproduce and amend your work at a later time.

The worksheets include the SAS code that is necessary to solve the problem at hand, and explanations of how the commands work. The SAS commands are typed in a different font:

    This font means: this is a SAS command

The aim of this course is to become familiar with the SAS commands necessary to set up the dataset. Some of the worksheets include commands to perform some simple analysis. These parts of the worksheets are optional (skip them if you are lagging behind).

DESCRIPTION: This example has two parts. Part I introduces SAS and is aimed at those who have not used it before, or would like to be reminded of the basics. It also includes some guidance on housekeeping when using SAS. Part II combines data on respondents from the first two waves of Understanding Society: Wave 1 ( ) and Wave 2 ( ). There are two different ways of combining the data for analysis: wide and long format. This Example shows how to combine the data into a wide format (for long format see Example 3). This Example also shows how to inspect the data, create labels, deal with missing values, and compute transitions (changes from wave to wave). It shows how to compute change in marital status and in personal income from wave 1 to wave 2.
FILES: a_indresp and b_indresp

WAVES: 1 and 2

STEPS:
Part I: Introduction to SAS and housekeeping
1. Getting started with SAS
2. Housekeeping
Part II: Matching data from respondents at 2 waves into wide format
3. Inspect the data at wave 2 and deal with missing values
4. Merge waves 1 and 2 into wide format
5. Inspect the data, deal with missing values
6. Identify transitions (e.g. change in marital status and change in pay)

NEW COMMANDS:
- Housekeeping: libname
- Inspecting the data: proc print; proc contents; proc means; proc summary; proc univariate; proc freq
- Dealing with missing values: data step with an if-then statement
- Combining waves: data step with merge
- Creating labels: data step with label; proc format

PART I: INTRODUCTION TO SAS AND HOUSEKEEPING

2.1 GETTING STARTED WITH SAS

To start SAS double-click on the SAS icon or select Start > All Programs > Applications > SAS.

SAS Windows

- Output window: displays output from SAS programs. You can view, print, copy or save information in the output window.
- Log window: reports on the progress of SAS procedures (blue) and displays error messages (red) and warnings (green).
- Results and Explorer window:
  a. The Explorer pane is a browsing tool for SAS libraries.
  b. The Results pane shows a tree-like summary of the output window. You can select, delete, or edit the output before printing, saving, or copying the results.
- Enhanced Program Editor window: used to create, edit and execute SAS programs.

To run all the commands you have typed in a SAS program file, click the Submit icon (the running figure) on the toolbar. Alternatively you can use F3.

You can get help from SAS in different ways. Select Help from the drop-down menu. From here it is possible to explore a range of support options, including:
- How to use the help guide
- SAS Help and Documentation
- Getting started with SAS
- Learning SAS programming
- SAS on the web
- About the software

Many of the above options will take you to the SAS website, where additional information is available to help you carry out analysis.

Also note that:
1. SAS is not case-sensitive. You can use capital or lowercase letters in your SAS variables. However, when you specify filenames (as you do with the include and file SAS commands), you must type the name exactly as it exists in UNIX.
2. One should not abbreviate commands or variable names. In some instances SAS can recognise a typo in syntax, but it is better not to rely on this.

In the program file, there are different ways of telling SAS that a piece of text is a comment and should be ignored when the program is executed. A statement that begins with an asterisk (*) and ends with a semicolon is treated as a comment; the asterisk must come at the start of the statement. Alternatively, any text enclosed between /* and */ will not be run.
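As a minimal sketch, the two comment styles could be used as follows. The data set name example and the variables x and y are hypothetical and serve only as illustration.

```sas
* This whole statement is a comment, up to the semicolon;

/* A block comment can span
   several lines. */

data example;      /* hypothetical data set name */
    x = 1;         /* a block comment can follow code on the same line */
    * y = 2;       /* the statement "* y = 2;" is a comment, so y is never created */
run;
```

The /* ... */ style is often the safer choice for longer notes, because it does not depend on where the next semicolon happens to fall.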

2.2 HOUSEKEEPING, REPRODUCIBILITY AND PORTABILITY

We start each program file with some housekeeping. Clear the working directory of any data that might already be open:

    proc datasets lib=work kill nolist memtype=data;
    quit;

In this example the library is the working directory (work), but this could be altered by the user to be any library they desire. By default SAS saves data files, graphs and estimation results in its working directory unless a different file path is specified. If this is a shared space we may want to change the working directory to one of our own, maybe to a project-specific one.

We can view SAS output in the output window within the SAS environment, or alternatively we can send it to a text file. This is particularly useful if you want to share results with a collaborator quickly via email. However, it means that SAS will not write the output into the output window, so some people prefer to delay this step until their program is complete and correct.

    /* writing all output to a text file */
    filename myoutput "define location.txt";
    proc printto print=myoutput new;
    run;

To make our program files more portable (for example from work to home computer, or between computers or collaborators), we should be able to quickly convert the program file so that it will run at different workstations. Most likely the data files will be stored in different locations at the two workstations. Suppose that in the program file we have code written to use many different data files stored at one location. When we change workstations we will have to change the file path every time it is mentioned in the program file. But there is an easier way. We assign a nickname to the file path at the beginning of the program file by assigning a library name. Then throughout the program file we simply refer to this library.
If we type

    /* defining a library pointing to where files can be permanently stored */
    libname dir "REPLACE THIS WITH THE DIRECTORY YOU CAN SAVE FILES IN";

then we have created a library called dir, which can be used in commands later on in the program; the library name acts as a pointer to where particular commands should be directed. In this example (and in all examples in this workshop) we will use libraries to define the file path for where we can save files. The path must be enclosed in quotation marks. Paths that include spaces can cause problems, so it is safest to avoid them.
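As a rough sketch of how the library name is then used, a data set can be written to and read from the library with a two-level name of the form libref.dataset. The data set name mydata and the variable x are hypothetical; dir is assumed to have been defined with the libname statement above.

```sas
/* Assumes the libname statement above has been run, so that the
   libref dir points to a folder you can write to. */

/* A small temporary data set in work (hypothetical name and content). */
data mydata;
    x = 1;
run;

/* Save a permanent copy in the folder that dir points to. */
data dir.mydata;
    set work.mydata;
run;

/* Read the permanent copy back into the work library. */
data mydata;
    set dir.mydata;
run;

/* List the data sets currently stored in the library. */
proc contents data=dir._all_ nods;
run;
```

When the program moves to another workstation, only the path in the libname statement should need to change; every dir.mydata reference stays the same.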

The code above creates a text file containing the output. We can also open a log file, which contains the contents of the log window in SAS. This is not the results or estimations, but information about whether and how the program has run: whether there were any errors, for example, or how many observations are in a new dataset.

When you are writing your program it is useful to be able to see both the output and log windows, so that you can debug errors and see your results interactively, rather than redirecting them into separate files that you need to open each time you want to look at something. However, when you want to share your output and analysis with other people you will want to set up a log file, which can be more easily shared. The extension .log means that the file will be written in ASCII, which can be read in Word and other programmes.

    /* writing to a log file */
    filename mylog "define location.log";
    proc printto log=mylog;
    run;

Note that we have specified the location of the file and also invoked a proc command to tell SAS to write the log to our log file mylog.

If you start a log file and want to close it so that the log is displayed in the log window again, you can do so by running the code below. This also applies to the output file we started above.

    proc printto;
    run;

PART II: MATCHING DATA FROM RESPONDENTS AT 2 WAVES (WIDE FORMAT)

2.3 COMBINING WAVE 1 AND WAVE 2 DATA INTO WIDE FORMAT

Let's start by inspecting the a_indresp file. We open the data set for wave 1, keeping only the variables of interest: a_hidp pidp a_istrtdaty a_sex a_mastat_dv a_julkjb a_sclfsato a_paygu_dv. The data are stored in the temporary library built into SAS, known as work.
    /* include the filename in the path */
    proc import datafile="REPLACE THIS WITH THE FILEPATH TO A_INDRESP"
        out=a_indresp dbms=sav replace;
    run;

    data a_indresp;
        set a_indresp;
        keep a_hidp pidp a_istrtdaty a_sex a_mastat_dv a_julkjb a_paygu_dv a_sclfsato;
    run;

Then we describe the data: list the first 20 observations; describe the data to find out what variables are contained in the dataset; and summarise the data to get information on means and sample sizes:

    /* prints the first 20 observations in the wave 1 dataset */
    proc print data=a_indresp (obs=20);
    run;

    /* describes variables and datafile of wave 1 dataset */
    * Add varnum option to get variables in the order they are in the dataset rather than alphabetically;
    proc contents data=a_indresp varnum;
    run;

    /* produces descriptive stats on variables in wave 1 data */
    proc means data=a_indresp;
    run;

    /* produces dataset of summary statistics */
    proc summary data=a_indresp;
        output out=a_indrespsumm;
        var a_hidp pidp a_istrtdaty a_sex a_mastat_dv a_julkjb a_paygu_dv a_sclfsato;
    run;

Note that the person identifier pidp is the only variable that does not include a wave prefix. That is because a respondent's pidp does not change across waves. Looking at the output of proc contents you will notice that some numeric variables have value labels attached to each numeric value. In Understanding Society, the convention is to name the value label the same as the variable.

In Understanding Society, -1, -2, -3, -7, -8 and -9 all represent missing values; the different values tell us why the variable is missing. But SAS considers these to be valid values. So we will have to convert them into system missing, which is a dot (.). There are different ways to recode missing values. For example, we can generate a new variable (to avoid overwriting the original variable) and then recode the new variable using an if-then condition, depending on whether it should have a valid (usually positive) value, using the information in the original variable:

    data wave1;
        set a_indresp;
        a_marstat = a_mastat_dv;
        if a_marstat le 0 then a_marstat = .;
    run;

le in SAS code means "less than or equal to"; it is equivalent to the <= symbol, which is another way of writing the same thing. Alternatively, one could recode the variable directly, using an if-then condition that points SAS to the original variable itself. Check that the recoding has worked.
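The direct recode and the check mentioned above could be sketched as follows. The data set name wave1_direct is hypothetical, chosen so that wave1 is left untouched; the check assumes that wave1 was created with the a_marstat copy as shown above.

```sas
/* Direct recode: overwrite the negative missing-value codes in the
   original variable. wave1_direct is a hypothetical data set name. */
data wave1_direct;
    set a_indresp;
    if a_mastat_dv le 0 then a_mastat_dv = .;
run;

/* Check the first approach: cross-tabulate the original variable
   against the recoded copy. The missing option keeps system missing
   values (.) in the table, so every non-positive code of a_mastat_dv
   should fall in the "." column of a_marstat. */
proc freq data=wave1;
    tables a_mastat_dv*a_marstat / missing norow nocol nopercent;
run;
```

Keeping the original variable, as in the first approach, makes this kind of check possible; after a direct recode the reason for missingness can no longer be recovered from the data set.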
Finally we sort the data by the unique individual identifier pidp. This is done to ensure that the merge (described below) is carried out successfully.

    proc sort data=wave1;
        by pidp;
    run;

Task: Now repeat these steps for wave 2.

To combine data into a wide format we use the merge statement. You will need the wave 2 file you just created in the task for this to work. In SAS both datasets that are being merged need to be sorted on the linking variable, pidp in this case. The one-to-one merge is used when there is a one-to-one correspondence between the observations of the two files, although not all observations may find a match. We use the one-to-one merge if, for example, both files include data on individuals. (SAS documentation calls a merge that uses a by statement, as here, a match-merge.)

    data wave12merge;
        merge wave1 (in=a) wave2 (in=b);
        by pidp;

We can also check how many individuals are in wave 1 only, in wave 2 only, and in both waves. To do this, the following statements are added to the same data step, after the by statement:

        /* How many interviewed in wave 1 only, wave 2 only, both */
        /* creating a variable called inwave to show this */
        if a=1 and b=0 then inwave=1;
        if a=0 and b=1 then inwave=2;
        if a=1 and b=1 then inwave=3;
    run;

Then tabulate the variable inwave:

    proc freq data=wave12merge;
        tables inwave;
    run;

Question: How many respondents are interviewed in year 1 of wave 1, and how many in year 1 of wave 2?

Hint: In order to check this we must cross-tabulate the year of the Wave 1 and Wave 2 interviews, which is stored in the variables a_istrtdaty and b_istrtdaty.

We are interested in transitions across the two waves and so will keep only those people present in both waves.

    data wave12merge;
        set wave12merge;
        where inwave=3;
    run;
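For the Question above, the cross-tabulation suggested in the Hint could be sketched as follows. This assumes that b_istrtdaty was kept when preparing the wave 2 file in the Task. If respondents interviewed in only one wave are to be counted as well, the step would need to be run before the subsetting step above; the missing option then shows them in the "." row or column.

```sas
/* Cross-tabulate the interview year at wave 1 against the interview
   year at wave 2. Assumes wave12merge contains both a_istrtdaty and
   b_istrtdaty (the latter kept in the wave 2 file from the Task). */
proc freq data=wave12merge;
    tables a_istrtdaty*b_istrtdaty / missing norow nocol nopercent;
run;
```

The row totals give the number of respondents by wave 1 interview year and the column totals the number by wave 2 interview year.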

2.4 IDENTIFYING TRANSITIONS: CHANGE IN MARITAL STATUS AND PAY

Describe the data to find out what variables are in the dataset; summarise the data to get information on means and sample sizes:

    proc contents data=wave12merge varnum;
    run;

We want to identify all the transitions across marital statuses. For simplicity, we recode marital status into a smaller number of categories. To do this we create new variables called a_marital and b_marital, which identify an individual's marital status in wave 1 and wave 2 respectively. Then we attach a short description to each variable using the label statement, and make the content of the variables more accessible by labelling their values using proc format.

    data wave12wide;
        set wave12merge;
        if a_mastat_dv=1 then a_marital=1;
        if a_mastat_dv in (2,3,10) then a_marital=2;
        if a_mastat_dv in (4,5,7,8) then a_marital=3;
        if a_mastat_dv in (6,9) then a_marital=4;
        if b_mastat_dv=1 then b_marital=1;
        if b_mastat_dv in (2,3,10) then b_marital=2;
        if b_mastat_dv in (4,5,7,8) then b_marital=3;
        if b_mastat_dv in (6,9) then b_marital=4;
        label a_marital='marital status wave 1';
        label b_marital='marital status wave 2';
    run;

    proc format;
        value marital
            1="single"
            2="married/civil partnership/living as a couple";
    run;

Now we generate a new variable which summarises the different transitions across marital statuses (below we call this variable maritaltransition), and add labels to the levels of maritaltransition using proc format.
    data wave12wide;
        set wave12wide;
        if a_marital=1 and b_marital=1 then maritaltransition=1;
        if a_marital=1 and b_marital=2 then maritaltransition=2;
        if a_marital=2 and b_marital=1 then maritaltransition=3;
        if a_marital=2 and b_marital=2 then maritaltransition=4;
        label maritaltransition="change in marital status between wave 1 and 2";
    run;

    proc format;
        value maritaltransition
            1="stayed single"
            2="got married/entered civil partnership/began living as a couple"
            3="became single"
            4="remained married/in civil partnership/living as a couple";
    run;

    data wave12wide;
        set wave12wide;
        format maritaltransition maritaltransition.;
    run;

We can use proc univariate and proc freq to get some summary statistics and the frequency distribution, respectively, of the variable maritaltransition:

    /* summary statistics of the new marital status transition variable */
    proc univariate data=wave12wide;
        var maritaltransition;
    run;

    /* frequency distribution of the new marital status transition variable */
    proc freq data=wave12wide;
        tables maritaltransition;
    run;

If we want to compute change in pay across the two waves we first inspect the variables a_paygu_dv and b_paygu_dv. We generate new variables which take the value reported in the original variables if it is strictly positive, and are otherwise system missing. We then compute change in pay across the two waves (changepay) and label the variable.

    data wave12wide;
        set wave12wide;
        if (a_paygu_dv>0) then wave1wage=a_paygu_dv;
        if (b_paygu_dv>0) then wave2wage=b_paygu_dv;
        changepay=wave2wage-wave1wage;
        label changepay="change in pay between waves 1 and 2";
    run;

Task: using proc univariate, examine the variable changepay.

We want to see how change in pay varies by sex, and how it varies by type of marital change and sex. proc means will let us do this. Despite the name, it offers many more statistics than just the mean of a variable: in this case we ask SAS to report the total number of observations, the number missing, the sum of these observations, the mean, standard deviation, variance, minimum and maximum values. proc means is especially useful for comparing across groups, though: in this case, sex and changes in marital status.

    proc sort data=wave12wide;
        by a_sex;
    run;

    proc means data=wave12wide maxdec=2 n nmiss sum mean std var min max;
        class a_sex;
        var changepay;
        title '';
        title2 '*******************************************';
        title3 'change in pay between waves 1 and 2 by sex';
    run;

    proc means data=wave12wide maxdec=2 n nmiss sum mean std var min max;
        class a_sex maritaltransition;
        var changepay;
        title '';
        title2 '*******************************************';
        title3 'change in pay between waves 1 and 2 by sex and marital status transitions';
    run;

Save the dataset in the directory specified in dir, so that it can be used again.

    data dir.wave12wide;
        set wave12wide;
    run;

The final step is to clean up: delete temporary files that are no longer needed and close the log file if you are using one:

    /* Clean SAS work directory */
    proc datasets lib=work nolist kill memtype=all;
    quit;

    /* stop writing to external log and output files and simply write to the SAS windows */
    proc printto;
    run;
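To use the saved data set in a later session, for instance to work on the Task on changepay, something like the following sketch could be used. The user-defined format maritaltransition. is attached to a variable in the saved data set, but the format definition itself is stored in the work library, which is emptied by the clean-up step and at the end of the session. By default SAS may report an error when a data set refers to a format it cannot find, so the sketch re-creates the format first. The libname path is the same placeholder as above and has to be replaced.

```sas
/* Point dir at the folder where wave12wide was saved (placeholder path). */
libname dir "REPLACE THIS WITH THE DIRECTORY YOU CAN SAVE FILES IN";

/* Re-create the value labels, since formats stored in work are lost
   when the work library is cleared or the session ends. */
proc format;
    value maritaltransition
        1="stayed single"
        2="got married/entered civil partnership/began living as a couple"
        3="became single"
        4="remained married/in civil partnership/living as a couple";
run;

/* Examine the distribution of the change in pay (the Task above). */
proc univariate data=dir.wave12wide;
    var changepay;
run;
```

Keeping the proc format step in a program file that is run at the start of each session is one simple way to make the labels available wherever the saved data are used.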


More information

Easing into Data Exploration, Reporting, and Analytics Using SAS Enterprise Guide

Easing into Data Exploration, Reporting, and Analytics Using SAS Enterprise Guide Paper 809-2017 Easing into Data Exploration, Reporting, and Analytics Using SAS Enterprise Guide ABSTRACT Marje Fecht, Prowerk Consulting Whether you have been programming in SAS for years, are new to

More information

WHO STEPS Surveillance Support Materials. STEPS Epi Info Training Guide

WHO STEPS Surveillance Support Materials. STEPS Epi Info Training Guide STEPS Epi Info Training Guide Department of Chronic Diseases and Health Promotion World Health Organization 20 Avenue Appia, 1211 Geneva 27, Switzerland For further information: www.who.int/chp/steps WHO

More information

PROC FORMAT. CMS SAS User Group Conference October 31, 2007 Dan Waldo

PROC FORMAT. CMS SAS User Group Conference October 31, 2007 Dan Waldo PROC FORMAT CMS SAS User Group Conference October 31, 2007 Dan Waldo 1 Today s topic: Three uses of formats 1. To improve the user-friendliness of printed results 2. To group like data values without affecting

More information

Using a Fillable PDF together with SAS for Questionnaire Data Donald Evans, US Department of the Treasury

Using a Fillable PDF together with SAS for Questionnaire Data Donald Evans, US Department of the Treasury Using a Fillable PDF together with SAS for Questionnaire Data Donald Evans, US Department of the Treasury Introduction The objective of this paper is to demonstrate how to use a fillable PDF to collect

More information
