STAT 503 Fall Introduction to SAS

Similar documents
STAT:5400 Computing in Statistics

EXST3201 Mousefeed01 Page 1

THE UNIVERSITY OF BRITISH COLUMBIA FORESTRY 430 and 533. Time: 50 minutes 40 Marks FRST Marks FRST 533 (extra questions)

STAT:5201 Applied Statistic II

Laboratory Topics 1 & 2

PLS205 Lab 1 January 9, Laboratory Topics 1 & 2

Land Cover Stratified Accuracy Assessment For Digital Elevation Model derived from Airborne LIDAR Dade County, Florida

Baruch College STA Senem Acet Coskun

* Sample SAS program * Data set is from Dean and Voss (1999) Design and Analysis of * Experiments. Problem 3, page 129.

SAS/STAT 14.2 User s Guide. The SIMNORMAL Procedure

CREATING THE DISTRIBUTION ANALYSIS

Lab 3 (80 pts.) - Assessing the Normality of Data Objectives: Creating and Interpreting Normal Quantile Plots

The SIMNORMAL Procedure (Chapter)

Reading data in SAS and Descriptive Statistics

Analysis of variance and regression. November 13, 2007

Analysis of variance and regression. November 13, 2007

EXST SAS Lab Lab #6: More DATA STEP tasks

Centering and Interactions: The Training Data

Chapter 5 INSET Statement. Chapter Table of Contents

050 0 N 03 BECABCDDDBDBCDBDBCDADDBACACBCCBAACEDEDBACBECCDDCEA

Paper ODS, YES! Odious, NO! An Introduction to the SAS Output Delivery System

Using SAS to Analyze CYP-C Data: Introduction to Procedures. Overview

Chapters 18, 19, 20 Solutions. Page 1 of 14. Demographics from COLLEGE Data Set

Day 4 Percentiles and Box and Whisker.notebook. April 20, 2018

Epidemiology Principles of Biostatistics Chapter 3. Introduction to SAS. John Koval

Econ Stata Tutorial I: Reading, Organizing and Describing Data. Sanjaya DeSilva

SAS Example A10. Output Delivery System (ODS) Sample Data Set sales.txt. Examples of currently available ODS destinations: Mervyn Marasinghe

050 0 N 03 BECABCDDDBDBCDBDBCDADDBACACBCCBAACEDEDBACBECCDDCEA

Measures of Dispersion

Chapter 1. Looking at Data-Distribution

SPSS. (Statistical Packages for the Social Sciences)

STAT 7000: Experimental Statistics I

MATH NATION SECTION 9 H.M.H. RESOURCES

Introduction to Stata Toy Program #1 Basic Descriptives

Applied Regression Modeling: A Business Approach

Outline. Topic 16 - Other Remedies. Ridge Regression. Ridge Regression. Ridge Regression. Robust Regression. Regression Trees. Piecewise Linear Model

15 Wyner Statistics Fall 2013

Repeated Measures Part 4: Blood Flow data

Assignment 5.5. Nothing here to hand in

Eksamen ERN4110, 6/ VEDLEGG SPSS utskrifter til oppgavene (Av plasshensyn kan utskriftene være noe redigert)

Processing SAS Data Sets

Stata v 12 Illustration. First Session

I Launching and Exiting Stata. Stata will ask you if you would like to check for updates. Update now or later, your choice.

INTRODUCTION TO SAS STAT 525 FALL 2013

Lab #1: Introduction to Basic SAS Operations

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Introductory SAS example

Stat 302 Statistical Software and Its Applications SAS: Data I/O

Exercise 1: Introduction to Stata

My objective is twofold: Examine the capabilities of MP Connect and apply those capabilities to a real-world application.

Stata version 13. First Session. January I- Launching and Exiting Stata Launching Stata Exiting Stata..

Introductory Guide to SAS:

Assignment 0. Nothing here to hand in

CH5: CORR & SIMPLE LINEAR REFRESSION =======================================

Introduction to the SAS System

Factorial ANOVA with SAS

Principles of Biostatistics and Data Analysis PHP 2510 Lab2

THIS IS NOT REPRESNTATIVE OF CURRENT CLASS MATERIAL. STOR 455 Midterm 1 September 28, 2010

Normal Plot of the Effects (response is Mean free height, Alpha = 0.05)

5b. Descriptive Statistics - Part II

22S:166. Checking Values of Numeric Variables

Chapter 6: DESCRIPTIVE STATISTICS

TMTH 3360 NOTES ON COMMON GRAPHS AND CHARTS

a. divided by the. 1) Always round!! a) Even if class width comes out to a, go up one.

Factorial ANOVA. Skipping... Page 1 of 18

1 Overview of Statistics; Essential Vocabulary

Chapter 2 Describing, Exploring, and Comparing Data

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.

Stat 302 Statistical Software and Its Applications SAS: Distributions

Chapter 2. Descriptive Statistics: Organizing, Displaying and Summarizing Data

Univariate descriptives

Improving the Post-Smoothing of Test Norms with Kernel Smoothing

An Introduction to SAS University Edition

Basic Commands. Consider the data set: {15, 22, 32, 31, 52, 41, 11}

Topic (3) SUMMARIZING DATA - TABLES AND GRAPHICS

An introduction to SPSS

Example how not to do it: JMP in a nutshell 1 HR, 17 Apr Subject Gender Condition Turn Reactiontime. A1 male filler

Averages and Variation

Contents of SAS Programming Techniques

APPENDIX. Appendix 2. HE Staining Examination Result: Distribution of of BALB/c

EXST SAS Lab Lab #8: More data step and t-tests

Table of Contents (As covered from textbook)

STA 570 Spring Lecture 5 Tuesday, Feb 1

Statistics I 2011/2012 Notes about the third Computer Class: Simulation of samples and goodness of fit; Central Limit Theorem; Confidence intervals.

Chapter 5: The beast of bias

LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA

Chapter 2 Exploring Data with Graphs and Numerical Summaries

Exploring Data. This guide describes the facilities in SPM to gain initial insights about a dataset by viewing and generating descriptive statistics.

Unit 7 Statistics. AFM Mrs. Valentine. 7.1 Samples and Surveys

Stat 302 Statistical Software and Its Applications SAS: Data I/O & Descriptive Statistics

Chapter 3 - Displaying and Summarizing Quantitative Data

Ch 1 : Descriptive Statistics

Chapter 2 The SAS Environment

BIOSTATISTICS LABORATORY PART 1: INTRODUCTION TO DATA ANALYIS WITH STATA: EXPLORING AND SUMMARIZING DATA

The SAS interface is shown in the following screen shot:

Lecture 6: Chapter 6 Summary

Cluster Randomization Create Cluster Means Dataset

Visualizing univariate data 1

Quantitative - One Population

WINKS SDA Windows KwikStat Statistical Data Analysis and Graphs Getting Started Guide

Transcription:

Getting Started Introduction to SAS 1) Download all of the files, sas programs (.sas) and data files (.dat) into one of your directories. I would suggest using your H: drive if you are using a computer at Purdue or goremote. 2) Run SAS There are two ways to run SAS on an ITAP computer. 1) On ITAP Computers: Select All Programs Standard Software Statistical Packages SAS 9.3b. 2) Using goremote: Go to https://goremote.itap.purdue.edu/citrix/xenapp/auth/login.aspx and log in using your career account. Find the program by selecting Standard Software Statistical Packages SAS 9-3 64-BIT. 3) No matter how your run the program, you should see a screen like the following: You may want to look at the Start Guides, but unless you have used a previous version of SAS, looking at the Output Changes will not be beneficial. Click on Close. 4) To load up your sas program, go to File Open Program and locate and open the program of interest, say start.sas. 5) You will see the program in the Editor window. To run the program click the running man icon indicated by the red box. You have to be sure that the Editor window is active. 1

Getting the data into SAS There are two ways of getting your data into SAS; you can either enter the data by hand or you can load from a pre-existing file. Sample code for entering data by hand: *from data in the code this example is Example 2.2.4, liter size of sows; data litter; input piglets sows @@; datalines; 5 1 6 0 7 2 8 3 9 3 10 9 11 8 12 5 13 3 14 2 ; proc print data=litter; The colors of the file indicate different things: blue: keywords from the program black: items that you enter green: comments yellow highlight: inputted data bluegreen: numbers in code purple: titles and text @@: data is in rows instead of columns Additional notes common to all SAS source codes: 1) Each line always ends in a semicolon, ; 2) 6) You should always have the statement quit; at the end of each script. Sample code loaded from pre-existing file: *from a file; data liter2; infile 'H:\My Documents\Stat 503\pigs.dat'; input piglets sows; proc print data=liter2; Creating a Histogram The following code creates the same histogram as in Fig. 2.2.4. data liter3; infile 'H:\My Documents\Stat 503\Ex.2.2.4.dat'; input piglets; proc print data=liter3; title1 ''; *creates a histogram using the classes 5, 6,..., 15; proc univariate data=liter3 noprint; histogram piglets / endpoints = 5 to 15 by 1; 2

If you would want to change the classes, you would change what the classes are, you would change the numbers after the keyword endpoints. The first two numbers indicate the range and the last indicates the the size of the bin. For example, if you would want bins from 10 50 with a width of 5, you would type histogram piglets / endpoints = 10 to 50 by 5; Manipulating data sets To following code exemplifies how you copy one set to another one and add another variable. data a2; set liter3; original = 'yes'; proc print data=a2; output: Obs piglets original 1 5 yes 2 7 yes 3 7 yes 4 8 yes 5 8 Yes To combine two data sets, use the following code: data a3; piglets = 15; output; piglets = 16; output; data a4; set a2 a3; proc print data=a4; This will add the following rows to the above data set: Obs piglets original 37 15 38 16 3

output from univariate Finally, we have discussed in class the descriptive stattistics, the are in proc univariate. In addition, we can generate the histogram (automatically) and the QQ-Plot (Chapter 4). These last two plots are used to determine if the distribution is normal. proc univariate data=liter3; var piglets; /*gives many numerical summaries for the variable piglets*/ histogram piglets/normal; /*histogram*/ qqplot piglets; /*qqplot*/ Output: Variable: piglets Moments N 36 Sum Weights 36 Mean 10.4166667 Sum Observations 375 Std Deviation 1.99105141 Variance 3.96428571 Skewness -0.4792508 Kurtosis 0.48822726 Uncorrected SS 4045 Corrected SS 138.75 Coeff Variation 19.1140935 Std Error Mean 0.3318419 Basic Statistical Measures Location Variability Mean 10.41667 Std Deviation 1.99105 Median 10.50000 Variance 3.96429 Mode 10.00000 Range 9.00000 Interquartile Range 2.50000 Tests for Location: Mu0=0 Test Statistic p Value Student's t t 31.39045 Pr > t <.0001 Sign M 18 Pr >= M <.0001 Signed Rank S 333 Pr >= S <.0001 4

Quantiles (Definition 5) Quantile Estimate 100% Max 14.0 99% 14.0 95% 14.0 90% 13.0 75% Q3 12.0 50% Median 10.5 25% Q1 9.5 10% 8.0 5% 7.0 1% 5.0 0% Min 5.0 Extreme Observations Lowest Highest Value Obs Value Obs 5 1 13 32 7 3 13 33 7 2 13 34 8 6 14 35 8 5 14 36 5

Fitted Normal Distribution for piglets Parameters for Normal Distribution Parameter Symbol Estimate Mean Mu 10.41667 Std Dev Sigma 1.991051 Goodness-of-Fit Tests for Normal Distribution Test Statistic p Value Kolmogorov-Smirnov D 0.16711886 Pr > D 0.012 Cramer-von Mises W-Sq 0.11409952 Pr > W-Sq 0.073 Anderson-Darling A-Sq 0.59245060 Pr > A-Sq 0.118 6

Quantiles for Normal Distribution Percent Quantile Observed Estimated 1.0 5.00000 5.78479 5.0 7.00000 7.14168 10.0 8.00000 7.86503 25.0 9.50000 9.07372 50.0 10.50000 10.41667 75.0 12.00000 11.75961 90.0 13.00000 12.96830 95.0 14.00000 13.69165 99.0 14.00000 15.04854 7