Analysis of variance and regression. November 13, 2007
SAS language

- The SAS environments
- Reading in, DATA step
- Summary statistics
- Subsetting data
- More on reading in, missing values
- Combination of data sets

Lene Theil Skovgaard, Dept. of Biostatistics, Institute of Public Health, University of Copenhagen
SAS exercises on this course

- Two teachers to help you
- Private user names and passwords!!
- Two share each machine
- Many of you know SAS ANALYST from the course on basic statistics, but here we focus on the SAS language

References:
- Aa. T. Andersen, T.V. Bedsted, M. Feilberg, R.B. Jakobsen and A. Milhøj: Elementær indføring i SAS. Akademisk Forlag (in Danish, 2002).
- Aa. T. Andersen, M. Feilberg, R.B. Jakobsen and A. Milhøj: Statistik med SAS. Akademisk Forlag (in Danish, 2002).
- R.P. Cody and J.K. Smith: Applied statistics and the SAS programming language. 4th ed., Prentice Hall.

Menus vs. language

Menus:
+ No learning by heart
+ No syntax errors
+ Stepwise learning
- Inflexible
- A bit hard to find your way around
- Do not contain everything
- Tedious in the long run

Language:
- Some learning by heart
- Many syntax errors in the beginning
+ Logically coherent
+ Reproducible
+ Easier to document
+ Easier to communicate
Basic structure

SAS Core
- Database system ("Engine")
- Programming language

SAS Base
- Data manipulation: DATA, SORT, PRINT, (PLOT)
- Minimal statistics: MEANS, UNIVARIATE, TABULATE

Special modules
- SAS/STAT: TTEST, GLM, GENMOD, etc.
- SAS/GRAPH: GPLOT
- SAS/ASSIST, QC, ETS, FSP, IML, ...
- SAS ANALYST
- SAS Enterprise

SAS in a nutshell

    Program + {raw data or data file}  ->  {data file} + log + output

Batch SAS:
- *.sas  Program file
- *.log  Log file
- *.lst  Output file

SAS Display Manager

Environment for program development and data handling
- Program editor: common or enhanced
- Output window
- Log window
- Graphics window
- Explorer, Viewtable, Toolbar, Results

Note: Program code must be saved
Example

O'Neill et al. (1983): Lung function for 25 patients with cystic fibrosis.

Some of these data may be found in the text file T:\pemax.txt (created using e.g. Wordpad):

    age sex height weight fev1 pemax
    [data values not preserved]

Reading in data (more later on...)

    data sasuser.pemax;
      infile 'T:\pemax.txt' firstobs=2;
      input age sex height weight fev1 pemax;
    run;

To execute the program, we click on the "running man", and then we look at the log file:

    NOTE: 25 records were read from the infile pemax.txt.
          The minimum record length was 21.
          The maximum record length was 21.
    NOTE: The data set SASUSER.PEMAX has 25 observations and 6 variables.
    NOTE: DATA statement used:
          real time 0.11 seconds
          cpu time  0.01 seconds

No output
What if it did not work as intended?

1. Find out why!
2. Correct
3. Try again

SAS is executed sequentially. If we want to add something, we can just do it later.

Recall commands
- When a program bit has been executed, it may sometimes disappear from the program editor
- Earlier bits may be recovered using F4
- Note that the bits accumulate: if you use F4 several times, you will get the previous bits successively, one after another

Definition of new variables, transformation

We want to study body mass index, bmi:

    data sasuser.pemax;
      infile 'T:\pemax.txt' firstobs=2;
      input age sex height weight fev1 pemax;
      bmi=weight/(height/100)**2;
    run;

    proc print data=sasuser.pemax;
    run;

    Obs age sex height weight fev1 pemax bmi
    [output values not preserved]

Transformations

Arithmetic
- The usual operators: + - * /
- Raising to a power: **, e.g. x**2
- Square root: sqrt(x)
- Logarithms: log(x), log10(x), log2(x)
- All logarithms are proportional: $\log_2(x) = \log(x) / \log(2)$

Relations
- = < > <= >= ^= (unequal)
- eq lt gt le ge ne (alternative notation)
- Note: <> means unequal only in WHERE expressions; elsewhere in a DATA step it is the MAX operator

Logical operators: and or not
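The operators and functions listed above can be tried out in a small DATA step. The following is only a sketch: the data set name work.transform_demo, the variable names and the three data values are made up for illustration and do not come from the pemax data.

```sas
/* Hypothetical values, chosen only to illustrate the functions */
data work.transform_demo;
  input x;
  sq      = x**2;            /* raising to a power */
  root    = sqrt(x);         /* square root */
  ln_x    = log(x);          /* natural logarithm */
  lg2_x   = log2(x);         /* base-2 logarithm */
  lg2_alt = log(x)/log(2);   /* the same quantity via the natural logarithm */
  big     = (x >= 4);        /* a relation evaluates to 1 (true) or 0 (false) */
datalines;
1
4
16
;
run;

proc print data=work.transform_demo;
run;
```

Comparing the columns lg2_x and lg2_alt in the printout shows the proportionality between logarithms with different bases.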
Other types of variable definitions

    data sasuser.pemax;
      infile 'T:\pemax.txt' firstobs=2;
      input age sex height weight fev1 pemax;
      bmi=weight/(height/100)**2;
      length csex $ 6; /* in order to avoid truncation */
      if sex=1 then csex='male';
      if sex=2 then csex='female';
      fat=(bmi>18);
    run;

    proc print data=sasuser.pemax;
      var csex age bmi fat;
    run;

    Obs csex age bmi fat
    [output values not preserved]

Ingredients in a DATA step
- Specification line (name of new data set)
- Data source (here: read from file)
- Variables to read in
- Possible calculations
- Possible redefinitions
- To be concluded with run;

Variables
- The columns in a data set
- May be numerical variables (contain numbers) or character variables (contain text strings, letters)
- Values of a character variable are enclosed in quotation marks, e.g. 'male' (except in data files)
- A period (.) denotes a missing value for a numerical variable
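One point worth checking when an indicator such as fat=(bmi>18) is defined: in SAS comparisons a missing numeric value counts as smaller than any number, so a missing bmi gives fat=0 rather than a missing value. The sketch below, with made-up data and the hypothetical data set name work.fat_demo, shows one way to keep missing values missing.

```sas
/* Hypothetical bmi values; the second one is missing */
data work.fat_demo;
  input bmi;
  fat_naive = (bmi > 18);          /* a missing bmi gives 0 here */
  if missing(bmi) then fat = .;    /* keep missing as missing */
  else fat = (bmi > 18);
datalines;
16.5
.
19.2
;
run;

proc print data=work.fat_demo;
run;
```

Whether a missing bmi should count as "not fat" or as unknown is a decision for the analysis; the code only makes the choice explicit.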
Variable names
- SAS does not care about upper/lower case (SEX, sex and Sex refer to the same variable)
- Names may be up to 32 characters long (previously only 8)
- Names may contain English letters, digits and underscore (_), but they are not allowed to start with a digit

Calculation of summary statistics in SAS

    proc means data=sasuser.pemax;
    run;

    The MEANS Procedure
    Variable  N  Mean  Std Dev  Minimum  Maximum
    [rows for age, sex, fev1, pemax, bmi; values not preserved]

These are the defaults; others may be chosen as options.

From the Help pages:

    /* Some of the keywords available with PROC MEANS:
       N        - number of observations
       MEAN     - mean value
       MIN      - minimum value
       MAX      - maximum value
       SUM      - total of values
       NMISS    - number of missing values
       MAXDEC=n - set maximum number of decimal places */

statistic-keyword(s) specifies which statistics to compute and the order to display them in the output. The available keywords in the PROC statement are

Descriptive statistic keywords:
CLM, CSS, CV, KURTOSIS|KURT, LCLM, MAX, MEAN, MIN, N, NMISS, RANGE, SKEWNESS|SKEW, STDDEV|STD, STDERR, SUM, SUMWGT, UCLM, USS, VAR

Quantile statistic keywords:
MEDIAN|P50, P1, P5, P10, Q1|P25, Q3|P75, P90, P95, P99, QRANGE

Hypothesis testing keywords:
PROBT, T
If we want to see the medians:

    proc means data=sasuser.pemax median;
      var age bmi fev1;
    run;

    The MEANS Procedure
    Variable  Median
    [rows for age, bmi, fev1; values not preserved]

Oops: Now we got only the median!

    proc means data=sasuser.pemax N mean median;
      var age bmi fev1;
    run;

    The MEANS Procedure
    Variable  N  Mean  Median
    [rows for age, bmi, fev1; values not preserved]

Sorting the data

Often used because other procedures demand this.

Example:

    proc sort data=sasuser.pemax out=sorted_pemax;
      by sex descending weight;
    run;

If out=xxx is omitted, the original data set will be replaced by the sorted data. Note the option DESCENDING in front of weight.
The BY statement
- may be found in many procedures (MEANS, REG, GLM, ...)
- performs the analyses within each group separately
- demands sorted data

    proc sort data=sasuser.pemax;
      by sex;
    run;

    proc means;
      where sex ne .;
      by sex;
    run;

Remember to exclude missing values, otherwise they will form a separate group.

Output data set

    proc sort data=sasuser.pemax;
      by sex;
    run;

    proc means noprint data=sasuser.pemax mean;
      by sex;
      var age bmi fev1;
      output out=summa mean=mage mbmi mfev1;
    run;

    proc print data=summa; /* a temporary data set */
    run;

    Obs sex _TYPE_ _FREQ_ mage mbmi mfev1
    [output values not preserved]

Alternative procedure: UNIVARIATE

    proc univariate data=sasuser.pemax normal;
      var bmi;
    run;

A lot of output... (shown on the next page)

Several tests for normality are created by the option normal:

    Tests for Normality
    Test                 Statistic   p Value
    Shapiro-Wilk         W           Pr < W
    Kolmogorov-Smirnov   D           Pr > D
    Cramer-von Mises     W-Sq        Pr > W-Sq
    Anderson-Darling     A-Sq        Pr > A-Sq   >0.2500
    [remaining values not preserved]
The UNIVARIATE Procedure
Variable: bmi

[Output listing; most values not preserved. Sections: Moments (N 25, Sum Weights 25, Mean, Sum Observations, Std Deviation, Variance, Skewness, Kurtosis, Uncorrected SS, Corrected SS, Coeff Variation, Std Error Mean); Basic Statistical Measures (Location: Mean, Median, Mode; Variability: Std Deviation, Variance, Range, Interquartile Range); Quantiles (Definition 5), from 100% Max to 0% Min; Extreme Observations (lowest and highest values with observation numbers).]

    Tests for Location: Mu0=0
    Test          Statistic   p Value
    Student's t   t           Pr > |t|    <.0001
    Sign          M   12.5    Pr >= |M|   <.0001
    Signed Rank   S           Pr >= |S|   [not preserved]

Tables

One-way tables:

    proc freq;
      tables csex;
    run;

    csex  Frequency  Percent  Cumulative Frequency  Cumulative Percent
    [rows for female and male; values not preserved]

Two-way tables (cross tabulation):

    proc freq;
      tables csex*fat / nopercent nocol;
    run;

    Table of csex by fat
    [cells show Frequency and Row Pct; rows female, male, Total; columns fat = 0, 1, Total; values not preserved]

Filtering data (selecting subsets)

In the DATA step
- regarding observations: IF, WHERE, DELETE
- regarding variables: DROP, KEEP

In procedures
- regarding observations: WHERE
- regarding variables: VAR statement (depending on procedure)
WHERE

If we only want to look at the girls:

    data pemax; /* temporary data set */
      set sasuser.pemax;
      where csex='female';
    run;

    proc print data=pemax;
      var csex age bmi;
    run;

    Obs csex age bmi
    [rows with csex = female; values not preserved]

IF, DELETE

Alternative ways of writing:

    if csex='female';
    if csex ne 'male';

Look out if data contain missing values!

    if csex='male' then delete;

Filterings may be combined:

    data pemax;
      set sasuser.pemax;
      where csex='female' and age>12;
    run;

    proc print data=pemax;
      var csex age bmi;
    run;

    Obs csex age bmi
    [rows with csex = female; values not preserved]
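The warning about missing values can be made concrete with a small sketch. The data set names work.demo, work.not_male and work.female, the variable id and the three data lines are made up for illustration. The third person has a missing sex, so csex stays blank.

```sas
/* Hypothetical data: id 3 has a missing value for sex */
data work.demo;
  length csex $ 6;
  input id sex;
  if sex=1 then csex='male';
  if sex=2 then csex='female';
datalines;
1 1
2 2
3 .
;
run;

data work.not_male;   /* keeps id 2 and id 3: a blank csex is not 'male' */
  set work.demo;
  if csex ne 'male';
run;

data work.female;     /* keeps id 2 only */
  set work.demo;
  if csex='female';
run;
```

The two subsetting IF statements therefore select the same observations only when sex is never missing.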
DROP and KEEP

Now that we have bmi, we may not need height and weight:

    data pemax;
      set sasuser.pemax;
      drop height weight;
    run;

If you only want to keep a single variable:

    data pemax;
      set sasuser.pemax (keep=bmi);
    run;

WHERE in procedures

If, for a specific procedure, we only want to look at the girls (but continue to work with all data):

    proc print data=sasuser.pemax;
      where csex='female';
      var csex age bmi;
    run;

    Obs csex age bmi
    [rows with csex = female; values not preserved]

Here, we have created no new data set, neither temporary nor permanent.

Reading in
- from file
- data lines directly in program
- character variables
- columns or free format
- data separation
- missing values
- import from Excel
Data lines directly in program

    data sasuser.pemax;
      input age sex height weight fev1 pemax;
      bmi=weight/(height/100)**2;
    datalines;
    [data lines not preserved]
    ;
    run;

Reading in character variables

    data sasuser.pemax;
      length sex $ 6; /* to avoid truncation */
      input age sex $ height weight fev1 pemax;
      bmi=weight/(height/100)**2;
    datalines;
    [data lines not preserved; sex is given as male or female]
    ;
    run;

Semicolon separated data

Until now, data have been nicely separated by blanks. Now it looks a bit different...

    age;sex;height;weight;fev1;pemax
    7;male;109;13.1;32;95
    7;female;112;12.9;19;85
    8;male;124;14.1;22;100
    8;female;125;16.2;41;

    data sasuser.pemax;
      * we now specify a list of possible delimiters;
      infile 'pemax2.txt' firstobs=2 dlm=';';
      input age sex $ height weight fev1 pemax;
    run;
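Semicolon separated data can also be tried out without an external file. Because a semicolon normally ends the block of data lines, the sketch below uses DATALINES4, which ends the data with four semicolons instead. The data set name work.pemax_semi is made up; the three data lines are the complete rows shown above.

```sas
data work.pemax_semi;
  infile datalines4 dlm=';' dsd;   /* read the in-stream lines, semicolon delimited */
  length sex $ 6;                  /* to avoid truncation of 'female' */
  input age sex $ height weight fev1 pemax;
datalines4;
7;male;109;13.1;32;95
7;female;112;12.9;19;85
8;male;124;14.1;22;100
;;;;
run;

proc print data=work.pemax_semi;
run;
```

No firstobs=2 is needed here, since the in-stream data have no header line.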
Formatted input

Now, the values are not separated at all! (Often useful for many binary observations, e.g. questionnaire data.)

    data sasuser.pemax;
      length sex $ 1;
      input age 1-2 sex $ 3 height 4-6 weight 7-10 fev1 11-12 pemax 13-15;
    datalines;
    [data lines not preserved]
    ;
    run;

Missing values
- Numeric variables (numbers): must be . (period)
- Character variables (letters): NA, missing value, etc.
- Blanks?? Will of course not work if blanks are used as delimiters
- Take care with -9, 999 etc.

Example 1: List input, numeric variables

    data sasuser.pemax;
      input age sex height weight fev1 pemax;
    datalines;
    [data lines not preserved]
    ;
    run;
Example 2: Formatted input, blanks

    data sasuser.pemax;
      length sex $ 1;
      input age 1-2 sex $ 3 height 4-6 weight 7-10 fev1 11-12 pemax 13-15;
    datalines;
    [data lines not preserved]
    ;
    run;

Example 3: Semicolon separated, no symbols for missing

    data sasuser.pemax;
      infile 'pemax2.tal' firstobs=2 dlm=';,' dsd;
      input age csex $ height weight fev1 pemax;
      bmi=weight/(height/100)**2;
    run;

Data lines in the file:

    7;male,;13.1;32;95
    7;female;112;12.9;19;85
    8;;124;14.1;22;

Option dsd means:

DSD changes how SAS treats delimiters when list input is used and sets the default delimiter to a comma. When you specify DSD, SAS treats two consecutive delimiters as a missing value.

General options (e.g. in the first line of the program):

    options obs=100 nocenter linesize=75 pagesize=60 missing='-';

- OBS=: Number of observations (from start) to be included in the analyses
- CENTER/NOCENTER: Output appearance
- LINESIZE=: Maximum number of characters on each line (at most 256)
- PAGESIZE=: Maximum number of lines on each page
- MISSING=: Specifies the symbol for missing values (default is .)
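When missing values have been coded as numbers such as -9 or 999, SAS will treat them as ordinary values unless they are recoded. A minimal sketch, assuming that -9 and 999 are the codes used for a missing weight; the data set name work.recode_demo and the data lines are made up for illustration.

```sas
/* Hypothetical data: -9 and 999 are assumed to mean "missing" */
data work.recode_demo;
  input age weight;
  if weight in (-9, 999) then weight = .;   /* recode to a proper missing value */
datalines;
7 13.1
8 -9
9 999
;
run;

proc means data=work.recode_demo n nmiss mean;
  var weight;
run;
```

Without the recoding, the codes would enter the mean as if they were real weights.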
Excel files may be imported directly to SAS

    proc import datafile='tables.xls'
                out=sasuser.tables dbms=excel2000 replace;
      getnames=yes;
    run;

SAS is a programming language

A program is like a "knitting recipe": a series of instructions which have to be executed in a specific order.

Note: SAS is not a spreadsheet. Output does not change when you change the data; you have to rerun the program to update it.

A simple SAS program

    /* DATA step */
    data sasuser.pemax;
      infile 'N:\pemax.txt' firstobs=2;
      input age sex height weight fev1 pemax;
      bmi=weight/(height/100)**2;
    run;

    /* PROC steps */
    proc print data=sasuser.pemax;
    run;

    proc means data=sasuser.pemax;
      var age bmi;
    run;
DATA steps and PROC steps

A SAS program distinguishes between two types of operations ("steps"):
- DATA steps, which define data sets: reading from text files, calculation of derived variables, selection of cases, etc.
- PROC steps, which contain standard procedures, operating on data sets. Note: it is in general not possible to calculate anything in a PROC step.

Traditionally, a SAS program is arranged so that the DATA step is at the top, but they may be mixed, if you define new data sets along the way.

Basics regarding SAS language
- Almost everything (except for calculations in a DATA step) starts with a keyword and ends with a semicolon
- Statements are bits of code separated by semicolons

    OPTIONS ls=80;
    PROC GLM data=sasuser.pemax;
      MODEL height = age / solution;
    RUN;

- Keywords: OPTIONS, PROC, GLM, MODEL, RUN
- Certain statements belong together in blocks

Things to keep in mind
- The slash (/) is often used to mark the start of options
- Semicolon and slash are necessary: solution is not a variable name and run is not an option.
Formatting code / designing programs
- SAS generally does not care about extra blanks and line breaks
- It is, however, considered good practice to write at most one statement on each line
- Indenting makes the code considerably easier to read. (Sooner or later, you will end up reading your own old code!)

How to organize your analyses etc.?

If only we knew. But it is to some extent a matter of taste. Some thoughts, though...
- Interactive program execution is easy, but may be dangerous! Do you remember what was done? What if you have to correct just a few data values?
- Remember to save the code, at least for the most important analyses.
- Collect bits and pieces into a more coherent program and save it as a .sas file, which can stand alone
- Look out for carry-over effects when using Display Manager. Try out your program in a fresh SAS session.
- Save the log files too. They are as important as the output files.

SAS libraries
- SAS is born with four libraries, the most important being WORK and SASUSER
- Look at Properties in the Explorer to see exactly where they are located
- WORK is a temporary library, which means that it disappears (with all its contents) when SAS is closed. These data files are denoted work.pemax or simply pemax
- SASUSER is permanent. Data sets stored here will also be there the next time you enter SAS. These files are denoted sasuser.pemax
Private libraries, LIBNAME

If you want a separate library for each project (surely the best in the long run), you will have to use a LIBNAME statement, e.g.

    libname mysas 'p:\paper1\sasdata';
    data mysas.pemax;
      infile ...

Note: The folder has to be created before use! This LIBNAME statement may be saved in autoexec.sas (too advanced for this course).

SAS data set
- The first part (before the period) is the SAS library
- The second part is the name of the data set
- If the first part is omitted, the library is taken to be WORK
- Seen from Windows' point of view, the data files have the extension sas7bdat
- Data sets have two logical parts, a descriptive part and the data itself
- PROC CONTENTS and PROC PRINT, respectively, will show these

PROC CONTENTS

    The CONTENTS Procedure

    Data Set Name: SASUSER.PEMAX                     Observations:         25
    Member Type:   DATA                              Variables:            6
    Engine:        V8                                Indexes:              0
    Created:       14:13 Wednesday, April 14, 2004   Observation Length:   48
    Last Modified: 14:13 Wednesday, April 14, 2004   Deleted Observations: 0
    Protection:                                      Compressed:           NO
    Data Set Type:                                   Sorted:               NO
    Label:
    -----Engine/Host Dependent Information-----

    Data Set Page Size:         8192
    Number of Data Set Pages:   1
    First Data Page:            1
    Max Obs per Page:           169
    Obs in First Data Page:     25
    Number of Data Set Repairs: 0
    File Name:                  /saswork/sas_workc3ca _ rasch/pemax.sas7bdat
    Release Created:            M0
    Host Created:               SunOS
    Inode Number:               7884
    Access Permission:          rw-r--r--
    Owner Name:                 lts
    File Size (bytes):          [not preserved]

    -----Alphabetic List of Variables and Attributes-----

    #  Variable  Type  Len  Pos
    [rows for age, fev1, height, pemax, sex, weight, all of type Num; Len and Pos preserved only for weight: 8 and 24]

Combination of data sets
- vertically: more cases, identical variables
- horizontally (merging): new variables, same cases

More cases is easy: it is possible to have several data sets in the same SET statement of a DATA step.

Example: combining two groups:

    data all;
      set group1 group2;
    run;

If the data sets do not have exactly the same variables, missing values are filled in.

Merging (horizontal combination)
- the data sets ought to have a common key variable, e.g. id
- all data sets have to be sorted according to id

    proc sort data=info1;
      by id;
    run;

    proc sort data=info2;
      by id;
    run;

    data info;
      merge info1 info2;
      by id;
    run;

BY may be omitted, but only if data are complete and are sorted in the same way (not recommended, take good care!)
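When the two data sets do not contain exactly the same ids, it can be useful to record where each merged observation came from. The sketch below uses the IN= data set option for this. The contents of info1 and info2, the work library and the variable names in1, in2 and both are assumptions made for illustration.

```sas
/* Hypothetical data sets sharing the key variable id */
data work.info1;
  input id age;
datalines;
1 7
2 8
3 9
;
run;

data work.info2;
  input id bmi;
datalines;
1 11.0
3 12.5
4 13.1
;
run;

proc sort data=work.info1; by id; run;
proc sort data=work.info2; by id; run;

data work.info;
  merge work.info1 (in=in1) work.info2 (in=in2);
  by id;
  both = (in1 and in2);   /* 1 if the id occurs in both data sets, else 0 */
run;

proc print data=work.info;
run;
```

In this example id 2 gets a missing bmi and id 4 a missing age, and both is 0 for these two observations. Adding the statement `if both;` would keep only the ids found in both data sets.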
Lists of variables

Sometimes you may want to refer to many variables at one time, e.g. in case of repeated measurements or just many connected variables:

    proc freq;
      tables ques1-ques392;
    run;

- Names looking alike: x1-x20
- SAS order of appearance: age--weight
- All character variables: _CHARACTER_
- All numerical variables: _NUMERIC_
- All variables: _ALL_

Formats: Information on how to read or write a variable
- Built-in formats (numerical, dates, characters)
- User defined formats

Why use formats?
- Nicer looking output
- Grouping for creation of tables
- Protection against errors in data

Formats, continued
- Standard formats: 10.3, best12., E12., $10., date10., yymmdd10.
- They always contain a period (keep that in mind!)
- A format is associated permanently with the variable in the DATA step, or specified ad hoc with a FORMAT statement in PROC steps
- User defined formats are created with PROC FORMAT
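As an illustration of "grouping for creation of tables", a user defined format can collapse a numeric variable into intervals without changing the data. This is only a sketch: the format name agegrp, its cut points and its labels are made up, and the data set sasuser.pemax is assumed to exist as created earlier.

```sas
proc format;
  value agegrp low -< 10  = 'under 10'
               10  -< 15  = '10 to 14'
               15  - high = '15 and above';
run;

proc freq data=sasuser.pemax;
  tables age;
  format age agegrp.;   /* ad hoc format: the table is built on the grouped values */
run;
```

The stored values of age are unchanged; only this procedure sees the grouped version.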
Example of use of formats

    proc format;
      invalue sexin 'M'=1 'F'=2;
      value sexout 1='male' 2='female';
    run;

    data;
      informat sex sexin.;
      format sex sexout.;
      input sex;
    datalines;
    M
    F
    m
    ;
    run;

    proc print;
    run;

which creates the output:

    Obs  sex
      1  male
      2  female
      3  .

Example of date formats:

    data;
      input x yymmdd10.; /* this covers several formats */
      format x yymmddd.;
    cards;
    [data lines not preserved]
    ;
    run;

    proc print;
    run;

    Obs  x
    [output values not preserved]

The actual value is the time in days since 1 January 1960.

Times in SAS

Example with longitudinal measurements of blood pressure during the day:

    data longitudinal;
      informat time TIME10.2;
      input person time bp;
    datalines;
    [data lines not preserved]
    ;
    run;

    proc print data=longitudinal;
    run;

    Obs  time  person  bp
    [output values not preserved]
We wish to refer to time since start of treatment

First we have to pick out the starting times, i.e. the first observations for each individual:

    data starttimes;
      set longitudinal;
      by person;
      if first.person;
      start=time;
    run;

    proc print data=starttimes;
    run;

    Obs  time  person  bp  start
    [output values not preserved]

and then we have to merge the two data sets, and calculate differences:

    data merge_two;
      merge longitudinal starttimes;
      by person;
      timer=(time-start)/60**2;
    run;

    proc print data=merge_two;
    run;

    Obs  time  person  bp  start  timer
    [output values not preserved]
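The same time differences can also be obtained in a single DATA step by carrying the starting time along with a RETAIN statement. This is an alternative sketch, not the approach shown on the slides. The data set name timed is made up, and the data set longitudinal is assumed to exist as defined earlier, with time stored in seconds.

```sas
proc sort data=longitudinal;
  by person time;
run;

data timed;
  set longitudinal;
  by person;
  retain start;                         /* keep the value of start across observations */
  if first.person then start = time;    /* first measurement for each person */
  timer = (time - start) / 3600;        /* hours since start, same as /60**2 */
run;

proc print data=timed;
run;
```

The two-step version with a separate data set of starting times is easier to inspect along the way, while the RETAIN version avoids the merge.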
Baruch College STA 9750 BOOK BUY A Predictive Mode Senem Acet Coskun Table of Contents Summary 3 Why this topic? 4 Data Sources 6 Variable Definitions 7 Descriptive Statistics 8 Univariate Analysis 9 Two-Sample
More informationBase and Advance SAS
Base and Advance SAS BASE SAS INTRODUCTION An Overview of the SAS System SAS Tasks Output produced by the SAS System SAS Tools (SAS Program - Data step and Proc step) A sample SAS program Exploring SAS
More informationAn Introduction to SAS University Edition
An Introduction to SAS University Edition Ron Cody From An Introduction to SAS University Edition. Full book available for purchase here. Contents List of Programs... xi About This Book... xvii About the
More informationIntermediate SAS: Working with Data
Intermediate SAS: Working with Data OIT Technical Support Services 293-4444 oithelp@mail.wvu.edu oit.wvu.edu/training/classmat/sas/ Table of Contents Getting set up for the Intermediate SAS workshop:...
More informationIntroducing a Colorful Proc Tabulate Ben Cochran, The Bedford Group, Raleigh, NC
Paper S1-09-2013 Introducing a Colorful Proc Tabulate Ben Cochran, The Bedford Group, Raleigh, NC ABSTRACT Several years ago, one of my clients was in the business of selling reports to hospitals. He used
More informationSAS Programming Basics
SAS Programming Basics SAS Programs SAS Programs consist of three major components: Global statements Procedures Data steps SAS Programs Global Statements Procedures Data Step Notes Data steps and procedures
More informationLearning SAS by Example
Learning SAS by Example A Programmer's Guide Second Edition.sas Ron Cody The correct bibliographic citation for this manual is as follows: Cody, Ron. 2018. Learning SAS by Example: A Programmer's Guide,
More informationSTAT 3304/5304 Introduction to Statistical Computing. Introduction to SAS
STAT 3304/5304 Introduction to Statistical Computing Introduction to SAS What is SAS? SAS (originally an acronym for Statistical Analysis System, now it is not an acronym for anything) is a program designed
More informationSAS/STAT 13.1 User s Guide. The NESTED Procedure
SAS/STAT 13.1 User s Guide The NESTED Procedure This document is an individual chapter from SAS/STAT 13.1 User s Guide. The correct bibliographic citation for the complete manual is as follows: SAS Institute
More informationApril 4, SAS General Introduction
PP 105 Spring 01-02 April 4, 2002 SAS General Introduction TA: Kanda Naknoi kanda@stanford.edu Stanford University provides UNIX computing resources for its academic community on the Leland Systems, which
More informationPART I: USING SAS FOR THE PC AN OVERVIEW 1.0 INTRODUCTION
PART I: USING SAS FOR THE PC AN OVERVIEW 1.0 INTRODUCTION The statistical package SAS (Statistical Analysis System) is a large computer program specifically designed to do statistical analyses. It is very
More informationWithin-Cases: Multivariate approach part one
Within-Cases: Multivariate approach part one /* sleep2.sas */ options linesize=79 noovp formdlim=' '; title "Student's Sleep data: Matched t-tests with proc reg"; data bedtime; infile 'studentsleep.data'
More informationIntroduction to STATA 6.0 ECONOMICS 626
Introduction to STATA 6.0 ECONOMICS 626 Bill Evans Fall 2001 This handout gives a very brief introduction to STATA 6.0 on the Economics Department Network. In a few short years, STATA has become one of
More informationIntroduction to Stata - Session 2
Introduction to Stata - Session 2 Siv-Elisabeth Skjelbred ECON 3150/4150, UiO January 26, 2016 1 / 29 Before we start Download auto.dta, auto.csv from course home page and save to your stata course folder.
More information3. Almost always use system options options compress =yes nocenter; /* mostly use */ options ps=9999 ls=200;
Randy s SAS hints, updated Feb 6, 2014 1. Always begin your programs with internal documentation. * ***************** * Program =test1, Randy Ellis, first version: March 8, 2013 ***************; 2. Don
More informationGetting Up to Speed with PROC REPORT Kimberly LeBouton, K.J.L. Computing, Rossmoor, CA
SESUG 2012 Paper HW-01 Getting Up to Speed with PROC REPORT Kimberly LeBouton, K.J.L. Computing, Rossmoor, CA ABSTRACT Learning the basics of PROC REPORT can help the new SAS user avoid hours of headaches.
More informationSAS 101. Based on Learning SAS by Example: A Programmer s Guide Chapter 21, 22, & 23. By Tasha Chapman, Oregon Health Authority
SAS 101 Based on Learning SAS by Example: A Programmer s Guide Chapter 21, 22, & 23 By Tasha Chapman, Oregon Health Authority Topics covered All the leftovers! Infile options Missover LRECL=/Pad/Truncover
More information2. Don t forget semicolons and RUN statements The two most common programming errors.
Randy s SAS hints March 7, 2013 1. Always begin your programs with internal documentation. * ***************** * Program =test1, Randy Ellis, March 8, 2013 ***************; 2. Don t forget semicolons and
More informationLecture 1 Getting Started with SAS
SAS for Data Management, Analysis, and Reporting Lecture 1 Getting Started with SAS Portions reproduced with permission of SAS Institute Inc., Cary, NC, USA Goals of the course To provide skills required
More informationFactorial ANOVA with SAS
Factorial ANOVA with SAS /* potato305.sas */ options linesize=79 noovp formdlim='_' ; title 'Rotten potatoes'; title2 ''; proc format; value tfmt 1 = 'Cool' 2 = 'Warm'; data spud; infile 'potato2.data'
More informationssh tap sas913 sas
Fall 2010, STAT 430 SAS Examples SAS9 ===================== ssh abc@glue.umd.edu tap sas913 sas https://www.statlab.umd.edu/sasdoc/sashtml/onldoc.htm a. Reading external files using INFILE and INPUT (Ch
More information5b. Descriptive Statistics - Part II
5b. Descriptive Statistics - Part II In this lab we ll cover how you can calculate descriptive statistics that we discussed in class. We also learn how to summarize large multi-level databases efficiently,
More informationData-Analysis Exercise Fitting and Extending the Discrete-Time Survival Analysis Model (ALDA, Chapters 11 & 12, pp )
Applied Longitudinal Data Analysis Page 1 Data-Analysis Exercise Fitting and Extending the Discrete-Time Survival Analysis Model (ALDA, Chapters 11 & 12, pp. 357-467) Purpose of the Exercise This data-analytic
More informationEXST SAS Lab Lab #8: More data step and t-tests
EXST SAS Lab Lab #8: More data step and t-tests Objectives 1. Input a text file in column input 2. Output two data files from a single input 3. Modify datasets with a KEEP statement or option 4. Prepare
More informationFSEDIT Procedure Windows
25 CHAPTER 4 FSEDIT Procedure Windows Overview 26 Viewing and Editing Observations 26 How the Control Level Affects Editing 27 Scrolling 28 Adding Observations 28 Entering and Editing Variable Values 28
More informationLand Cover Stratified Accuracy Assessment For Digital Elevation Model derived from Airborne LIDAR Dade County, Florida
Land Cover Stratified Accuracy Assessment For Digital Elevation Model derived from Airborne LIDAR Dade County, Florida FINAL REPORT Submitted October 2004 Prepared by: Daniel Gann Geographic Information
More informationCreating a data file and entering data
4 Creating a data file and entering data There are a number of stages in the process of setting up a data file and analysing the data. The flow chart shown on the next page outlines the main steps that
More informationTHIS IS NOT REPRESNTATIVE OF CURRENT CLASS MATERIAL. STOR 455 Midterm 1 September 28, 2010
THIS IS NOT REPRESNTATIVE OF CURRENT CLASS MATERIAL STOR 455 Midterm September 8, INSTRUCTIONS: BOTH THE EXAM AND THE BUBBLE SHEET WILL BE COLLECTED. YOU MUST PRINT YOUR NAME AND SIGN THE HONOR PLEDGE
More informationCH5: CORR & SIMPLE LINEAR REFRESSION =======================================
STAT 430 SAS Examples SAS5 ===================== ssh xyz@glue.umd.edu, tap sas913 (old sas82), sas https://www.statlab.umd.edu/sasdoc/sashtml/onldoc.htm CH5: CORR & SIMPLE LINEAR REFRESSION =======================================
More informationWHO STEPS Surveillance Support Materials. STEPS Epi Info Training Guide
STEPS Epi Info Training Guide Department of Chronic Diseases and Health Promotion World Health Organization 20 Avenue Appia, 1211 Geneva 27, Switzerland For further information: www.who.int/chp/steps WHO
More informationA Brief Tour of SAS. The SAS Desktop
A Brief Tour of SAS SAS is one of the most versatile and comprehensive statistical software packages available today, with data management, analysis, and graphical capabilities. It is great at working
More informationSAS Display Manager Windows. For Windows
SAS Display Manager Windows For Windows Computers with SAS software SSCC Windows Terminal Servers (Winstat) Linux Servers (linstat) Lab computers DoIT Info Labs (as of June 2014) In all Labs with Windows
More informationdata Vote; /* Read a CSV file */ infile 'c:\users\yuen\documents\6250\homework\hw1\political.csv' dsd; input state $ Party $ Age; run;
Chapter 3 2. data Vote; /* Read a CSV file */ infile 'c:\users\yuen\documents\6250\homework\hw1\political.csv' dsd; input state $ Party $ Age; title "Listing of Vote data set"; /* compute frequencies for
More informationChapter 3 Analyzing Normal Quantitative Data
Chapter 3 Analyzing Normal Quantitative Data Introduction: In chapters 1 and 2, we focused on analyzing categorical data and exploring relationships between categorical data sets. We will now be doing
More informationExample how not to do it: JMP in a nutshell 1 HR, 17 Apr Subject Gender Condition Turn Reactiontime. A1 male filler
JMP in a nutshell 1 HR, 17 Apr 2018 The software JMP Pro 14 is installed on the Macs of the Phonetics Institute. Private versions can be bought from
More informationGetting Your Data into SAS The Basics. Math 3210 Dr. Zeng Department of Mathematics California State University, Bakersfield
Getting Your Data into SAS The Basics Math 3210 Dr. Zeng Department of Mathematics California State University, Bakersfield Outline Getting data into SAS -Entering data directly into SAS -Creating SAS
More informationIntermediate SAS: Statistics
Intermediate SAS: Statistics OIT TSS 293-4444 oithelp@mail.wvu.edu oit.wvu.edu/training/classmat/sas/ Table of Contents Procedures... 2 Two-sample t-test:... 2 Paired differences t-test:... 2 Chi Square
More informationIntroduction to SAS Mike Zdeb ( , #61
Mike Zdeb (402-6479, msz03@albany.edu) #61 FORMAT, you can design informats for reading and interpreting non-standard data, and you can design formats for displaying data in non-standard ways....example
More informationChapter 1: Introduction to SAS
Chapter 1: Introduction to SAS SAS programs: A sequence of statements in a particular order. Rules for SAS statements: 1. Every SAS statement ends in a semicolon!!!; 2. Upper/lower case does not matter
More informationLoading Data. Introduction. Understanding the Volume Grid CHAPTER 2
19 CHAPTER 2 Loading Data Introduction 19 Understanding the Volume Grid 19 Loading Data Representing a Complete Grid 20 Loading Data Representing an Incomplete Grid 21 Loading Sparse Data 23 Understanding
More informationIntroduction to Statistical Analyses in SAS
Introduction to Statistical Analyses in SAS Programming Workshop Presented by the Applied Statistics Lab Sarah Janse April 5, 2017 1 Introduction Today we will go over some basic statistical analyses in
More informationWriting Programs in SAS Data I/O in SAS
Writing Programs in SAS Data I/O in SAS Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Writing SAS Programs Your SAS programs can be written in any text editor, though you will often want
More informationIntroduction to Stata
Introduction to Stata Introduction In introductory biostatistics courses, you will use the Stata software to apply statistical concepts and practice analyses. Most of the commands you will need are available
More informationPHPM 672/677 Lab #2: Variables & Conditionals Due date: Submit by 11:59pm Monday 2/5 with Assignment 2
PHPM 672/677 Lab #2: Variables & Conditionals Due date: Submit by 11:59pm Monday 2/5 with Assignment 2 Overview Most assignments will have a companion lab to help you learn the task and should cover similar
More informationTYPES OF VARIABLES, STRUCTURE OF DATASETS, AND BASIC STATA LAYOUT
PRIMER FOR ACS OUTCOMES RESEARCH COURSE: TYPES OF VARIABLES, STRUCTURE OF DATASETS, AND BASIC STATA LAYOUT STEP 1: Install STATA statistical software. STEP 2: Read through this primer and complete the
More informationAPPENDIX 4 Migrating from QMF to SAS/ ASSIST Software. Each of these steps can be executed independently.
255 APPENDIX 4 Migrating from QMF to SAS/ ASSIST Software Introduction 255 Generating a QMF Export Procedure 255 Exporting Queries from QMF 257 Importing QMF Queries into Query and Reporting 257 Alternate
More informationPrepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.
Chapter 2 2.1 Descriptive Statistics A stem-and-leaf graph, also called a stemplot, allows for a nice overview of quantitative data without losing information on individual observations. It can be a good
More informationCell means coding and effect coding
Cell means coding and effect coding /* mathregr_3.sas */ %include 'readmath.sas'; title2 ''; /* The data step continues */ if ethnic ne 6; /* Otherwise, throw the case out */ /* Indicator dummy variables
More information1 Downloading files and accessing SAS. 2 Sorting, scatterplots, correlation and regression
Statistical Methods and Computing, 22S:30/105 Instructor: Cowles Lab 2 Feb. 6, 2015 1 Downloading files and accessing SAS. We will be using the billion.dat dataset again today, as well as the OECD dataset
More informationCHAPTER 6. The Normal Probability Distribution
The Normal Probability Distribution CHAPTER 6 The normal probability distribution is the most widely used distribution in statistics as many statistical procedures are built around it. The central limit
More informationSAS CURRICULUM. BASE SAS Introduction
SAS CURRICULUM BASE SAS Introduction Data Warehousing Concepts What is a Data Warehouse? What is a Data Mart? What is the difference between Relational Databases and the Data in Data Warehouse (OLTP versus
More informationOpening a Data File in SPSS. Defining Variables in SPSS
Opening a Data File in SPSS To open an existing SPSS file: 1. Click File Open Data. Go to the appropriate directory and find the name of the appropriate file. SPSS defaults to opening SPSS data files with
More information