(on CQUEST) A.L. Gibbs

Similar documents
(on CQUEST) A.L. Gibbs

Some Basics of CQUEST

STA 303 / 1002 Using SAS on CQUEST

ST Lab 1 - The basics of SAS

A Step by Step Guide to Learning SAS

INTRODUCTION TO SAS STAT 525 FALL 2013

CS Fundamentals of Programming II Fall Very Basic UNIX

Using LINUX a BCMB/CHEM 8190 Tutorial Updated (1/17/12)

When you first log in, you will be placed in your home directory. To see what this directory is named, type:

Running Sentaurus on the DOE Network

CS 215 Fundamentals of Programming II Spring 2019 Very Basic UNIX

STAT:5400 Computing in Statistics

SAS Training Spring 2006

Stat 5411 Lab 1 Fall Assignment: Turn in copies of bike.sas and bike.lst from your SAS run. Turn this in next Friday with your assignment.

CpSc 1111 Lab 1 Introduction to Unix Systems, Editors, and C

Session 1: Accessing MUGrid and Command Line Basics

Using the Zoo Workstations

Unix Tutorial Haverford Astronomy 2014/2015

Tiny Instruction Manual for the Undergraduate Mathematics Unix Laboratory

CSE 391 Lecture 3. bash shell continued: processes; multi-user systems; remote login; editors

Parallel Programming Pre-Assignment. Setting up the Software Environment

Introduction. Overview of 201 Lab and Linux Tutorials. Stef Nychka. September 10, Department of Computing Science University of Alberta

Linux Survival Guide

Linux/Cygwin Practice Computer Architecture

Introduction to Linux and Supercomputers

Network Monitoring & Management. A few Linux basics

CSE Linux VM. For Microsoft Windows. Based on opensuse Leap 42.2

Linux Essentials. Programming and Data Structures Lab M Tech CS First Year, First Semester

CS CS Tutorial 2 2 Winter 2018

CSE 390a Lecture 3. bash shell continued: processes; multi-user systems; remote login; editors

Oregon State University School of Electrical Engineering and Computer Science. CS 261 Recitation 1. Spring 2011

Chapter-3. Introduction to Unix: Fundamental Commands

: the User (owner) for this file (your cruzid, when you do it) Position: directory flag. read Group.

STA 248 S: Some R Basics

You can use the WinSCP program to load or copy (FTP) files from your computer onto the Codd server.

Linux hep.wisc.edu

User Guide Version 2.0

/23/2004 TA : Jiyoon Kim. Recitation Note 1

CSC 112 Lab 1: Introduction to Unix and C++ Fall 2009

Short Read Sequencing Analysis Workshop

CS 2400 Laboratory Assignment #1: Exercises in Compilation and the UNIX Programming Environment (100 pts.)

Lab 1 Introduction to UNIX and C

Introductory SAS example

Introduction to Linux

Unix/Linux Operating System. Introduction to Computational Statistics STAT 598G, Fall 2011

Lab Working with Linux Command Line

LAB #5 Intro to Linux and Python on ENGR

EKT332 COMPUTER NETWORK

Brief Linux Presentation. July 10th, 2006 Elan Borenstein

Lecture # 2 Introduction to UNIX (Part 2)

Getting started with Hugs on Linux

Booting a Galaxy Instance

Logging in to the CRAY

You should see something like this, called the prompt :

CSCI 161: Introduction to Programming I Lab 1a: Programming Environment: Linux and Eclipse

History. Terminology. Opening a Terminal. Introduction to the Unix command line GNOME

Hitchhiker s Guide to VLSI Design with Cadence & Synopsys

Introduction to Unix - Lab Exercise 0

Getting Started with UNIX

Part I. Introduction to Linux

LAB #7 Linux Tutorial

Using UNIX. -rwxr--r-- 1 root sys Sep 5 14:15 good_program

Introduction to the Linux Command Line

Introduction to UNIX. Logging in. Basic System Architecture 10/7/10. most systems have graphical login on Linux machines

Contents. Note: pay attention to where you are. Note: Plaintext version. Note: pay attention to where you are... 1 Note: Plaintext version...

Lab 1 Introduction to UNIX and C

CS370 Operating Systems

Introduction. SSH Secure Shell Client 1

Introduction to Linux and Cluster Computing Environments for Bioinformatics

Introduction to the UNIX command line

Linux Command Line Primer. By: Scott Marshall

Firewalls can prevent access to the Unix Servers. Please make sure any firewall software or hardware allows access through Port 22.

COMS 6100 Class Notes 3

Lab 2: Linux/Unix shell

Tutorial 1: Unix Basics

Helpful Tips for Labs. CS140, Spring 2015

Unix File System. Learning command-line navigation of the file system is essential for efficient system usage

Using WestGrid from the desktop Oct on Access Grid

THE HONG KONG POLYTECHNIC UNIVERSITY Department of Electronic and Information Engineering

Intro. To Unix commands. What are the machines? Very basics

Getting started with Hugs on Linux

Linux Bootcamp Fall 2015

Getting Started With UNIX Lab Exercises

Introduction to Stata: An In-class Tutorial

Outline. Structure of a UNIX command

Epidemiology Principles of Biostatistics Chapter 3. Introduction to SAS. John Koval

Lab Assignment #1. University of Pittsburgh Department of Electrical and Computer Engineering

CSCE 212H, Spring 2008, Matthews Lab Assignment 1: Representation of Integers Assigned: January 17 Due: January 22

The Unix Shell & Shell Scripts

Scripting Languages Course 1. Diana Trandabăț

Unix tutorial. Thanks to Michael Wood-Vasey (UPitt) and Beth Willman (Haverford) for providing Unix tutorials on which this is based.

ENCM 339 Fall 2017: Editing and Running Programs in the Lab

STAT 7000: Experimental Statistics I

Working with Basic Linux. Daniel Balagué

CSE115 Lab exercises for week 1 of recitations Spring 2011

Applied Regression Modeling: A Business Approach

Introduction to Unix The Windows User perspective. Wes Frisby Kyle Horne Todd Johansen

Introduction to Linux (Part I) BUPT/QMUL 2018/03/14

Your departmental website

Linux Systems Administration Getting Started with Linux

Transcription:

STA 302 / 1001 Introduction to SAS for Regression (on CQUEST) A.L. Gibbs September 2009 1

Some Basics of CQUEST The operating system in the RW labs (107/109 and 211) is Windows XP. There are a few terminals in RW 213 running Linux. SAS is run from a Linux server. Your account is the same and files are accessible in all labs / operating systems. To logout: Windows: look under Start Linux: look under System To open a web browser: Windows: Firefox is an icon on your desktop. Linux: Firefox is the mouse on a globe icon on the toolbar. If backspace doesn t work in SAS, turn off Num Lock. Linux: If you want a terminal window you can either right click on the desktop or click on Applications and then Accessories. 2

Using SAS: Batch vs Windows Mode SAS Windows Mode Pros: Have access to SAS help. Can avoid learning Unix commands. Better control of graphics. SAS Windows Mode Cons: SAS help isn t great. Have to purchase site license (or Xwindows software) or be in CQUEST lab. Note that SAS Windows mode is the same for both Windows and Linux operating systems on CQUEST. 3

Starting SAS Windows Mode Windows: Under Start choose Statistics Apps and then SAS. Linux: Under Applications choose CQUEST and then SAS or type sas at a command prompt in a terminal window. You will have to enter your CQUEST password. Many windows will open including: a program editor where you type your SAS program a log window an output window an explorer for navigating through your windows and output (can ignore) a graphics window will open once you ve run a program that produces graphics plots 4

To insert a line after the current line in the CQUEST SAS program editor, type an i in the leftmost column over the first zero and hit enter. For more tricks you can use in the SAS program editor, select Help, then Using this Window and click on SAS Text Editor Line Commands. To run a program: Make the program editor window your primary window (click on it) and click on the running man or select Run and then Submit. Running a program writes output to a log window, an output window (if the program ran without errors) and sometimes a graphics window. Your code will also disappear. To get it back under Run select Recall last submit. 5

Batch Mode Useful when working at home by using ssh to access CQUEST at login.cquest.utoronto.ca. Need to open a terminal window to use in a lab. Batch Mode Pros: Accessible by using ssh to access CQUEST. Batch Mode Cons: No access to SAS help files. Need to know some Unix. 6

Running SAS Programs in Batch Mode Store your program in a (text) file called filename.sas and at a command prompt type sas filename. This creates filename.log and, if your run was successful, filename.lst. If your data are in a separate file, you need to first save the data in a text file on your CQUEST account. If you are using batch mode in an Xwindows terminal, graphics will pop up in a new window on the screen. To save (or print) graphics plots, see the SAS code on slide 19. Note also that to print files on CQUEST at home you will have to first sftp them to your home computer. (If any of your SAS batch jobs have a problem and keep running, you must kill them. Type ps at a prompt to get the number of the SAS job (the PID), and then type kill -9 thepidnumber to kill a job.) 7

SAS Log Window (or.log file) Messages from SAS about your program written when you run it. Always check for ERRORs. 8

Basics of SAS Programming Every line ends with a semi-colon. SAS is not case sensitive; ALisoN and alison are the same within a program. Every line ends with a semi-colon. Anything between /* and */ is a comment. You can also comment out a single line by starting with a * but then it must end with a semi-colon. A typical SAS program has data steps (to create and manage your datasets) and procedures starting with the reserved word proc. 9

The last line of the program should be run;. You can add titles to your output with the command title This is my title ; at the beginning of your program or within any procedure. SAS has crazy ideas about line lengths. To fix this make options linesize=79; the first line of your program. (There are many other options you can set, but this is the only one that is essential.) Every line ends with a semi-colon. 10

The Data Step The code below creates a dataset called to1993. data to1993; infile maunaloadata.txt firstobs=33; input year 1-4 month 6-7 day 9-10 time1 $ date $ time2 $ cfc11 nd sd f cs rem; if cfc11 < -9999 then cfc11=.; time=year+(month-1)/12; drop day time1 date time2 nd sd f cs rem; The data are read from the file maunaloadata.txt. The data file is a text file, with a new line for each observation. Because the first 33 lines of this file contains information and not data, we need to specify firstobs=33. Because the dates are in the form 1993-01-01, we specified the columns of the data file that contain the year, month, and day. 11

The rest of the variables in the datafile are delimited by spaces. A $ after a variable name on the input line indicates that the variable is character (rather than numeric) data. (I did that here since the times and dates contain special characters, but I don t care about the information in the data file for time, date, and time2.) The missing data code in SAS is. (a period). Values of cfc11 less than 9999 were replaced with the missing value code. The variables that were read in from the data file but that we will not be using were dropped from the dataset. A new variable, time, was created from year and month. 12

More on the Data Step The code below creates a new dataset which starts with the dataset to1993 but only includes observations for which the time is before 1990. data premp; set to1993; if time < 1990; The following code concatenates the two datasets. The datalines from after1994 are appended to the datalines from to1993 in the new dataset all. If a variable exists in one of the datasets and not in the other, it is created and given missing values for the other dataset. data all; set to1993 after1994; 13

Store all of your files in your home directory. To download a data file from the web onto your CQUEST account, right click and save (under This Frame if the data are within a frame). If using the MSWindows version of SAS at home, you will need to specify the entire file path to the data in your SAS program. 14

Some SAS Procedures By default any procedure operates on the last dataset you created. To change this, add data=datasetname on any proc line before the semi-colon. gplot and plot Graphics plots go into a graphics window (from which they can be printed or saved individually) or pop up in their own window if using batch mode in an Xwindow. For a variable1 versus variable2 scatterplot: proc gplot; plot variable1*variable2; For an ugly lineprinter plot that goes in the output window (or.lst) file: proc plot; plot variable1*variable2; 15

proc print; Prints the contents of the last dataset. Useful to see if the data have been correctly read into SAS. proc corr; Prints summary statistics and pairwise correlations for all quantitative variables. proc univariate plot; var variablename; Gives many summary statistics and basic plots (or no plots if you leave out the word plot). The minimum commands for regression where variable1 is the independent variable and variable2 is the dependent variable: proc reg; model variable2=variable1; 16

More on proc reg There are many more commands you can add to all procedures, but we will focus on proc reg. More will arise in class as needed but here are some commonly used plotting commands: proc reg simple; model y=x; plot y*x; /* scatterplot with fitted line */ plot r.*x; /* plot of residuals vs x */ plot r.*p.; /* residuals vs predicted */ The keyword simple gives basic summary statistics for y and x. Note that the plots produced within proc reg are graphics plots. To have proc reg produce non-graphics plots, add the keyword lineprinter on the proc reg line. Variable names that end in a period are variables created by SAS. (proc glm can also be used for regression but has fewer options.) 17

Saving Residuals and Predicted Values to a Dataset (Useful if you want to do more analysis on the residuals or plot them using proc plot.) proc reg; model variable2=variable1; output out=outdata p=pred r=resid; creates a new dataset called outdata that includes all the variables in the original dataset plus variables called pred (the predicted values) and resid (the residuals). 18

Saving Graphics Plots to a File In SAS windows mode, choose File from the toolbar and then to print select Print or to save select Export as Image. If you are working at home by using ssh to access CQUEST you will want to save graphics plots in files within the SAS program so that you can sftp them to your home computer or print them when you come to campus. The following commands save graphics plots in gif format to files in your home directory. filename grafout. ; goptions device=gif gsfname=grafout; The files are named by the SAS procedure that created them. For example, if you have requested 3 plots within proc reg the plots will be stored in the files reg.gif, reg1.gif, and reg2.gif. Instead of gif you can specify jpeg and other image formats. 19

Example: Smoking and Cancer options ls=79; /* Data within program rather than in separate file */ data smoking; input occupational_group $ smoking mortality ; datalines; Farmers_foresters_fisherman 77 84 Miners_quarrymen 137 116 Gas_coke_chemical_makers 117 123 Glass_ceramics_makers 94 128 ; /* Note that there is no semi-colon after each of the data lines, but just one at the end. (Also, to save space for this example, I ve left out most of the data.) Also note that since occupational_group is character data there is a $ after its name in the input statement. */ 20

proc corr data=smoking; * Change plotting symbol to a black dot ; symbol1 v= dot ; proc gplot; plot mortality*smoking; proc reg data=smoking; run; model mortality=smoking; plot mortality*smoking; plot r.*smoking; plot r.*p.; 21

A Few Unix Commands These can be executed from a terminal window on the three Linux machines in RW 211 or from home when accessing CQUEST through ssh. exit Logs you off. passwd Change your password. man command Unix help for command. ls List the files in the current directory. mkdir directoryname Make a new directory. cd directoryname Change directory to directoryname. 22

cd Change to your home directory. less filename Displays filename on your screen. (q to quit.) rm filename Removes filename. cp filename1 filename2 Copies filename1 to filename2. mv filename1 filename2 Moves (renames) filename1 to filename2. Editors: On the Linux machines in RW 211, try gedit filename (the default editor). (filename can be a new file.) (Gedit is available as Text Editor when you right-click on an existing file.) If you are using ssh, try pico filename. My favourite Unix editor is vi but you need to learn about it before you use it. If you google vi unix editor you ll find some good tutorials. 23