Chapter 2: Getting Data Into SAS


Data are stored in many different forms and formats. There are four categories of ways to read in data:

1. Entering data directly through the keyboard
2. Creating SAS data sets from raw data files
3. Converting other software's data files (e.g., Excel) into SAS data sets
4. Reading other software's data files directly (often requires extra SAS/ACCESS products)

University of South Carolina

1. Entering Data with the Viewtable Window

Tools > Table Editor, then enter data
Right-click on a column header for Column Attributes
File > Save As to save the data
Tools > Table Editor to browse or edit the data
PROC PRINT to see the data

Import Window

Allows you to import various types of data files (especially Microsoft Excel).
By default, the first row is read as variable names. This can be changed using the Options button.
A data set saved in the Work library is deleted when you exit SAS; a data set saved in any other library is kept.
The PROC IMPORT statements used to import the data can be saved.
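The statements that the Import Window saves look roughly like the sketch below. The file path and the data set name scores are made up for illustration, and DBMS = XLSX assumes that SAS/ACCESS Interface to PC Files is available on your installation.

```sas
/* Hypothetical path and data set name; adjust to your own file. */
PROC IMPORT DATAFILE = 'c:\mydata\scores.xlsx'
    OUT = work.scores      /* stored in Work, so deleted when SAS exits */
    DBMS = XLSX            /* assumes SAS/ACCESS Interface to PC Files  */
    REPLACE;               /* overwrite work.scores if it already exists */
    GETNAMES = YES;        /* first row holds variable names (the default) */
RUN;

PROC PRINT DATA = work.scores;
RUN;
```

Changing OUT = to a two-level name in another library would keep the data set after SAS exits.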

Reading in Raw Data

If you type data directly into a SAS program, this is indicated with a statement like any of these:
cards;
datalines;
lines;

If your raw data are in an external file, use an INFILE statement to tell SAS where the file is.
Specify the full path name.
SAS assumes each line in a data file is 256 characters or fewer (this is the default in older releases; SAS 9.4 uses a larger default).
If your lines are longer, use LRECL = ...
Example: INFILE 'filepath.dat' LRECL = 2000;
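As a minimal sketch of the two cases, the first DATA step below reads data typed into the program and the second reads an external file. The variable names and the path c:\mydata\survey.dat are placeholders, not part of the original notes.

```sas
/* Case 1: data typed directly into the program */
DATA inline;
    INPUT Name $ Age;
    DATALINES;
Ann 34
Bob 27
;
RUN;

/* Case 2: data in an external file (hypothetical path) */
DATA fromfile;
    INFILE 'c:\mydata\survey.dat' LRECL = 2000;  /* allow lines up to 2000 characters */
    INPUT Name $ Age;
RUN;
```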

Data Separated by Spaces

Use an INPUT statement to name the variables.
Include a $ after the names of character variables.

Data Arranged in Columns

This applies when each value of a variable is found at the same position on every data line.
Character or standard numeric data can be read this way.
Advantages:
1. No space is needed between values
2. Missing values don't need a special symbol (they can be blank)
3. Character data can contain blanks
4. You can skip variables you don't need to read into SAS
Example: INPUT var1 1-10 var2 11-15 var3 $ 16-30;
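The two styles can be compared side by side. This is an illustrative sketch with invented variables and data: the first step uses list input (values separated by spaces), the second uses column input, where a name contains a blank, two values touch with no space between them, and a missing age is simply left blank.

```sas
/* List input: values separated by at least one space */
DATA listin;
    INPUT Name $ Age Height;
    DATALINES;
Ann 34 64.5
Bob 27 70
;
RUN;

/* Column input: Name in columns 1-10, Age in 11-12, City in 13-24 */
DATA colin;
    INPUT Name $ 1-10 Age 11-12 City $ 13-24;
    DATALINES;
Ann Lee   34New York
Bob Ray     Columbia
;
RUN;
```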

Data Not in Standard Format

Types of nonstandard data:
1. Numbers with commas
2. Numbers with dollar signs
3. Hexadecimal data
4. Dates (in various formats)
5. Times of day

Nonstandard data can be read using codes known as informats.
Informats come in 3 categories: character, numeric, and date.
Pages 44-45 list many SAS informats.
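A short sketch of informats in use, with made-up data: MMDDYY10. reads a date written as mm/dd/yyyy, and COMMA9. reads a number that contains a dollar sign and commas. The @12 and @23 pointers place SAS at the column where each field starts.

```sas
DATA sales;
    INPUT Name $ 1-10
          @12 SaleDate MMDDYY10.   /* date informat, 10 columns wide */
          @23 Amount COMMA9.;      /* strips $ and commas, 9 columns wide */
    DATALINES;
Smith      01/15/2017 $1,234.50
Jones      02/03/2017   $987.00
;
RUN;

PROC PRINT DATA = sales;
    FORMAT SaleDate DATE9. Amount DOLLAR10.2;  /* display formats only */
RUN;
```

SaleDate is stored as a SAS date value (a number), so the FORMAT statement is what makes it print as a readable date.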

The period in an informat must be included! Otherwise SAS may interpret the informat as a variable name.
If several consecutive variables share the same informat, put the variable names in parentheses and give the informat once, in a separate set of parentheses.
Example: (Score1 Score2 Score3 Score4 Score5) (4.1)
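The grouped form might be used as in this sketch. The data line is invented; each score occupies exactly 4 columns starting at column 10, so the scores need no separating blanks.

```sas
DATA scores;
    /* The single informat 4.1 is applied to all five variables in turn */
    INPUT Name $ 1-8 @10 (Score1 Score2 Score3 Score4 Score5) (4.1);
    DATALINES;
Alice    10.5 9.0 8.5 7.010.0
;
RUN;
```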

Other Inputting Issues

You can mix input styles: read some variables list-style, others column-style, and others using informats.
Sometimes you need to move SAS explicitly to a specific column number.
Example: @50 moves SAS to the 50th column.

Messy Data

@'character' pointer: helpful if the value you want always follows a certain string of characters.
Example: @'http' moves SAS to the first column after the text http on the data line.
Colon modifier: tells SAS the maximum number of columns in a variable's field, but reading stops when SAS reaches a space.
Example: Deptname :$15. tells SAS to read Deptname for 15 characters or until it reaches a space.
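Both ideas are sketched below with invented data. The first step mixes list input, a column pointer with an informat, and column input. The second uses a character pointer, here the hypothetical marker 'Dept:', followed by the colon modifier.

```sas
/* Mixed styles: list input, then @20 with an informat, then column input */
DATA mixed;
    INPUT Name $ Age @20 Hired MMDDYY10. Dept $ 31-40;
    DATALINES;
Ann 34             03/15/2010 Accounting
Bob 27             11/02/2015 Sales
;
RUN;

/* Character pointer plus colon modifier */
DATA depts;
    INPUT @'Dept:' Deptname :$15. Staff;  /* start reading just after "Dept:" */
    DATALINES;
Building A Dept: Accounting 12
Annex Dept: Engineering 40
;
RUN;
```

In the second step the text before Dept: varies in length, which is why a fixed column pointer such as @50 would not work there.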

Multiple Lines of Data per Observation

Sometimes each observation is spread over several lines in the raw data file.
Use / to tell SAS when to go to the next line.
Or use #2, for example, to tell SAS to go to the 2nd line of the observation.

Multiple Observations per Line of Raw Data

Sometimes several observations are on one line of data.
Use @@ at the end of the INPUT statement to tell SAS to stay on the raw data line and wait for the next observation.

Reading Part of a Data File

Sometimes we want to delete some observations based on the values of one variable.
Read just the first variable(s), ending that INPUT statement with the @ sign to hold the line, then decide whether to read the rest.
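The three situations above can be sketched as follows. All variable names and data values are invented for illustration.

```sas
/* Three raw data lines per observation: / and #3 move between lines */
DATA temps;
    INPUT City $ State $
        / NormalHigh NormalLow
        #3 RecordHigh RecordLow;
    DATALINES;
Nome AK
55 44
88 29
Miami FL
90 75
97 65
;
RUN;

/* Several observations on one raw data line: trailing @@ */
DATA rain;
    INPUT City $ Inches @@;
    DATALINES;
Nome 2.5 Miami 11.2 Raleigh 4.1
;
RUN;

/* Reading part of a file: trailing @ holds the line for a second INPUT */
DATA blueonly;
    INPUT Team $ @;
    IF Team = 'red' THEN DELETE;   /* skip the rest of this line */
    INPUT Player $ Score;
    DATALINES;
blue Ann 91
red Bob 78
blue Cy 85
;
RUN;
```

Testing Team before reading the remaining variables saves SAS the work of reading values that would be thrown away.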

Options for the INFILE Statement

FIRSTOBS =
FIRSTOBS = 5 tells SAS to begin reading data at the fifth line (useful when the data file has header information).

OBS =
OBS = 100 tells SAS to stop reading data after the 100th line (not necessarily after 100 observations!).

MISSOVER
If a data line ends and there are still more variables in the INPUT statement, tells SAS to assign missing values to the remaining variables for that observation.

TRUNCOVER
Tells SAS to read data for a variable only until the end of the line, even if the variable's field extends past the end of the data line.
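A small sketch of these options, using INFILE DATALINES so the example is self-contained (with an external file, the quoted path would go in place of DATALINES). The data are invented: two header lines, then scores, some of them incomplete.

```sas
DATA scores;
    /* Start at line 3, stop after line 5, do not jump to the next line
       when a data line runs out of values */
    INFILE DATALINES FIRSTOBS = 3 OBS = 5 TRUNCOVER;
    INPUT Name $ Test1 Test2 Test3;
    DATALINES;
Class scores, Spring term
Name Test1 Test2 Test3
Ann 90 85 88
Bob 72
Cy 81 79
Dee 95 91 93
;
RUN;
```

Without MISSOVER or TRUNCOVER, SAS would go to the next data line to look for Bob's missing test scores.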

Reading Delimited Files

DLM= allows something other than spaces to separate data values.
Comma delimiters: DLM=','
Tab delimiters: DLM='09'X
# delimiters: DLM='#'
By default, SAS treats two delimiters in a row the same as a single delimiter.
What if two commas in a row indicate a missing value? What if some data values contain commas? Use the DSD option.
Note: data values with commas in them must be in quotes.
The default delimiter with DSD is the comma, but other delimiters can be specified with the DLM= option.
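The following sketch shows DSD handling both problem cases with invented data: a quoted value that contains a comma, and two commas in a row marking a missing value. The colon modifier is used so the character variables can be longer than the default 8 characters.

```sas
DATA books;
    INFILE DATALINES DSD;          /* comma-delimited; ,, means missing */
    INPUT Title :$30. Author :$20. Price;
    DATALINES;
"Data, Data Everywhere",Smith,19.95
Short Guide,,12.50
;
RUN;
```

For a tab-delimited file with the same rules, the INFILE statement would combine both options, for example DSD DLM='09'X.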

SAS Data Sets: Temporary and Permanent

Data sets stored in the Work library are temporary (removed upon exiting SAS).
Data sets stored in other libraries are permanent (saved upon exiting SAS).
You can specify the library when creating a data set in the DATA step.
Example: suppose you have a library called sportlib (this is a "libref").
DATA sportlib.baseball creates a data set baseball to be stored in the sportlib library (permanent).
DATA work.baseball would store baseball in the Work library (temporary).
DATA baseball, by default, stores baseball in the Work library (temporary).

Creating New Libraries

1. File > New in the Explorer window
2. LIBNAME statement: LIBNAME sportlib 'c:\mysaslib';
3. Direct referencing: specify the file name and let SAS set up the library. The syntax differs depending on the operating system (see pp. 70-71).
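As a rough sketch, the LIBNAME approach and direct referencing might look like this on Windows. The folder c:\mysaslib is assumed to exist already, and the data set baseball2 and the data values are invented.

```sas
/* Assign the libref; the folder must already exist */
LIBNAME sportlib 'c:\mysaslib';

/* Permanent data set in the sportlib library */
DATA sportlib.baseball;
    INPUT Player $ Hits;
    DATALINES;
Ruth 192
Aaron 223
;
RUN;

/* Temporary copy: one-level name goes to the Work library */
DATA baseball;
    SET sportlib.baseball;
RUN;

/* Direct referencing: quoted path and file name, Windows-style */
DATA 'c:\mysaslib\baseball2';
    SET sportlib.baseball;
RUN;
```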

Listing Contents of a Data Set Using PROC CONTENTS

Syntax: PROC CONTENTS data = datasetname; run;
Prints details about the data set and the variables in it.
A LABEL statement in the DATA step can provide more detail about each variable (up to 256 characters).
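A minimal sketch with invented variables and labels: the LABEL statement goes in the DATA step that creates the data set, and PROC CONTENTS then lists each variable along with its label.

```sas
DATA work.baseball;
    INPUT Player $ AtBats Hits;
    LABEL AtBats = 'Official at-bats this season'
          Hits   = 'Hits this season';
    DATALINES;
Ruth 540 192
Aaron 603 223
;
RUN;

PROC CONTENTS DATA = work.baseball;
RUN;
```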