
/* CalcESLs.sas */
/* ------------ */
/* Created 02 January 2006 by Sarah Gilman */
/* report errors to gilmans@u.washington.edu */
/* ---------------------------------------------------------------- */
/* This is a sample SAS program to calculate ESL regressions over   */
/* a range of drop thresholds and wave height metrics.              */
/* It was written for SAS v9.1 for PC and should be compatible      */
/* with SAS v8 as well. To run on SAS v6 for Macintosh, some        */
/* dataset and variable names may need to be shortened.             */
/* The program assumes you have pairs of files containing drop time */
/* and height data of the type created by SiteParser, and that you  */
/* are using hourly wave height data in the format used by the      */
/* National Data Buoy Center (www.ndbc.noaa.gov).                   */
/* In most cases you should need to make changes only to Part 1.    */
/* Read through Part 1 carefully, as it contains the information    */
/* the program needs to find the input data files and the locations */
/* to save output.                                                  */
/* The general procedure is to submit each of the 4 parts to SAS    */
/* separately, checking the log for errors between each part.       */
/* ---------------------------------------------------------------- */

/* Part 1 - Source & Output Files */
/* ------------------------------ */
/* Modify the following sections to point to the relevant file */
/* locations on your computer.                                 */

/* Output Files */
/* ------------ */
/* The program saves the regression summary statistics to a        */
/* user-specified location. The file is saved in CSV format and    */
/* can be imported into Excel or other programs. These data are    */
/* also kept in a SAS dataset labelled 'reg_results'. The raw data */
/* for the regressions are in a SAS dataset labelled 'toreg'.      */

/* replace with the path for your folder */
%let fpath=c:\user\my Documents\myloggers\;
/* filename for regression statistics, omit extension */
%let outname=eslregs;

/* SiteParser file locations */
/* ------------------------- */
/* This program assumes a separate set of output files for each    */
/* year of data and drop threshold. If you have only one year, set */
/* fyear and lyear to the same value. Similarly, if there is only  */
/* one drop threshold, set firstdrop and lastdrop to the same      */
/* value.                                                          */
%let firstdrop=2;   /* lowest drop threshold  */
%let lastdrop=7;    /* highest drop threshold */
%let fyear=2000;    /* first year of data     */
%let lyear=2004;    /* last year of data      */

/* Actual file locations (use full path). Use a separate entry for */
/* the droptime and droptide files for each year and drop value.   */
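As a sketch of the naming convention described in the comments that follow (tide_y####_d#), a pair of entries for one additional drop threshold might look like this. The paths and filenames here are hypothetical placeholders, not from the original program:

```sas
/* Hypothetical example: entries for drop threshold 3 in year 2000. */
/* Adjust the paths to point at your own SiteParser output files.   */
%let tide_y2000_d3="c:\my Documents\droptide_mysite_d3.txt";
%let time_y2000_d3="c:\my Documents\droptime_mysite_d3.txt";
```

The %build_pfiles macro in Part 2 resolves these names indirectly (&&tide_y&y._d&d), so one such pair is expected for every combination of year and drop threshold in the ranges set above.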

/* The variable names must be of the format tide_y####_d#, where   */
/* #### is a year between fyear and lyear and # is an integer      */
/* between firstdrop and lastdrop. Add more rows if necessary.     */
%let tide_y2000_d2="c:\documents and Settings\Administrator\Desktop\droptide_HMSMussel2002.txt";
%let time_y2000_d2="c:\documents and Settings\Administrator\Desktop\droptime_HMSMussel2002.txt";

/* Buoy file locations */
/* ------------------- */
/* List the locations of all the NDBC buoy files (usually 1 per */
/* calendar year). Add additional rows if necessary.            */
%let b1="c:\my Documents\NDBC data\46042h2000.txt";
%let b2="c:\my Documents\NDBC data\46042h2001.txt";
%let buoys=&b1 &b2;   /* add more variables to this list if needed */

/* CONGRATULATIONS! You have made it to the end of Part 1!!!! */

/* Part 2 - Reading source files */
/* ----------------------------- */
/* Short routines to read in the SiteParser and NDBC data files */

/* SiteParser data */
/* --------------- */
proc format;
   invalue twomiss
      'N/A'=.
      'No Data'=.m
      -99999-99999=[best8.]
   ;
   invalue tmiss
      'N/A'=.
      'No Data'=.m
      other=time5.
   ;
run;

/* builds the informat list for the INPUT statement in read1_sp */
%macro putlgvars (fnum, lnum, type);
   %if (&type=number) %then %do;
      %do k=&fnum %to &lnum;
         log&k :twomiss.
      %end;
   %end;
   %else %if (&type=time) %then %do;
      %do k=&fnum %to &lnum;
         log&k :tmiss.
      %end;
   %end;
%mend putlgvars;

%macro read1_sp (filei,varname,vtype);
   /* read the logger names from line 3 of the file */
   data line2var;
      format namelist $500.;
      format textname $30.;
      format logname $8.;
      infile &filei firstobs=3 obs=3 truncover ls=500 dsd;
      input namelist :$500.;
      ctr=1;
      textname=upcase(scan(namelist,ctr,' '));
      do while(textname NE "");
         logname="log"||trim(left(ctr));
         output;

         put logname= textname=;
         ctr=ctr+1;
         textname=upcase(scan(namelist,ctr,' '));
      end;
      call symput("flog",trim(left(1)));
      call symput("llog",trim(left(ctr-1)));
      put firstlog= ctr=;
      keep logname textname;
   run;

   options errors=5;

   /* read the data columns, one variable per logger */
   data readcol;
      format datetxt $20.;
      infile &filei dlm=' ' firstobs=5 dsd missover;
      input datetxt :$20. %putlgvars(&flog,&llog,&vtype);
      dateval=input(datetxt,mmddyy10.);
      format dateval date7.;
      drop datetxt;
   run;

   /* transpose to one observation per logger per date */
   proc transpose data=readcol
        out=long1(rename=(_name_=logname col1=&varname));
      by dateval;
   run;

   proc sort data=long1;    by logname; run;
   proc sort data=line2var; by logname; run;

   data long&varname;
      merge line2var long1;
      by logname;
      attrib _all_ label="";
   run;
%mend read1_sp;

%macro build_pfiles;
   %do y=&fyear %to &lyear;
      %do d=&firstdrop %to &lastdrop;
         %let timefile=&&time_y&y._d&d;
         %let tidefile=&&tide_y&y._d&d;
         /* pass the resolved (quoted) file paths to read1_sp */
         %read1_sp (&timefile,dtime,time);
         %read1_sp (&tidefile,dtide,number);
         proc sort data=longdtime; by dateval textname; run;
         proc sort data=longdtide; by dateval textname; run;
         data oneset;
            merge longdtime longdtide;
            by dateval textname;
            drop logname;
            if (dtime=. or dtime=.m) then delete;
            realtime=dhms(dateval,0,0,0) + dtime;
            rthour=round(realtime,hms(1,0,0));
            dropnum=&d;
         run;
         proc append base=alldrops new=oneset; run;
      %end;
   %end;
   data alldrops;
      set alldrops;

      rename textname=logname;
   run;
%mend build_pfiles;

%build_pfiles;

/* Buoy data */
/* --------- */
%macro elimdoubles(dataso,timevar,valuevar);
   /* Deletes duplicate time intervals in a dataset. Assumes the   */
   /* time interval with the smaller data value is the one to      */
   /* delete; this defaults to removing missing values in          */
   /* preference to non-missing values.                            */
   proc sort data=&dataso;
      by &timevar descending &valuevar;
   run;
   data &dataso;
      set &dataso;
      lagtime=lag(&timevar);
      if lagtime=&timevar then delete;
      drop lagtime;
   run;
%mend elimdoubles;

%macro read_buoy(varlist,start,stop,fname,interpol);
   %put read_buoy called to read &varlist;
   %put from file &fname;
   %put start=&start stop=&stop;
   %let dirwind=%index(&varlist,winddir);
   %if (&dirwind) %then %let varlist=&varlist nwind ewind;
   filename bfile (&fname);
   data rawbuoy;
      infile bfile;
      input Year mo Day hr winddir wind GST Wave DPD APD MWD BAR
            tair WTemp DEWP VIS;
      if (hr=.) then delete;
      /* NDBC missing-value codes */
      if winddir=999 then winddir=.;
      if wind=99. then wind=.;
      if gst=99. then gst=.;
      if wave=99. then wave=.;
      if dpd=99. then dpd=.;
      if apd=99. then apd=.;
      if mwd=999. then mwd=.;
      if bar=9999 then bar=.;
      if tair=999. then tair=.;
      if wtemp=999. then wtemp=.;
      if dewp=999. then dewp=.;
      if vis=99. then vis=.;
      realtime=dhms(mdy(mo,day,year),hr,0,0);
      /* convert from UTC to PST */
      realtime=realtime-dhms(0,8,0,0);
      year=year(datepart(realtime));
      mo=month(datepart(realtime));
      day=day(datepart(realtime));
      hr=hour(timepart(realtime));
      min=minute(timepart(realtime));
      if (&dirwind) then do;

         nwind=cos(winddir*constant('pi')/180);
         ewind=sin(winddir*constant('pi')/180);
      end;
      keep year mo day hr min realtime &varlist;
   run;

   %elimdoubles(rawbuoy,realtime,&varlist);

   %if &interpol=yes %then %do;
      proc expand data=rawbuoy out=rbuoy3 from=hour;
         id realtime;
         convert &varlist /method=join;
      run;
      %if (&dirwind) %then %do;
         %let varlist2=%sysfunc(tranwrd(&varlist,winddir,));
         proc expand data=rbuoy3 out=rbuoy4 from=hour to=minute10;
            id realtime;
            convert &varlist2/method=join;
            convert winddir/method=step;
         run;
      %end;
      %else %do;
         proc expand data=rbuoy3 out=rbuoy4 from=hour to=minute10;
            id realtime;
            convert &varlist/method=join;
         run;
      %end;
      data buo_var;
         set rbuoy4;
         if (&start>realtime OR &stop<realtime) then delete;
      run;
   %end;
   %else %do;
      data buo_var;
         set rawbuoy;
         if (&start>realtime OR &stop<realtime) then delete;
      run;
   %end;
%mend read_buoy;

/* &fday and &lday (the first and last day of the drop data, as   */
/* SAS date values) must be defined before this call              */
%read_buoy(wave,dhms(&fday,0,0,0),dhms(&lday,0,0,0),&buoys,no);

data obswave;
   set buo_var;
   rthour=realtime;
   keep rthour wave;
run;

/* Part 3 - Regressions */
/* -------------------- */
/* Calculate the different wave time windows */
proc expand data=obswave out=windows method=none;
   id rthour;
   convert wave=w01max;
   convert wave=w03max/transform=(movmax 3);
   convert wave=w06max/transform=(movmax 6);
   convert wave=w09max/transform=(movmax 9);
   convert wave=w12max/transform=(movmax 12);
   convert wave=w24max/transform=(movmax 24);
run;

proc sort data=windows;

   by rthour;
run;
proc sort data=alldrops;
   by rthour;
run;

data toreg;
   merge alldrops(in=sd2) windows(in=wd);
   by rthour;
   if sd2 and wd;
run;

/* calculate the mean wave for each time window; */
/* necessary to calculate average run-up         */
proc sort data=toreg; by rthour; run;
proc summary data=toreg noprint;
   by rthour;
   var w01max w03max w06max w09max w12max w24max;
   output out=ndb1 max=;
run;
proc summary data=ndb1 noprint;
   var w01max w03max w06max w09max w12max w24max;
   output out=ndb2 mean=;
run;
data _null_;
   set ndb2;
   call symput("avn01",trim(left(w01max)));
   call symput("avn03",trim(left(w03max)));
   call symput("avn06",trim(left(w06max)));
   call symput("avn09",trim(left(w09max)));
   call symput("avn12",trim(left(w12max)));
   call symput("avn24",trim(left(w24max)));
run;

/* run all possible regressions */
proc sort data=toreg;
   by dropnum logname;
run;
proc reg data=toreg outest=allregs outseb edf noprint;
   by dropnum logname;
   var dtide w01max w03max w06max w09max w12max w24max;
   w01max: model dtide=w01max;
   w03max: model dtide=w03max;
   w06max: model dtide=w06max;
   w09max: model dtide=w09max;
   w12max: model dtide=w12max;
   w24max: model dtide=w24max;
run;
quit;

data allregs2;
   set allregs;
   attrib _all_ label=" ";
run;

data allregs2a;
   set allregs2(where=(_type_="PARMS"));
   select (_model_);
      when('w01max') slope=w01max;
      when('w03max') slope=w03max;
      when('w06max') slope=w06max;
      when('w09max') slope=w09max;
      when('w12max') slope=w12max;

      when('w24max') slope=w24max;
      otherwise;
   end;
   nobs=_p_+_edf_;
   drop _type_ w01max w03max w06max w09max w12max w24max _p_ _in_ _edf_;
run;

data allregs2b;
   set allregs2(where=(_type_="SEB"));
   select (_model_);
      when('w01max') se_slope=w01max;
      when('w03max') se_slope=w03max;
      when('w06max') se_slope=w06max;
      when('w09max') se_slope=w09max;
      when('w12max') se_slope=w12max;
      when('w24max') se_slope=w24max;
      otherwise;
   end;
   se_int=intercept;
   drop _type_ w01max w03max w06max w09max w12max w24max _p_ _in_ _edf_
        intercept _rsq_ _rmse_;
run;

data reg_results;
   merge allregs2a allregs2b;
   by dropnum logname _model_;
   attrib _all_ label="";
   select (_model_);
      when('w01max') meanwave=&avn01;
      when('w03max') meanwave=&avn03;
      when('w06max') meanwave=&avn06;
      when('w09max') meanwave=&avn09;
      when('w12max') meanwave=&avn12;
      when('w24max') meanwave=&avn24;
      otherwise;
   end;
   runup_avg=abs(slope*meanwave);
   wavehr=int(substr(_model_,2,2));
   format _rmse_ intercept _rsq_ slope se_slope se_int meanwave runup_avg 6.3;
run;

/* Part 4 - Output */
/* --------------- */
/* Prints the top 6 regressions by logger and also plots all the */
/* R-squared values. Saves the regression output to the          */
/* designated file location.                                     */

/* print the top six Rsquared values for each logger */
proc sort data=reg_results;
   by logname descending _rsq_;
run;
data topten;
   set reg_results;
   by logname;
   rank + 1;
   if first.logname then rank=1;
run;
proc print data=topten(where=(rank<6)) noobs;
   title "Top 6 Regressions (based on Rsq) by logger";
   var logname dropnum _model_ nobs intercept se_int slope se_slope
       _rsq_ runup_avg;
run;
title;
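The topten step above ranks regressions within each logger by combining a sum statement with by-group processing: `rank + 1` retains its value across observations, and `first.logname` resets it at the start of each group. A minimal standalone sketch of the same pattern, using hypothetical dataset and variable names (scores, group, value) rather than anything from this program:

```sas
/* Rank observations within each group: sort descending on the   */
/* value, then reset a retained counter at each new group.       */
proc sort data=scores;           /* hypothetical input dataset   */
   by group descending value;
run;
data ranked;
   set scores;
   by group;
   rank + 1;                     /* sum statement: retained      */
   if first.group then rank=1;   /* restart numbering per group  */
run;
```

After this step, rank=1 marks the largest value in each group, which is why the print above can select the best regressions with a simple `where=(rank<6)`.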

proc sort data=reg_results;
   by dropnum logname;
run;

/* Plot Rsquared values for all regressions */
proc gplot data=reg_results;
   by dropnum;
   plot _rsq_*wavehr=logname / vaxis=0 to 1 by 0.1;
run;
quit;

/* Export the summary file */
proc sort data=reg_results;
   by dropnum logname wavehr;
run;
proc export data=reg_results outfile="&fpath&outname..csv"
     dbms=csv replace;
run;
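As an optional sanity check, not part of the original program, the exported CSV can be read back with PROC IMPORT and a few rows printed to confirm the file was written where expected. This sketch reuses the fpath and outname macro variables defined in Part 1; the check_csv dataset name is a placeholder:

```sas
/* Optional check: re-import the exported summary file and print */
/* the first few rows.                                           */
proc import datafile="&fpath&outname..csv" out=check_csv
     dbms=csv replace;
     getnames=yes;
run;
proc print data=check_csv(obs=5);
   title "First 5 rows of exported regression summary";
run;
title;
```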