Logging the Log Magic: Pulling the Rabbit out of the Hat


PharmaSUG2010 - Paper TT08

Adel Fahmy, BenchWorkzz, Austin, Texas

ABSTRACT

Program validation includes checking both the program log and the program logic. The program log should be free of any Error/Warning/Note messages, which is the focus of this paper. Program logic is reviewed manually or by writing independent rough code that confirms its results.

During the development phase, programmers use SAS system options (MPRINT, MTRACE, MLOGIC, MACROGEN, SYMBOLGEN) and the PROC SQL option FEEDBACK, which result in lengthy log files. Although moving the final program into production should involve switching off all of those options, this is not always done. Visually scanning thousands of pages of a log file is tedious work that involves searching for individual strings (Error, Warning, etc.) one at a time. Even then, not all possible messages are necessarily caught, particularly those that do not start with the word Error or Warning.

The generic macros presented here automate the whole process by filtering all the logs in the same project, searching for all errors, and producing a separate report for each processed log. A log of thousands of pages is trimmed to merely a few lines of messages that need to be resolved. The user is given full control to de-activate less serious messages or to add user-defined ones. Over 200 search features have been considered.

KEYWORDS

Log, Error, Warning, Validation, Compliance, Audit, Quality Control, QC, Quality Assurance, QA.

AUDIENCE

Programmers, Validators, Auditors & Quality Control (QC) Professionals with limited or advanced SAS experience.

THE LOG FILE

Some SAS options give execution details during the development phase of a program. Those options may result in lengthy log files that are difficult to scan by eye for error messages. A good practice is to turn off those options before moving the final program into production. A non-SAS user should be able to look at a short and clean log. Whatever the resulting log, the macro will process it and filter the error messages. The one exception is a log in which full SAS Help details are displayed.

Turning on one or more of the SAS system options MPRINT, MTRACE, MLOGIC, MACROGEN, or SYMBOLGEN shows how macro variables have been resolved and how the macros have been executing.

   OPTIONS MPRINT MTRACE MLOGIC MACROGEN SYMBOLGEN ;
   OPTIONS NOMPRINT NOMTRACE NOMLOGIC NOMACROGEN NOSYMBOLGEN ;

The system options STIMER and FULLSTIMER specify that SAS write to the log a list of computer resources that were used for each step and for the entire SAS session. You can turn them off if you are not producing performance metrics.

   OPTIONS STIMER FULLSTIMER ;
   OPTIONS NOSTIMER NOFULLSTIMER ;

The FEEDBACK option in PROC SQL shows how the procedure was executing.

   PROC SQL FEEDBACK ;
   PROC SQL ;

Avoid using words that SAS reserves for its messages, such as a macro named "DUMMY", a dataset named "WARNING", or a counter called "ERRORS".
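One way to make the switch between development and production logs less error-prone is to wrap the option settings in a small macro. The following is only a sketch under stated assumptions: the macro name setdebug and its debug= parameter are hypothetical and are not part of the macros presented in this paper, and the sketch lists only MPRINT, MLOGIC, SYMBOLGEN, and FULLSTIMER, so extend the lists with whichever options your programs actually use.

```sas
/* Hypothetical helper (not one of the paper's macros):         */
/* turn the verbose development options on or off in one place. */
%MACRO setdebug ( debug = N ) ;
   %IF %UPCASE(&debug) = Y %THEN %DO ;
      /* Development: show generated code, macro logic, and timing */
      OPTIONS MPRINT MLOGIC SYMBOLGEN FULLSTIMER ;
   %END ;
   %ELSE %DO ;
      /* Production: keep the log short */
      OPTIONS NOMPRINT NOMLOGIC NOSYMBOLGEN NOFULLSTIMER ;
   %END ;
%MEND setdebug ;

%setdebug ( debug = Y )   /* while developing */
%setdebug ( debug = N )   /* before moving to production */
```

Keeping the settings in one place means that a single call, rather than a search through the program, decides how verbose the log will be.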

Free Text: ODS Help displayed in the log.
Specifying the OPTIONS suboption ( DOC="HELP" ) prints help for the specified ODS tagset to the SAS log. It prints a description of each available suboption in full detail, which results in a very lengthy log file. Saving all the system help information in the log MUST be disabled before using this macro.

   FILENAME xml "&dir\output\tables\logs\xxx1.xml" ;
   ODS TAGSETS.ExcelXP OPTIONS ( DOC="HELP" ) FILE=xml ;

Free Text: Site-Specific News.
Your system may be set up to display site-specific news in the log. The news message is contained in the "News" file, which is in the "MISC/BASE" directory. To prevent the news display, edit this file and use the command line option "-NONEWS".

SAVING A LOG IN AN EXTERNAL FILE

During development, it is more convenient to display the log in the Program Log window. Use the PRINTTO procedure to automate saving the log in a file. Note that %STR preserves leading and trailing blanks, so the paths are written without spaces inside the parentheses.

   %LET progname = T_1_1 ;
   %LET dir = %STR(c:\sponsor\protocol\study) ;
   %LET savelog = %STR(&dir\output\tables\logs\&progname..log) ;
   PROC PRINTTO NEW LOG = "&savelog" ;
   RUN ;

SAVING ALL LOGS AND ALL VALIDATIONS

It is more convenient to save all related logs of Tables, Listings, and Derived Datasets, each in a separate LOGS directory. Also, create LOGSVLD sub-directories in which to save the log validation reports. For example:

   &dir\output\tables\logs
   &dir\output\listings\logs
   &dir\output\derived\logs
   &dir\output\tables\logs\logsvld
   &dir\output\listings\logs\logsvld
   &dir\output\derived\logs\logsvld

AVOIDING UNNECESSARY LOG MESSAGES

Some messages may not cause any harm to the results but could be avoided by good coding practice. For example:

NOTE: Input data set is already sorted, no sorting done.
NOTE: Input data set is already sorted; it has been copied to the output data set.

Avoiding repeated sorts of the same dataset in the same order eliminates these notes and improves efficiency.

NOTE: 0 observations with duplicate key values were deleted.

This may result from unnecessary repeated sorts with the PROC SORT options NODUP or NODUPKEY.

NOTE: Missing values were generated as a result of performing an operation on missing values.

Consider the statement ( y = 2 * x ; ). It produces the message above when the value of x is missing. This can be avoided by handling the missing value explicitly:

   IF x = . THEN y = . ;
   ELSE y = 2 * x ;

NOTE: Character values have been converted to numeric values.
NOTE: Numeric values have been converted to character values.

Avoid the bad programming habit of converting a character value into a numeric one implicitly, as in n = 1 * c. If n is numeric and c is character, instead of n = c or c = n, use INPUT with a numeric informat and PUT with a numeric format:

   LENGTH n 3 c $3 ;
   n = INPUT ( c, 3. ) ;
   c = PUT ( n, 3. ) ;
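The two recommendations above (guarding against missing values and converting types explicitly) can be combined in one short DATA step. This is a minimal sketch, not code from the macros in this paper: the dataset name demo and the variable names are hypothetical, and the ?? modifier is an optional addition that suppresses the invalid-data messages when the character value is not a valid number.

```sas
/* Hypothetical example data set; names are for illustration only. */
DATA demo ;
   LENGTH c $3 n 8 c2 $3 ;
   c = "12" ;

   /* Explicit character-to-numeric conversion.                   */
   /* ?? keeps the log quiet if c does not hold a valid number.   */
   n = INPUT ( c, ?? 3. ) ;

   /* Guard the arithmetic so no "missing values were generated"  */
   /* note is written when n is missing.                          */
   IF n = . THEN y = . ;
   ELSE y = 2 * n ;

   /* Explicit numeric-to-character conversion with a numeric format. */
   c2 = PUT ( n, 3. ) ;
RUN ;
```

Because every conversion is spelled out, the log for this step should contain neither of the two conversion notes quoted above.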

NOTE: No observations in data set WORK.RAW1.
NOTE: The data set WORK.RAW1 has 0 observations and 0 variables.

This can be avoided by using %SYSFUNC and the numeric attribute function ATTRN to capture the number of observations, then conditionally executing the succeeding steps based on whether the dataset is empty or not.

   %LET dsid = %SYSFUNC(OPEN(&dsname, i)) ;
   %LET nobs = %SYSFUNC(ATTRN(&dsid, NOBS)) ;
   %LET dsid = %SYSFUNC(CLOSE(&dsid)) ;

WARNING: Format GENDER is already on the library.

This can be eliminated by excluding multiple declarations of the same VALUE statement in PROC FORMAT or multiple inclusions of the same format library.

ERROR: Unable to clear or re-assign the library MAC because it is still in use.

This message appears when compiled macros are used in a session and the program tries to free all libraries (which includes the macro library). The compiled macro library cannot be cleared until the SAS session is terminated.

   LIBNAME _ALL_ CLEAR ;

EXCLUSIONS

The following program statements may appear in the final report because of the SAS keywords they use, but it is understood that they are valid programming statements. Many error messages are Free Text (not titled as ERROR, WARNING, or NOTE). The macro does not exclude those, unless the user de-activates them, so as to prevent other error statements from slipping by. The SAS reserved keywords _ERROR_ and ERRORS appearing in source code are specially handled.

   MISSING m i u ;                                           [ in a data step ]
   IF x = . THEN DELETE ;                                    [ in a data step ]
   dashes = REPEAT ( "-", 20 ) ;                             [ in a data step ]
   IF type = "teen" & age > 19 THEN _ERROR_ = 1 ;            [ in a data step ]

The last statement sets the automatic variable _ERROR_ to 1, so that the current observation data is written to the log.

   OPTIONS ERRORS = 0 ;

This statement sets the maximum number of observations for which complete error messages are printed to the log, while processing continues.

   My Comment: This is a user comment that says Accuracy to avoid data entry ERROR.

This is a comment that started at an earlier line using "*", "%*", or "/*", and terminated at a later line using ";" or "*/".

READING THE LOG FILE

A SAS log file has a different record length for each line. Techniques for reading variable-length records should be used to avoid missing any of the information that needs to be captured. Also, SAS logs have many absolutely blank lines, without even a single space (just a carriage return). Reading the records as variable length avoids the following message in our own log.

NOTE: SAS went to a new line when INPUT statement reached past the end of a line.

KEYWORD REFERENCE FILES

The Keyword Reference files contain the most common keywords and text strings that appear in log messages. For flexibility, this reference is split into 3 files: KeyRef, UsrOpt, and UsrSpc. The Active field, in both the UsrOpt and UsrSpc files, gives the option to activate a keyword by setting it to 1 (Yes), or de-activate it by setting it to 0 (No). There is no need to delete any keyword. Techniques for reading variable-length records should be used when reading in the single words and text strings. All status events will be categorized into 6 sub-groups (ERROR, WARNING, NOTE, INFO, TEXT & CONTINUE).

1. _KeyRef

This is our master Keywords Reference file, with the most common and serious keywords (over 200). This file should have 1 field (Key) only. All keywords are mandatory, and are treated as ACTIVE = 1 (Yes). This file should be a Read Only file, changed only for an upgrade. The Administrator may add keywords to it that arise from working on a different platform or environment.

This file will be internally copied to a temporary KeyRef work file. It contains keywords such as:

   ABORT, AMBIGUOUS, APPARENT, ....., ERROR, INFO, WARNING, ....., YIELD.

2. _UsrOpt

This is the User Option file for exclusion of less serious keywords that may be ignored. Those messages do not affect mathematical calculations or the expected outcome. Some words are used in SAS options and other valid statements as well as in reporting errors in the log. Since it is impossible to predict how each user will use those words, they are included in this optional part. This file has 2 fields: Active (numeric, length 1) and Key (character, length 25). The optional keywords are initially set by default to ACTIVE = 1 (Yes), unless the user de-activates any of them by setting it to 0 (No). From this file, select only those keywords with Active = 1 (Yes). Keep only the keywords and drop the Active field. This file has the following optional keywords:

   0 OBSERVATIONS, 0 VARIABLES, ACCESS, ALREADY, CONVERTED, COPIED, DELETED, DUPLICATE, EXPIRED, MISSING, NO OBSERVATIONS, NO SORTING, NO VARIABLES, WHEN.

3. _UsrSpc

This is the User Specifics file for any additional keywords that the user wants to search for. The structure and logic of this file are identical to those of the UsrOpt file. From experience this feature is rarely used, because it obviously increases the log file size. A limit should be put on the maximum number of words that a user may activate; otherwise the log will be larger. A user may enter words in lower, upper, or mixed case. The macro will uppercase them. From this file, select only those keywords with Active = 1 (Yes). Keep only the keywords and drop the Active field. These are some sample keywords that I use for testing the functionality of this feature:

   RESOLVES, SECONDS.

CONSOLIDATING ALL KEYWORD FILES

Concatenate all 3 resulting files (KeyRef, UsrOpt & UsrSpc) into one temporary internal file, the KeyRef file. Sort this file, excluding duplicates. Now we have the complete, cleaned, and organized KeyRef file.
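The consolidation just described can be sketched as follows. This is an illustration under stated assumptions, not the KEYREF macro itself: it assumes the filerefs _KeyRef, _UsrOpt, and _UsrSpc are already assigned, that the two user files hold the Active flag in column 1 and the keyword starting in column 3 (a hypothetical layout), and it reads the files with the TRUNCOVER option rather than the two-step $VARYING technique used elsewhere in this paper. The macro name readopt and the dataset names are hypothetical.

```sas
/* Master file: one keyword per line, all treated as active. */
DATA keyref ;
   INFILE _keyref TRUNCOVER ;
   LENGTH key $25 ;
   INPUT key $CHAR25. ;
   key = UPCASE ( LEFT ( key ) ) ;
   IF key NE " " ;                      /* skip blank lines */
RUN ;

/* Hypothetical helper for the two user files.               */
/* Assumed layout: col 1 = Active (0/1), col 3-27 = keyword. */
%MACRO readopt ( fileref =, out = ) ;
   DATA &out ( KEEP = key ) ;
      INFILE &fileref TRUNCOVER ;
      LENGTH key $25 ;
      INPUT @1 active ?? 1. @3 key $CHAR25. ;
      key = UPCASE ( LEFT ( key ) ) ;   /* accept any case from the user */
      IF active = 1 AND key NE " " ;    /* keep active keywords only     */
   RUN ;
%MEND readopt ;

%readopt ( fileref = _usropt, out = usropt )
%readopt ( fileref = _usrspc, out = usrspc )

/* Concatenate the three sources and remove duplicate keywords. */
DATA allkeys ;
   SET keyref usropt usrspc ;
RUN ;

PROC SORT DATA = allkeys OUT = keyref NODUPKEY ;
   BY key ;
RUN ;
```

Dropping the Active field as soon as each user file is read keeps the three datasets identical in structure, so the concatenation and the NODUPKEY sort need no further housekeeping.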

CATEGORIZING OF STATUS EVENTS

Categorizing is done only on the captured error lines in the log. This logic will not need to be updated in the future when we add more keywords or when SAS adds new messages. SAS has 4 standard categories (ERROR, WARNING, NOTE & INFO) that precede the message text. A free TEXT category has been added for those messages that are not preceded by any SAS standard category. Many log messages are split across multiple lines, so a CONTINUE category is added to capture continuation lines. This gives 6 categories (see Appendix 2):

   (1) ERROR
   (2) WARNING
   (3) NOTE
   (4) INFO
   (5) TEXT
   (6) CONTINUE

PURPOSE OF PROGRAM

To scan and filter all log files located in the same directory for ERROR, WARNING, NOTE, INFO, and TEXT messages that may indicate issues with syntax or logic that need to be addressed in the source program. The extensive search engine uses over 200 keywords to ensure capturing ALL log status events. A separate report is produced for each processed log. Appendix 0 shows an overall project summary. Appendices 1 and 2 show sample output for 2 typical log files, one clean and one full of errors.

DESCRIPTION

The process is automated through a set of macros and keyword reference files. Every %MACRO statement uses the following options to store the compiled macro and give it a description.

   / STORE DES = "my description" ;

For efficiency, all macros constantly use:
- the DROP statement to drop unnecessary variables,
- PROC DATASETS to delete datasets that are no longer needed,
- LIBNAME and FILENAME statements to clear library and file refs that are no longer needed.
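The conventions listed above can be illustrated with a small stored compiled macro that does the housekeeping. This is a sketch only: the macro name tidyup, its parameters, and the folder c:\temp\macros are hypothetical and are not part of the macros presented in this paper.

```sas
/* Assumption: this folder exists and is writable. */
LIBNAME mac "c:\temp\macros" ;
OPTIONS MSTORED SASMSTORE = mac ;

/* Hypothetical housekeeping macro, stored with a description. */
%MACRO tidyup ( lib = WORK, ds =, fref = )
       / STORE DES = "Delete datasets and clear a fileref" ;
   /* Delete datasets that are no longer needed */
   PROC DATASETS LIBRARY = &lib NOLIST ;
      DELETE &ds ;
   QUIT ;
   /* Clear the fileref only if one was supplied */
   %IF %LENGTH(&fref) > 0 %THEN %DO ;
      FILENAME &fref CLEAR ;
   %END ;
%MEND tidyup ;

/* Usage: create two scratch datasets, then remove them. */
DATA zlog zerr ;
   x = 1 ;
RUN ;

%tidyup ( ds = zlog zerr )
```

The NOLIST option keeps PROC DATASETS from writing the library directory to the log, which is in keeping with the goal of a short log.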

(1) THE CALLING PROGRAM

Define the compiled macro library only ONCE in a session. Under the Microsoft Windows platform, the macro library cannot be deleted or redefined within the same session.

   %LET dsk = c ;
   %LET dir = %STR(&dsk:\_research\logzlog) ;
   %LET mac = %STR(&dir\macros) ;
   LIBNAME mac "&mac" ;
   OPTIONS MSTORED SASMSTORE = mac ;

Define the following directory and file locations:
- where the log files are located,
- where to save the validation output LOG_progname.LST,
- where to save a list _LogList.TXT that will capture the names of all identified log files,
- where the Keyword Reference files (_KeyRef.TXT, _UsrOpt.TXT, _UsrSpc.TXT) are located.

   %LET DirLog = %STR(&dir\output\tables\logs) ;
   %LET DirVld = %STR(&DirLog\logsvld) ;
   FILENAME _LogList "&DirVld\_LogList.txt" ;
   FILENAME _KeyRef "&dir\data\_keyref.txt" ;
   FILENAME _UsrOpt "&dir\data\_usropt.txt" ;
   FILENAME _UsrSpc "&dir\data\_usrspc.txt" ;

Execute the compiled macros and save the output.

   %process ;

(2) COUNTIT Macro

Count the number of observations in a dataset. It is used to count the number of keywords in the KeyRef file and the number of log files to be processed. This macro is very efficient, particularly for large datasets, since it does not read in every record of the file.

   %MACRO countit ( dsname = ) ;
      %GLOBAL nobs ;
      %LET dsid = %SYSFUNC(OPEN(&dsname, i)) ;
      %LET nobs = %SYSFUNC(ATTRN(&dsid, nobs)) ;
      %LET dsid = %SYSFUNC(CLOSE(&dsid)) ;
   %MEND countit ;

(3) KEYREF Macro

Build a Keyword Reference Array. Read the 3 Keyword Reference files, concatenate them, and sort, eliminating duplicates, if any. Find how many keywords there are. The purpose of creating the array of keywords is to make all of the keywords available at once alongside every line of the log file that we are going to read later. We can then compare the elements of the array, one at a time, until we find a match.

   %MACRO keyref ( keyref =, usropt =, usrspc = ) ;
      DATA &keyref ( DROP = char1 zrest varlen ) ;
         INFILE &keyref LENGTH = linelen ;
         INPUT @1 char1 $1. @ ;
         varlen = linelen - 1 ;
         INPUT @2 zrest $VARYING24. varlen ;
         LENGTH key $25 ;
         key = char1 || zrest ;
      DATA &usropt ; .... (Similar to KeyRef)
      DATA &usrspc ; .... (Similar to KeyRef)
      DATA &keyref ;
         SET &keyref &usropt &usrspc ;
      PROC SORT NODUPKEY OUT = &keyref ;
         BY key ;
      %countit ( dsname = &keyref ) ;
      %LET nkey = %EVAL(&nobs) ;
      DATA keyarray ( DROP = i key ) ;
         SET &keyref END = eof ;
         RETAIN m1-m&nkey ;
         ARRAY m (*) $25 m1-m&nkey ;
         DO i = 1 TO &nkey ;
            IF i = _n_ THEN m(i) = key ;
         END ;
         IF eof ;
   %MEND keyref ;

(4) GETLOGS Macro

By using the X command, you can capture the names of all the log files. The user has to provide only the directory where the logs have been saved, rather than each individual log file to be processed. Use the X command to execute DOS commands (for SAS under the Microsoft Windows platform). This captures a list of all log files in a directory and saves the list in a text file, _LOGLIST.TXT.

In Enterprise Guide, SAS Learning Edition 4.1, under MS Windows, the X command does not work. So, if you get the following message in the log, either execute the DOS commands directly at the DOS command prompt, or manually create and type the list of log files in _LOGLIST.TXT before running the macros.

WARNING: Shell escape is not valid in this SAS session.

   %MACRO getlogs ( dirname =, listname = ) ;
      OPTIONS NOXWAIT NOXSYNC ;
      X "dir/-p/b/o:n &dirname\*.log > &dirvld\&listname..txt" ;
      X "exit" ;
   %MEND getlogs ;

(5) LOGARRAY Macro

Count how many log files will be processed. Create an array of macro variables that contain the file names when resolved. The file is read in variable-length format, since file names in MS Windows can vary from short to long names. Exclude the .log file extension (4 characters) from the names.

   %MACRO logarray ( listname = ) ;
      DATA &listname ( DROP = char1 zrest varlen ) ;
         RETAIN linum 0 ;
         INFILE &listname LENGTH = linelen END = eof ;
         INPUT @1 char1 $1. @ ;
         varlen = linelen - 1 ;
         INPUT @2 zrest $VARYING100. varlen ;
         line = char1 || zrest ;
         LENGTH linum 3 ;
         linum + 1 ;
         line = SUBSTR ( line, 1, LENGTH ( line ) - 4 ) ;
         IF eof THEN CALL SYMPUT ( "nmember", PUT ( linum, 3. ) ) ;
      DATA _NULL_ ;
         SET &listname ;
         CALL SYMPUT ( "memno" || LEFT ( PUT ( linum, 3. ) ), TRIM ( line ) ) ;
   %MEND logarray ;

(6) GETMSG Macro

Get only the true error messages. The output file will have the same name as the log file, preceded by LOG_.

Because line lengths differ in log files, they have to be read in variable-record-length format. This technique reads the line in two steps. Initially, we read only the first character of the line, in order to capture the line length. Then we read exactly the remainder, which has the line length minus one character. Finally, we concatenate the two pieces to create the full input line. Reading the log file as fixed length will result in SAS skipping to the next line and the loss of input data.

For easy readability, a SAS log contains many completely blank lines that do not contain even one space. Those blank lines have to be completely excluded before processing. If you do not exclude them, you will get the following NOTE when reading the .log file into the ZLOG dataset.

NOTE: SAS went to a new line when INPUT statement reached past the end of a line.

Some blank lines in the original log will have just a sequence number. Those lines will also be excluded. The SAS log does not give a sequence number for each line within its own program. If you run a second program in the same session, line numbers continue from where the first program left off. So, we give each line an exact sequence number in its own file.

   %MACRO getmsg ( prg ) ;
      FILENAME zlog "&dirlog\&prg..log" ;
      %LET out = %STR(&dirvld\log_&prg..lst) ;
      DM OUTPUT 'FILE "&out" REPLACE' ;
      DATA zlog ( DROP = char1 zrest varlen ) ;
         RETAIN linum 0 ;
         INFILE zlog LENGTH = linelen ;
         INPUT @1 char1 $1. @ ;
         varlen = linelen - 1 ;
         INPUT @2 zrest $VARYING200. varlen ;
         line = char1 || zrest ;
         LENGTH linum 5 ;
         linum + 1 ;

Exclude blank lines from the input log file. Merge the KeyRef array with the log file.

      DATA zlog ;
         SET ;
         IF COMPRESS ( line ) = " " OR
            ( LENGTH ( COMPRESS ( line ) ) <= 8 AND
              VERIFY ( COMPRESS ( line ), "0123456789" ) = 0 )
         THEN DELETE ;
      PROC SQL ;
         CREATE TABLE zlog AS
         SELECT * FROM zlog, keyarray ;
      QUIT ;

Compare the array elements one by one to the line until the first match. Once you find the first match, stop and capture that line. Exclude Titles, Footnotes, Display Manager commands, Output Delivery System statements, and comment lines from the comparison. The SAS reserved words _ERROR_ and ERRORS are specially handled. Because the line is uppercased before the comparison, every literal it is compared with must be in uppercase as well.

      DATA zerr ( DROP = i m1-m&nkey lin start ) ;
         SET zlog ;
         ARRAY m ( * ) $ 25 m1-m&nkey ;
         lin = UPCASE ( line ) ;
         start = SUBSTR ( lin, 1, 4 ) ;
         DO i = 1 TO &nkey ;
            IF start IN ( "ERRO", "WARN", "INFO" ) OR
               ( INDEX ( lin, TRIM ( m{i} ) ) > 0
                 AND INDEX ( lin, "_ERROR_" ) = 0
                 AND INDEX ( lin, "ERRORS" ) = 0
                 AND INDEX ( lin, "TITLE" ) = 0
                 AND INDEX ( lin, "FOOTNOTE" ) = 0
                 AND INDEX ( lin, "DM" ) = 0
                 AND INDEX ( lin, "ODS" ) = 0
                 AND INDEX ( lin, "HTTP://" ) = 0
                 AND INDEX ( lin, "*" ) = 0
                 AND INDEX ( lin, "'" ) = 0
                 AND INDEX ( lin, '"' ) = 0 )
            THEN DO ;
               OUTPUT ;
               RETURN ;
            END ;
         END ;

Categorize the messages. This is applied only to the captured lines from the log.

      DATA zerr ( DROP = start ) ;
         SET ;
         start = UPCASE ( SUBSTR ( line, 1, 4 ) ) ;
         IF start = "ERRO" THEN cat = "ERROR" ;
         ELSE IF start = "WARN" THEN cat = "WARNING" ;
         ELSE IF start = "NOTE" THEN cat = "NOTE" ;
         ELSE IF start = "INFO" THEN cat = "INFO" ;
         ELSE IF start ^= " " AND INDEX ( start, ":" ) = 0 THEN cat = "TEXT" ;
         ELSE cat = "CONTINUE" ;

Count the number of captured lines with errors, in each category and overall. If the overall count is zero, print a message that the log is clean; otherwise, print all captured lines. Also print an overall project summary.

      %countit ( dsname = zerr ) ;
      %IF &nobs = 0 %THEN %DO ;
         DATA zerr ;
            line = REPEAT ( "_", 54 ) ; OUTPUT ;
            line = " This Log File is Clean of Any Status Event Errors " ; OUTPUT ;
            line = " " || REPEAT ( "_", 52 ) || " " ; OUTPUT ;
            LABEL line = " \ " ;
      %END ;
      PROC PRINTTO NEW PRINT = "&out" ;
      OPTIONS PAGENO = 1 ;
      PROC PRINT N NOOBS LABEL SPLIT = "\" DATA = zerr ;
         FORMAT line $132. ;
         TITLE "LogZLog: %UPCASE(&dirlog\&prg..log) - Details * &SYSDATE *" ;
      PROC PRINTTO ;
      RUN ;
   %MEND getmsg ;

(7) PROCESS Macro

Process all log file names that have been captured in the log array. Add the counts to the project overall report.

   %MACRO process ;
      %keyref ( keyref = _KeyRef, usropt = _UsrOpt, usrspc = _UsrSpc ) ;
      %getlogs ( dirname = &DirLog, listname = _LogList ) ;
      %logarray ( listname = _LogList ) ;
      %DO i = 1 %TO &nmember ;
         %getmsg ( &&memno&i ) ;
         PROC APPEND BASE = alllogs DATA = overall ;
      %END ;
   %MEND process ;

CONCLUSION

These macros build a powerful search and filter engine that uses a keyword reference file, compares it to each line of a log file, and produces a report that includes only those messages that need to be addressed in the source program code. The macros are generic and will apply to any log in any environment. The user has the option of excluding less serious keywords in the UsrOpt file, or including new keywords in the UsrSpc file.

VALIDATION

The macros and sample calls have been fully tested (on 1000's of projects) and validated using SAS Versions 9.2, 8.02, 6.12, and Enterprise Guide 4.1 software, under the Microsoft Windows platform. Fewer tests have been made on DEC VAX under VMS and in UNIX environments.

REFERENCES

SAS Macro Language, Reference, Ver. 9.1.
SAS SQL Procedure User's Guide, Ver. 9.1.
Base SAS Procedures Guide, Ver. 9.1.3, Vol. 1-4, Ed. 2.
SAS Language Reference, Concepts, Ver. 9.1.3, Vol. 1-2, Ed. 2.
SAS Language Reference, Dictionary, Ver. 9.1.3, Vol. 1-4, Ed. 3.
SAS Output Delivery System, User's Guide, Ver. 9.1, Vol. 1-2.
SAS Enterprise Guide 4.1, SAS Online Help (http://support.sas.com/)

AUTHOR

Adel Fahmy has been a SAS user for 25 years. He has worked as an Independent Consultant, CIO, Systems Director, and Sr. Statistical Programmer at major pharmaceutical companies, contract research organizations, and universities, teaching SAS to faculty and students. His special achievements include Generic Macros, Edit Checks, Database Design, Menu-Driven Systems, and Optimization Techniques. Adel has a BS in Mathematics, a Graduate Diploma in Systems Design, and an MS, M.Phil & Ph.D in Computer Science from Nottingham University, UK. He has published 11 papers in SAS conferences.

CONTACT

Your comments and questions are welcomed. The author can be contacted at:

   Adel Fahmy
   EMS: Adel.Fahmy@netzero.net
   Voice: (732) 422 03774

COPYRIGHTS

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Windows and DOS are registered trademarks of Microsoft Inc. VAX, VMS & UNIX are registered trademarks of Digital Equipment Corporation (DEC). Other brand and product names are registered trademarks or trademarks of their respective companies.

Appendix 0 (Overall Summary Output Report for a Full Project)

LogZLog: C:\_RESEARCH\LOGZLOG\OUTPUT\TABLES\LOGS - PROJECT OVERALL SUMMARY * 01MAY10 *

   Log File Name   ERROR  WARNING  NOTE  INFO  TEXT  CONTINUE  OVERALL
   T_0_0               0        0     0     0     0         0        0
   T_1_1              10       10    11     1     4         1       37
   ..........
   All Processed      10       10    11     1     4         1       37

Appendix 1 (Output Report for a Clean Log File)

LogZLog: C:\_RESEARCH\LOGZLOG\OUTPUT\TABLES\LOGS\T_0_0.LOG - OVERALL * 01MAY10 *

   This Log File is Clean of Any Status Event Errors

Appendix 2 (Output Report for a Log File Full of Errors)

LogZLog: C:\_RESEARCH\LOGZLOG\OUTPUT\TABLES\LOGS\T_1_1.LOG - OVERALL * 01MAY10 *

   Category     Count
   1_ERROR         10
   2_WARNING       10
   3_NOTE          11
   4_INFO           1   (Line 44)
   5_TEXT           4   (Line 27, 38, 39, 40)
   6_CONTINUE       1   (Line 136)
                =====
                   37

LogZLog: C:\_RESEARCH\LOGZLOG\OUTPUT\TABLES\LOGS\T_1-1 ISS.LOG - DETAILS * 01MAY10 *

   Linum  Captured Lines in the Log
      27  missing M P N T;
      38  A sharing violation occured while accessing.
      39  Connection Failed. See log for details.
      40  SYMBOLGEN: Some characters in the above value which were subject to macro quoting have been unquoted for printing.
      44  INFO: Character variables have defaulted to a length of 200 at the places given by: (Line):(Column). Truncation may result.
      52  NOTE: BY-line has been truncated at least once.
      53  NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      56  NOTE: Enter RUN; to continue or QUIT; to end the procedure.
      57  NOTE: Extraneous information on %END statement ignored.
      70  NOTE: LINE and COLUMN cannot be determined.
      71  NOTE: LOST CARD.
      77  NOTE: MERGE statement has more than one data set with repeats of BY values.
      89  NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
      97  NOTE: The SAS System stopped processing this step because of errors.
      98  NOTE: Unable to open SASUSER.REGSTRY. WORK.REGSTRY will be opened instead.
      99  NOTE: Variable LINENUM is uninitialized.
     113  WARNING: Length of character variable has already been set.
     114  WARNING: Missing %MEND statement.
     115  WARNING: Multiple lengths were specified for the BY variable soc_txt by input data sets. This may cause unexpected results.
     123  WARNING: 2 observations omitted due to missing ID values.
     124  WARNING: Shell escape is not valid in this SAS session.
     128  WARNING: The data set WORK.ZERR may be incomplete. When this step was stopped there were 0 observations and 2 variables.
     129  WARNING: The data set SAVE.SMDDOS was only partially opened and will not be saved.
     130  WARNING 32-169: The quoted string currently being processed has become more than 262 characters long.
     136  You may have unbalanced quotation marks.
     140  WARNING: This CREATE TABLE statement recursively references the target table. A consequence of this is a possible data integrity
     141  WARNING: This SAS global statement is not supported in PROC SQL. It has been ignored.
     156  ERROR: A lock is not available for TMP.VS.DATA, lock held by another process.
     164  ERROR: Expected %THEN statement not found. A dummy macro will be compiled.
     166  ERROR: Expected semicolon not found. The macro will not be compiled.
     171  ERROR: Invalid macro name ;. It should be a valid SAS identifier no longer than 32 characters.
     175  ERROR: More positional parameters found than defined.
     176  ERROR: No BY statement used or no BY variables specified. A BY statement must be used with variable names to sort on.
     177  ERROR: Open code statement recursion detected.
     178  ERROR: Read Access Violation In Task ( Language Processor )
     192  ERROR: The %DO statement is not valid in open code.
     196  ERROR 71-185: The COMPRESS function call does not have enough arguments.