Tips for Producing Customized Graphs with SAS/GRAPH Software. Perry Watts, Fox Chase Cancer Center, Philadelphia, PA

Abstract *

SAS software is used to produce customized graphics displays by solving a set of related problems. First, the problem of selectively displaying labels at unevenly spaced intervals along a horizontal axis is solved by invoking the FORMAT procedure from PROC GPLOT. Next, a behavior that gets in the way of generating the first graph is put to use for segmenting an axis to highlight the presence of outlying values in a displayed data set. A third, more involved problem deals with overlapping labels along a midpoint axis. This time the problem is solved by invoking PROC FORMAT from an ANNOTATE data set rather than from the graphics procedure itself. Text distortion resulting from mapping related graphs to oblong templates is the fourth problem addressed in the presentation. A simple, external solution sends graphics output to Computer Graphics Metafiles (CGMs) containing scaled TrueType fonts. Unfortunately, CGMs have their own set of limitations, which must be addressed before the developer can take full advantage of the file format. The presentation concludes by showing how to get one symbol instead of the usual three to identify subgroups in a LEGEND statement. The solution involves a manipulation of the width parameter in the legend's SHAPE clause.

Axes Labels at Unevenly Spaced Intervals

Occasionally it is necessary to produce a graph with an axis containing major tick labels positioned at unevenly spaced intervals. Unfortunately, there is no feature in SAS/GRAPH that will automatically handle this situation. One can use the order clause in the AXIS statement to select a subset of intervals, but SAS will plot them as if they were evenly spaced. Using a unit interval, on the other hand, will only clutter the graph and confuse the viewer, since all major tick marks are labeled by default.
The serial development of the graph in Figures 1-3 shows how a simple application of PROC FORMAT solves the problem. In Figure 1, the measurements are placed at an accurate distance from each other, but the graph is cluttered, and it is difficult for the viewer to link the horizontal coordinates to the data display located relatively high up on the response axis. Figure 2 simplifies the graph but distorts the time interval, whereas Figure 3 solves the problem by invoking PROC FORMAT from GCHART to selectively label major tick marks. Figure 3 could be further enhanced by removing unlabeled ticks with a graphics editor. Here is the code for PROC FORMAT:

   proc format;
      value weekfmt = = = = = 8= 8 = = = + other= ;
   run;

* This publication was supported by Grant 7- from the National Institute for Allergies and Infectious Diseases, NIAID.

Segmenting an Axis for Outliers

In the original abstract for this paper, SKIPMISS was listed as a means for segmenting an axis in order to emphasize the presence of outlying values in a data set. The current solution bypasses SKIPMISS by adding a small amount of code to an existing ANNOTATE data set used for labeling data points in the graph about ODC activity displayed in Figure 4. While the unevenly spaced intervals in Figure 2 are defined as a problem, they become an asset in the ODC graph, because this time the data values should not be proportionately spaced. As in Figure 2, the order clause in the AXIS statement is used to generate the tick marks. There is also no need for PROC FORMAT in the ODC graph, but both axes are modified slightly to accommodate the hatch mark with its intervening space in the response axis. First, the offset value for the horizontal axis is set to zero with offset=(0,0). Setting the horizontal offset to zero ensures that the value for the x-coordinate in the vertical axis is truly zero. Then the hatch mark, which is a rotated equal sign (=), is created.
The label command in ANNOTATE is used for drawing the hatch mark, and move and draw commands place a thick white line between the two strokes of the equal sign, directly over the response axis. Using when='a' (after) in ANNOTATE also ensures that the hatch mark will be the last item plotted on the page, so it goes over, not under, an already formed axis.

Creating Visible Axes Labels

Ordinarily each bar in a histogram is automatically labeled with its midpoint value in the GCHART procedure (SAS/GRAPH Software Volume II 7).

This practice is satisfactory when a single bar summarizes data for a range of values. However, when the range is one, as it is in Figure 5, the numbers are bound to overlap regardless of the point value assigned to the font. This problem is magnified for multiple plot displays where numbers must be made even larger for terminal viewing. As in the segmented axis example, ANNOTATE is used for obtaining a customized axis. Notice that it provides the developer with complete control over what values are plotted along the axis. Zero to five, for example, is a longer interval than five to, and is the maximum value for all the runs in set #. Code for the ANNOTATE data set that is needed for drawing the midpoint axis (maxis) in the mutation graph, along with anfmt which provides spacing between the numbers, is shown below:

   proc format;
      value anfmt ='' ='' ='' ='' ='' ='' ='' ='' ='' ='' other='';

   data anno;
      length function color $8;
      length text $;
      length position $;
      retain color 'black' xsys '' ysys '' hsys '';
      do i= to ;
         chi=left(put(i,anfmt.));
         if(chi ne '') then do;
            function='label'; x=i; y=; size=&hh; position='e'; text=chi;
            output;
         end;
      end;
      stop;
   run;

The format, anfmt, used in anno above displays text; it does not manage data in the cell mutation graph. The data are still managed from the AXIS statement's order clause, which processes the full range of values in unit steps. Here is the code for axis which governs the data:

   axis label=none order=( to by ) major=none minor=none value=none;

Unlike the Baseline graph displayed in Figure 3, PROC FORMAT cannot be used for both axis display and data management in the cell mutation graph. Values for intervening mutation numbers exist in Figure 5, whereas only labeled weeks from baseline contain data values in Figure 3.
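The pieces described above can be assembled into a minimal, untested sketch. The numeric values in the listing were not preserved, so every number below (the 0 to 30 range, the labeled values, the text height of 3) is an assumption, as are the data set CELLS and its variable MUTATIONS. The sketch also addresses bars with the MIDPOINT annotate variable and XSYS='2', which is one common way to position labels in a vertical bar chart and may differ from the coordinate systems used in the original program.

```sas
/* Hypothetical input: one record per cell, MUTATIONS = mutation count */
data cells;
   do mutations = 0 to 30;
      do k = 1 to max(1, 20 - abs(mutations - 8));
         output;
      end;
   end;
   drop k;
run;

/* Label only selected values; all others format to a blank */
proc format;
   value anfmt 0='0' 5='5' 10='10' 15='15' 20='20' 25='25' 30='30'
               other=' ';
run;

/* One LABEL observation per labeled midpoint */
data anno;
   length function color $8 text $4 position $1;
   retain color 'black' xsys '2' ysys '2' hsys '3' when 'a';
   do i = 0 to 30;
      text = left(put(i, anfmt.));
      if text ne ' ' then do;
         function = 'label';
         midpoint = i;      /* bar to label (data coordinates)     */
         y = 0;             /* baseline of the response axis       */
         size = 3;          /* text height in HSYS='3' units       */
         position = 'E';    /* centered, half a cell below y       */
         output;
      end;
   end;
   keep function color text position xsys ysys hsys when midpoint y size;
run;

/* Suppress the automatic midpoint labels; the footnote stands in for
   the axis label and leaves room beneath the axis */
axis1 label=none value=none;
footnote1 j=c '#Mutations';

proc gchart data=cells;
   vbar mutations / discrete maxis=axis1 annotate=anno;
run;
quit;
```

Because the format is applied only inside the DATA step, the chart variable itself stays unformatted, so every bar is still drawn even where no label is written.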
In fact, Figure 6 shows what happens when anno is removed from the cell mutation program, and axis is altered to display formatted data.

Correcting Text Distortion

The next example extends the cell mutation graph above by mapping three similar plots to oblong templates so that they can be viewed together on a single page. Again, PROC GCHART is used for creating histograms having unit ranges along a midpoint axis. The same process of invoking PROC FORMAT from an ANNOTATE data set is also used to generate axis numbers, but the axis is no longer linear in scale. The presence of a nonlinear scale is noted by the fact that many more than 9 bars appear between sites and in the Vkappa Shannon plot displayed in Figures 7 and 8. Every tenth bar is highlighted so that the viewer can get an accurate count of the number of sites in a given plot. As shown below, anfmt, which numbers the Vkappa axis, easily manages the nonlinear scale:

   value anfmt ='' ='' ='' ='' ='' ='' ='' 7='7' 8='8' 9='9' other=' ';

The ANNOTATE data set for the Shannon plots, on the other hand, is a bit more complicated than the one displayed earlier, because tick marks need to be simulated. PROC GCHART does not support tick marks along the midpoint axis, but the presence of a nonlinear scale in the Shannon plots requires them, so they are inserted with a vertical slash (|) mark:

   data anno; /* For X axis values,ticks */
      length function color $8;
      length text $;
      length position $;
      retain color 'black' xsys '' ysys '' hsys;
      i=;
      do while (i le &maxx.);
         chi=left(put(i,anfmt.));
         if(chi ne ' ') then do; /*major ticks*/
            function='label';x=i;y=;size=.7;
            position='e';text='|';output;
            function='label';size=8;position='e';
            text=chi;output;
         end;
         else do; /*minor ticks*/
            function='label';x=i;y=;size=.;
            position='e';text='|';output;
         end;
         i+;
      end;
      stop;
   run;

Position E in both ANNOTATE data sets centers the text a half cell below the y-coordinate. This way numbers and tick marks are separated from the horizontal axis itself.
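As a rough, untested sketch of the tick-mark simulation, the DATA step below mirrors the structure of the listing: a taller vertical bar plus a number at each labeled position, and a shorter bar everywhere else. The format TICKFMT, the data set name ANNOTICK, the value of MAXX, and all text sizes are assumptions chosen for illustration, and MIDPOINT with XSYS='2' is assumed for addressing the bars. The exact vertical placement of bars and numbers depends on the sizes chosen and would likely need adjusting on a real device.

```sas
%let maxx = 40;   /* assumed number of positions on the midpoint axis */

/* Assumed labeling scheme: every tenth position gets a number */
proc format;
   value tickfmt 0='0' 10='10' 20='20' 30='30' 40='40' other=' ';
run;

data annotick;
   length function color $8 text $4 position $1;
   retain color 'black' xsys '2' ysys '2' hsys '3' when 'a'
          function 'label' position 'E' y 0;
   do i = 0 to &maxx;
      chi = left(put(i, tickfmt.));
      midpoint = i;
      if chi ne ' ' then do;
         /* major tick: a taller bar, then the number itself */
         text = '|'; size = 2.5; output;
         text = chi; size = 3;   output;
      end;
      else do;
         /* minor tick: a shorter bar only */
         text = '|'; size = 1.2; output;
      end;
   end;
   drop i chi;
run;

/* Inspect the first few annotate observations */
proc print data=annotick(obs=12);
run;
```

The resulting data set would be passed to the chart with an ANNOTATE= option on the VBAR statement, in the same way as any other annotate data set.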
Figure 7 shows the graph that is printed directly from SAS in a Microsoft Windows environment. Axes numbers become distorted when a single graph is compressed along one axis but stretched out on another.

On the other hand, the title, labels, and footnotes are not so distorted in Figure 7, because they are displayed from a larger fourth template which is less oblong in shape. Figure 8 shows a corrected version of the Shannon graphs. Fortunately, CGMs support scalable fonts which automatically correct the distortion. Otherwise, a developer would be faced with the impossible task of placing each number on all sets of axes from the vantage point of the larger fourth template. Correcting the distortion with a CGM only involves changing a SAS program's default font from SWISS to HWCGM, and updating the GOPTIONS statement with device=cgmmwc and gsfname=cgmname.

CGM Limitations

Unfortunately, there are a number of features that limit the effectiveness of CGMs for displaying SAS/GRAPH output. First, the file format is not available across all platforms, and Microsoft Word doesn't support it in the Macintosh environment. Secondly, the line width clause is not supported outside of SAS. A close examination of the figures in this article, for example, shows that all axes lines have very narrow widths. Error bars in some of the graphs are thicker, because they are generated in ANNOTATE, which relies on size rather than width for line width. Possibly the failure to uniformly translate line width to the external environment can be attributed to the fact that units are not specified as a parameter for the width clause (SAS/GRAPH Software Usage 9).

Sometimes axes labels will be truncated when a graph is written to a CGM file. A clumsy fix for this problem involves inserting a blank footnote into a graph in the same manner suggested for moving an axis frame away from a graph border (SAS/GRAPH Software Usage 9). On some platforms, however, null or blank footnotes will be ignored by the SAS compiler, and the truncation problem won't be fixed. All one has to do in this instance is to add a footnote in the graph's background color containing the characters BAD FIX.
Then axes labels will be fully displayed. The truncation problem described here emphasizes the lack of control a SAS developer has over the space surrounding the procedure output area.

CGMs also ignore the implicit carriage return that SAS software inserts for multiple line axes labels. For example, when the target device for the SAS program which generates the graph in Figure 9 is set to winprtc, the following simple code will produce multiple line labels for the horizontal axis:

   value= (tick= h= j=c ' ' j=c 'Pathologist'
           tick= h=. j=c 'Mean' j=c ' ' )

The word Pathologist is printed out under a blank line whereas Mean appears over the numbers and. The SAS compiler automatically inserts a carriage return every time it encounters a justification (j) clause. Output is then formatted properly when it goes directly to the screen. However, if graphics output is redirected to a CGM, carriage returns must be added to obtain multiple line axes labels. The altered code below is anything but straightforward:

   /* First create a carriage return character as a macro variable */
   data _null_;
      cr=byte();
      call symput('cr',put(cr,$.));
      stop;
   run;

   /* Next, insert cr into the code and color it white so that it is not
      displayed as a small box in the output. Add three justification
      clauses (emboldened j= characters below) to center justify initial
      characters, and lastly make sure that the color of the text to be
      seen is black. */
   value= (tick= h= j=c c=white "&cr" c=black j=c 'Pathologist'
           tick= h=. j=c c=black "Mean" j=r c=white "&cr" c=black j=c ' ' )

In addition to generating a carriage return, the byte function shown above is also used for creating the math symbol, ±, in Figure 8. A comparison with Figure 7 shows that this is one time when Microsoft Word is superior to SAS in its display of special symbols.

Displaying One Symbol in a Legend

A discussion of symbols brings us to the last coding example in this presentation.
If values are not specified for the shape clause in a LEGEND statement, width and height are set to their default values (in cells). Invariably the default width will produce the three-symbol legend that is shown in Figure 9. Setting width to a small fractional value generates the single-symbol display shown in Figure 10. The term width in the LEGEND statement does not refer to the width of the symbol per se. Because the symbol is proportionate, height takes care of symbol width as well. Instead, width refers to the amount of space allotted for each legend value in the output (SAS/GRAPH Software Volume ). If width is a small value, only one symbol will be printed.
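A minimal sketch of the width manipulation, assuming a hypothetical data set DEMO with two subgroups, might look like the following. The width of 0.5 and height of 1 are illustrative guesses, not the values used for the figures in this paper, and the code has not been run against the original data.

```sas
/* Hypothetical data: two subgroups plotted over the same x values */
data demo;
   do group = 'A', 'B';
      do x = 1 to 5;
         y = x + (group = 'B') * 2;
         output;
      end;
   end;
run;

/* An explicit color on each SYMBOL statement keeps SAS from cycling
   through the color list before moving on to the next symbol */
symbol1 value=dot    interpol=join color=black;
symbol2 value=circle interpol=join color=black;

/* A narrow width (0.5 cell here, an assumed value) leaves room for
   only one plot symbol per legend entry */
legend1 shape=symbol(0.5, 1) label=none;

proc gplot data=demo;
   plot y*x=group / legend=legend1;
run;
quit;
```

Removing the SHAPE= option from LEGEND1 and rerunning the step is a quick way to compare the default legend with the narrowed one.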

Summary and Conclusions

While five different graphics problems have been described in this paper, the smaller number of techniques that are used to solve them must be judiciously applied. For example, the same application of PROC FORMAT works well for spacing intervals and preventing overwrites, but FORMAT alone cannot be used to display data above a blank axis label. ANNOTATE is needed in such situations to preserve the integrity of the data. Again, even though an uneven order clause is essential for solving a segmented axis problem, it gets in the way of accurately displaying data at unevenly spaced time intervals. CGMs also don't provide universal solutions to text formatting problems. Nevertheless, the provision of a scaled font improves the quality of multiple graphics displays, and CGMs are very easy to insert into Microsoft Word documents and PowerPoint presentations.

References

Microsoft Corporation. User's Guide Microsoft Word: The World's Most Popular Word Processor, Version.. Microsoft Corporation, 99.

Microsoft Typography. Features of TrueType. TrueType fonts. December 99. <http://www.microsoft.com/truetype/what/ttfonts.htm> ( March 997).

SAS Procedures Guide, Version, Third Edition. Cary NC: SAS Institute Inc., 99.

SAS/GRAPH Software, Usage, Version, First Edition. Cary NC: SAS Institute Inc., 99.

SAS/GRAPH Software, Volumes and, Reference, Version, First Edition. Cary NC: SAS Institute Inc., 99.

Figure 1. A display of unit intervals. [Plot of Value against Weeks from Baseline.]

Figure 2. A display of selected intervals, but time is distorted. [Plot of Value against Weeks from Baseline.]

Figure 3. A display of unevenly spaced time intervals. [Plot of Value against Weeks from Baseline.]

Figure 4. A segmented response axis emphasizes the presence of outlying values in a data set. [Plot titled "ODC Activity vs Differentiation Grade"; response axis: ODC Activity; horizontal axis: Differentiation Grade (Well, Mod, Mod-Poor, Poor).]

Figure 5. Numerical values for the number of cell mutations are placed sufficiently far apart from each other to prevent overlapping. [Histogram titled "Number of Cells by Mutation for Selected Runs from Set # (Grouped by Burst Size)"; response axis: #Cells; midpoint axis: #Mutations.]

Figure 6. Incorrect results are displayed when main axis values alone are formatted in PROC GCHART. [Histogram titled "Number of Cells by Mutation for Selected Runs from Set # (Grouped by Burst Size)"; response axis: #Cells; midpoint axis: #Mutations.]

Figure 7. Shannon Plots produced directly from SAS with targetdevice=winprtc. [Three panels: VAlpha, VBeta, VKappa; response axis: Shannon (H ± s) Measure of Diversity; horizontal axis: Site (Kabat-Wu Numbering).]

Figure 8. Text distortion is eliminated by sending output to a CGM file. [Same three panels and axes as Figure 7.]

Figure 9. Multiple line axes labels as well as multiple legend symbols are displayed. [Plot titled "% PreOp ChemoRT vs None in Pancreatic Cancer By Pathologist"; legend: NO ChemoRT, ChemoRT; horizontal axis: Pathologist, Mean, Median, STD, MIN, MAX, Range.]

Figure 10. The graph is improved by having only one symbol in the legend. [Same plot as Figure 9.]