UTILIZING SAS TO CREATE A PATIENT'S LIVER ENZYME PROFILE

Erik S. Larsen, Price Waterhouse LLP

In pharmaceutical research and drug development, it is usually necessary to assess the safety of the experimental drug. When the experimental medicine enters the digestive system, it may cause the body, particularly the liver, to produce enzymes which counteract the agent. The liver produces several enzymes (SGOT, SGPT, alkaline phosphatase and bilirubin) which tend to elevate when nonsteroidal anti-inflammatory drugs (NSAIDs) such as aspirin enter the system.

Patients who participate in clinical trials for NSAIDs must have blood tests at different time points throughout the trial, and liver enzymes are measured at each of these time points. The thought is that if the experimental medicine does not cause the liver enzymes to elevate, the product is safe, at least as far as the liver is concerned.

A line graph is a useful way to display the liver enzyme levels at the different time points. One problem is that SGOT, SGPT and alkaline phosphatase are measured on one scale, whereas bilirubin is measured on another. Fortunately, SAS provides all of the tools necessary to display all of the enzyme levels on one page, which is how the Food and Drug Administration prefers to see the information.

An example, using both SAS macros and SAS/GRAPH, demonstrates how the graphs of all enzymes can be displayed at once in a clear, easy-to-read format. The following fictitious data supplied the points for the graph and are assumed to be already in a SAS dataset:

    SUBJECT    DAY  ENZYME  AGE  SEX  THER  VAL  VAL2
    XXXX-YYY    27  SGOT     48  M    A     0.5    .
    XXXX-YYY    27  SGPT     48  M    A     0.6    .
    XXXX-YYY    27  ALK      48  M    A     0.4    .
    XXXX-YYY    27  BILI     48  M    A      .    1.4
    XXXX-YYY    20  SGOT     48  M    A     0.4    .
    XXXX-YYY    20  SGPT     48  M    A     0.5    .
    XXXX-YYY    20  ALK      48  M    A     0.3    .
    XXXX-YYY    20  BILI     48  M    A      .    1.7
    ... and so on.
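The paper does not show how the VAL and VAL2 columns in the listing were produced. As a minimal sketch only, the step below derives them from a hypothetical raw dataset LABS that holds one row per subject, day and enzyme. The dataset name LABS and the variables LABVAL (raw laboratory reading) and UNL (upper normal limit) are assumptions made for illustration; the output name GRAF and the enzyme code 'BILI' follow the listing and the procedures used in this paper. VAL expresses a reading as a multiple of its upper normal limit, and VAL2 carries the raw bilirubin reading.

```sas
/* Sketch only: LABS, LABVAL and UNL are hypothetical names.          */
data graf;
   set labs;
   if enzyme = 'BILI' then do;
      val  = .;                  /* bilirubin is not plotted as x UNL  */
      val2 = labval;             /* raw bilirubin reading              */
   end;
   else do;
      if unl > 0 then val = labval / unl;   /* e.g. 40 / 20 = 2.0      */
      else val = .;              /* no usable UNL: leave VAL missing   */
      val2 = .;                  /* VAL2 is reserved for bilirubin     */
   end;
   drop labval unl;
run;

/* The plotting steps expect the data in SUBJECT and DAY order. */
proc sort data=graf;
   by subject day;
run;
```

Keeping the two measures in separate variables, each missing where the other applies, is what lets one dataset feed two plots with different vertical scales.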
The DAY variable gives the study day on which the patient had a blood test: negative numbers denote the screening period, zero indicates the baseline visit, and values greater than zero are days since the first administration of the experimental drug. Note that a patient may not have all of the enzyme readings at each visit.

The THER variable is the therapy code, where 'A' is formatted to 'Group A' and 'B' is formatted to 'Group B'. The patient's therapy does not change throughout the trial. The SEX variable is self-explanatory, where 'F' is formatted to Female and 'M' to Male.

VAL and VAL2 are the data points on the graph. VAL is calculated prior to the macro and expresses the actual laboratory value in terms of the upper normal limit (UNL). For example, if the SGOT value is 40 and the UNL is 20, then VAL would have a value of 40/20 = 2.0 (x UNL). Variable VAL2, on the other hand, contains the actual bilirubin lab reading. The data are manipulated such that VAL is missing for bilirubin readings and, likewise, VAL2 is missing for SGOT, SGPT, and alkaline phosphatase observations.

The dataset is sorted by SUBJECT and DAY and is sent to an algorithm which selects the minimum and maximum days for each patient and outputs them to a dataset
which is used to obtain the key values for the horizontal axis scaling for the DAY variable. The algorithm always creates 10 major ticks on the axis from the maximum and minimum day. SAS macro variables hold the MIN, MAX, INT (interval) and LAST (last day on therapy) values and are created in the following code:

    DATA _NULL_;
       SET MINMAX;
       CALL SYMPUT('MAX',  PUT(XMAX, 4.));
       CALL SYMPUT('MIN',  PUT(XMIN, 4.));
       CALL SYMPUT('INT',  PUT(INTERVAL, 4.));
       CALL SYMPUT('LAST', PUT(LASTDAY, 4.));
    RUN;

The macro is called %PLOTIT and takes the parameter SUBJECT (the identification number of the subject whose profile is wanted). After running the DATA _NULL_ step above to obtain the information for the horizontal axis, the macro defines the symbols, whose values were chosen arbitrarily, as shown below. These could have been defined outside of the macro, but were left inside to enhance the readability of the code.

    SYMBOL1 INTERPOL = JOIN COLOR = BLACK VALUE = CIRCLE;
    SYMBOL2 INTERPOL = JOIN COLOR = BLACK VALUE = SQUARE;
    SYMBOL3 INTERPOL = JOIN COLOR = BLACK VALUE = TRIANGLE;
    SYMBOL4 INTERPOL = JOIN COLOR = BLACK VALUE = DIAMOND;
    SYMBOL5 INTERPOL = JOIN COLOR = BLACK HEIGHT = 4 VALUE = DIAMOND;

Note that there are five symbols and only four liver enzymes. In order to plot the four enzymes on two different scales, a 'dummy' symbol is necessary so that all four enzymes appear on the legend. This is why the data are manipulated to have a VAL and a VAL2 variable. VAL is missing for bilirubin so that bilirubin still appears as a level of ENZYME, and therefore on the legend, which is part of the bottom graph, without being plotted there. The reverse holds for VAL2: bilirubin is the only enzyme with a value. This is the actual bilirubin reading which, unlike the other three enzymes, is shown in the upper portion of the graph in Appendix A. The plot shown in Appendix A is actually two graphs which are superimposed onto one page.
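The algorithm that builds the MINMAX dataset read by the DATA _NULL_ step is described above but not listed. The following is a minimal sketch of one way it could work, not the author's code. It assumes the plot data are in GRAF, sorted by SUBJECT, and it reuses the variable names XMIN, XMAX and INTERVAL from the DATA _NULL_ step. DAYRANGE, RAWMIN, RAWMAX and NTICKS are hypothetical names. LASTDAY is left out on purpose because the paper does not say how the last day on therapy is determined.

```sas
/* Sketch only: one row per subject with the observed range of DAY. */
proc means data=graf noprint;
   by subject;
   var day;
   output out=dayrange(drop=_type_ _freq_) min=rawmin max=rawmax;
run;

data minmax;
   set dayrange;
   nticks = 10;                      /* 10 major ticks = 9 intervals  */
   /* Start with the smallest whole-day interval that could fit.     */
   interval = max(1, ceil((rawmax - rawmin) / (nticks - 1)));
   /* Anchor the axis at a multiple of the interval at or below the  */
   /* earliest day.                                                  */
   xmin = floor(rawmin / interval) * interval;
   /* Widen the interval until the 10th tick reaches the latest day. */
   do while (xmin + (nticks - 1) * interval < rawmax);
      interval = interval + 1;
      xmin = floor(rawmin / interval) * interval;
   end;
   xmax = xmin + (nticks - 1) * interval;
   keep subject xmin xmax interval;
run;
```

With these values, ORDER = (&MIN TO &MAX BY &INT) always yields exactly 10 major ticks. Because this sketch keeps one row per subject, the step that creates the macro variables would have to read only the row for the requested subject, and LASTDAY would have to be merged in from whatever source records the final day on therapy.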
Some thought was given to using just one GPLOT procedure with a PLOT2 statement, but because there were two different y-axis scales (upper normal limits vs. actual values) and because the bilirubin readings would make the graph more difficult to read, it was decided that two plots were necessary. The legend was created using the following code:

    LEGEND1 CBORDER  = BLACK
            MODE     = PROTECT
            ACROSS   = 1
            LABEL    = NONE
            VALUE    = ('Alkaline Phosphatase' 'SGOT/AST'
                        'SGPT/ALT' 'Total Bilirubin')
            POSITION = (TOP INSIDE RIGHT)
            SHAPE    = SYMBOL(5, 2)
            OFFSET   = (-3,-1);

The legend contains all of the enzymes and is produced as part of the bottom graph. SAS issues a warning that there are missing values for the VAL*DAY=ENZYME plot request because bilirubin always has a missing value for VAL. This is acceptable, and it allows all of the enzymes to appear on one legend. The offset moves the legend up a small amount so that it appears to sit between the two graphs.
The first axis definition, which follows, is the bottom graph's left vertical axis, which is scaled in multiples of the upper normal limit. The second axis (AXIS2) is the horizontal axis for the bottom graph and uses the macro variables that were created with the DATA _NULL_ step. Note that the label DAY appears as an axis label at the bottom of the graph.

    AXIS1 COLOR  = BLACK
          ORDER  = (0 TO 11 BY 2)
          ORIGIN = (0.5 IN,)
          OFFSET = (0,0)
          MINOR  = (NUMBER=1)
          LABEL  = NONE
          VALUE  = (FONT=COMPLEX H=2.5);

    AXIS2 COLOR  = BLACK
          ORDER  = (&MIN TO &MAX BY &INT)
          MINOR  = (NUMBER=1)
          LABEL  = (FONT=COMPLEX H=3 'DAY')
          VALUE  = (FONT=COMPLEX H=2.5);

The other two axes are for the upper portion of the graph. The third axis (AXIS3) is simply the left vertical axis, which identifies the bilirubin reading. Notice that aside from the axis range (ORDER), it is identical to the AXIS1 statement for the upper normal limits. The fourth axis has an interesting feature. It uses the same macro variables as the bottom horizontal axis, but the MAJOR, MINOR, LABEL and VALUE options are set to NONE. This, along with the STYLE=0 option, makes the horizontal axis for the upper portion of the graph invisible, although it remains on the same scale. Setting STYLE to zero is what removes the axis line. This axis definition creates the illusion that there is only one graph on the page, with a break in the vertical axis.

    AXIS3 COLOR  = BLACK
          ORDER  = (0.0 TO 2.5 BY 0.5)
          ORIGIN = (0.5 IN,)
          OFFSET = (0,0)
          MINOR  = (NUMBER=1)
          LABEL  = NONE
          VALUE  = (FONT=COMPLEX H=2.5);

    AXIS4 MAJOR  = NONE
          ORDER  = (&MIN TO &MAX BY &INT)
          MINOR  = NONE
          STYLE  = 0
          LABEL  = NONE
          VALUE  = NONE;

The program is now ready to start plotting the values. First, however, the following GOPTIONS statement is executed to suppress the printing of the graphs and remove the borders.

    GOPTIONS NOBORDER NODISPLAY GUNIT=PCT;

The first (or bottom) GPLOT statement is shown below.
    PROC GPLOT DATA=GRAF GOUT=GRAFCAT UNIFORM;
       TITLE1 C=BLACK H=2 F=COMPLEX ' ';
       PLOT (VAL)*DAY=ENZYME / LEGEND = LEGEND1
                               VAXIS  = AXIS1
                               HAXIS  = AXIS2
                               VREF   = (1 2 3 8)
                               LVREF  = 2
                               HREF   = (0 &LAST)
                               LHREF  = 1;
       NOTE LANGLE=90 MOVE=(0.25 IN, 2 IN) C=BLACK F=COMPLEX
            'SGOT/SGPT/Alk Phos x UNL';
       WHERE SUBJECT="&SUBJECT";
    RUN;

Note that the GOUT=GRAFCAT option writes the graph to a temporary graph catalog to be used later in a GREPLAY procedure. Also of note are the VREF, LVREF, HREF, and LHREF options on the PLOT statement. These produce the horizontal dashed reference lines and vertical solid reference lines which were requested by the FDA. The horizontal dashed reference lines indicate critical upper normal limit levels (e.g., 3 times the UNL). The solid vertical reference lines mark the period during which the patient was taking medication (baseline and final day on therapy). The baseline day (0) is hardcoded, and the &LAST macro variable contains the final day on therapy. The literal in the WHERE statement is enclosed in double quotes so that &SUBJECT is resolved.
Finally, notice the TITLE1 statement that is just a blank line. This removes any titles (such as The SAS System) which may appear by default. Also of note is that the UNIFORM option on the PROC GPLOT statement tells SAS to use the same axis scaling for all graphs. This helps keep the vertical lines on both graphs lined up correctly. Furthermore, the NOTE statement places the vertical axis label on the graph. The WHERE statement in the procedure selects only those observations which are enzyme readings for a particular subject.

The second GPLOT procedure produces the upper portion of the graph. Once again, it specifies the horizontal and vertical reference lines that were requested by the FDA, and the title of the graph is generated in this procedure. The graph is again suppressed from printing and is written to the temporary graph catalog (GRAFCAT) to be used later.

The PLOT statement has the peculiar form VAL2*DAY=5. The 5 tells SAS to use the fifth symbol statement (SYMBOL5) for the plot. Notice that SYMBOL5 is identical to SYMBOL4 except that the height in SYMBOL5 is 4 units, whereas in SYMBOL4 it is only 3 units. The larger height is necessary because when the upper portion of the graph is scaled to 40 percent of the output area, the symbols appear smaller if a height of 3 units is used.

    PROC GPLOT DATA=GRAF GOUT=GRAFCAT UNIFORM;
       TITLE1 C=BLACK H=7 'Liver Enzyme Summary';
       LABEL SUBJECT = 'PATIENT ID'
             THER    = 'THERAPY';
       FORMAT SEX $SEXFMT. THER $THERFMT.;
       PLOT (VAL2)*DAY=5 / VAXIS = AXIS3
                           HAXIS = AXIS4
                           VREF  = (1.8)
                           LVREF = 2
                           HREF  = (0 &LAST)
                           LHREF = 1;
       NOTE LANGLE=90 MOVE=(0.25 IN, 2 IN) H=3 C=BLACK F=COMPLEX
            'Bilirubin (mg/dl)';
       BY SUBJECT AGE SEX THER;
       WHERE SUBJECT="&SUBJECT";
    RUN;

Note that the variables are labelled and formatted in this procedure. This allows the BY statement to print, at the top of the graph, the subject's identification number, age, sex and therapy.
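The $SEXFMT. and $THERFMT. formats used in the FORMAT statement must exist before the procedure runs, but their definitions are not listed in the paper. A minimal sketch, assuming only the code-to-label mappings described earlier for the SEX and THER variables, might look like this:

```sas
/* Sketch only: labels follow the mappings described in the text. */
proc format;
   value $sexfmt  'F' = 'Female'
                  'M' = 'Male';
   value $therfmt 'A' = 'Group A'
                  'B' = 'Group B';
run;
```

Because the formats are applied in the procedure that has the BY statement, the BY line at the top of the graph shows the formatted values rather than the raw codes.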
The macro has now produced two graphs but still has not produced any hard copy output. Following the GPLOT procedures in the macro, a PROC GREPLAY is executed using the two graphs that were stored in the temporary catalog. The GREPLAY procedure, shown below, creates a template (TWO) with two panels: the upper 40 percent of the output area and the lower 60 percent. The NOFS option tells SAS to run the procedure in line mode rather than in its full-screen windows. The DES option in the TDEF statement simply labels the template as STACK4060.
    GOPTIONS DISPLAY;
    PROC GREPLAY IGOUT=GRAFCAT TC=TEMPCAT NOFS;
       TDEF TWO DES='STACK4060'
          1 / LLX=0   LLY=60   ULX=0   ULY=100   /* Upper */
              URX=100 URY=100  LRX=100 LRY=60
          2 / LLX=0   LLY=0    ULX=0   ULY=60    /* Lower */
              URX=100 URY=60   LRX=100 LRY=0;
       TEMPLATE TWO;
       TREPLAY 1:GPLOT1 2:GPLOT;
       DELETE _ALL_;
    RUN;
    QUIT;

Notice that prior to the GREPLAY procedure, a GOPTIONS DISPLAY statement tells SAS to allow output to the screen or to any hard copy device that was specified. The TREPLAY statement tells the procedure to print the graphs in the template. GPLOT is the default SAS name for the first graph produced by GPLOT and GPLOT1 is the default for the second, so the bilirubin graph goes in the upper panel and the x UNL graph in the lower panel. Note that after the TREPLAY, a DELETE _ALL_ statement tells SAS to delete all catalog entries which contain graphs. This allows the macro to be executed multiple times, as it was for all patients who had elevated liver enzymes.

The liver enzyme profiles are one example of the power of SAS and, in particular, of SAS/GRAPH. It was initially thought that to obtain graphs like the one shown in Appendix A, the data would have to be downloaded and brought into a desktop package. SAS/GRAPH gives the user the flexibility to perform such tasks with minimal code, which suggests it is a useful tool for meeting most, if not all, graphical needs.

References:

SAS/GRAPH Software: Reference, Volume 1, Version 6, First Edition, SAS Institute, Inc., 1990.
SAS/GRAPH Software: Reference, Volume 2, Version 6, First Edition, SAS Institute, Inc., 1990.
SAS Guide to Macro Processing, Version 6, Second Edition, SAS Institute, Inc., 1993.
SAS Language: Reference, Version 6, First Edition, SAS Institute, Inc., 1992.

Acknowledgement:

A special thanks to Dr. Robert Northington for his help in proofreading and refining this paper.

Author Contact:

Erik S. Larsen
Price Waterhouse LLP
1301 K Street NW, 800W
Washington, DC 20005-3333
Erik_S._Larsen@notes.pw.com
Phone: (202) 414-1443
[Appendix A figure: "Liver Enzyme Summary" for patient XXXX-YYY, with the patient ID, age, sex and therapy printed beneath the title. Upper panel: Bilirubin (mg/dl), vertical axis from 0.0 to 2.5. Lower panel: SGOT/SGPT/Alk Phos x UNL, with a legend listing Alkaline Phosphatase, SGOT/AST, SGPT/ALT and Total Bilirubin. Shared horizontal axis: DAY, from -50 to 400 in steps of 50.]