SparkLines Using SAS and JMP

Kate Davis, International Center for Finance at Yale, New Haven, CT

ABSTRACT

Sparklines are intense, word-sized graphics for use inline with text or on a dashboard that condense a table of numbers into elegant quantitative visualizations. Dr. Edward Tufte introduced sparklines to the information visualization community in his book "Beautiful Evidence". This paper outlines how to construct pain-free sparklines in both SAS and JMP, including a complete introduction to their construction, the "banking to 45°" principle, and other construction rules. Examples in both JMP and SAS, including SAS macros, are presented.

INTRODUCTION

Sparklines are more than simply small graphics of a single variable. They are meant to be used inline in documents to convey quantitative information, not just in place of the traditional tabular numeric form, but to provide additional analytic knowledge of the underlying process. Because sparklines are analytic exhibits, brute-force machine generation of the graphics without critical review is not recommended. The examples here are restricted to time-series-compatible data of uniform interval with no missing measurements.

Sparkline construction follows three guiding principles: sparklines should be the same point size as the accompanying text and should have an appropriate width for a good aspect ratio; unintentional optical clutter should be removed; and the resolution should be of cartographic or typographic quality, usually 1200 dpi.

The process of generating sparklines for review is nearly painless in both SAS and JMP once the proper aspect ratio, colors, borders and resolution are defined for standard graphics procedures.

ASPECT RATIO

The aspect ratio, or shape parameter, of a graphic is the ratio of the height to the width. The aspect ratio for any quantitative graph should be chosen precisely to present data in the most objective fashion. As Cleveland [1993, p. 336] demonstrated with the Wolfer sunspot data set, the aspect ratio can change not only the initial interpretation of the data but also allow detection of trends that are not obvious at an arbitrary aspect ratio.

Figure 1. Plot of Sunspot Data, default JMP aspect ratio

The plot in Figure 1 seems to demonstrate high volatility in sunspot numbers over two centuries.

MATHEMATICS OF THE SHAPE PARAMETER

In a two-variable graph, each pair of consecutive data points $(x_{i-1}, y_{i-1})$ and $(x_i, y_i)$ determines one line segment of length $s_i$. The line segment $s_i$ is the hypotenuse of the right triangle formed by these points. The base is 1 unit by our assumption of a uniform interval, and the height is $y_i - y_{i-1}$, or dif(y) as a standard SAS/JMP formula. The number of interest is $\theta_i$, which in Figure 2 represents the angle opposite the dif leg.

Figure 2. Close-up of one data triangle

$\tan(\theta_i) = \mathrm{dif}(y_i)/1 = \mathrm{dif}(y_i)$, so $\theta_i = \arctan(\mathrm{dif}(y_i))$.

Clearly, the selection of the physical units for each logical x unit and the logical height will affect the actual value of $\theta_i$, so any aspect ratio should in some way produce optimal individual $\theta_i$.

BANKING TO 45°

Cleveland introduced banking to 45° as a way to choose the aspect ratio $\gamma$ so that the average $|\theta_i|$ is 45°. By using this principle, the overall average right triangle will be isosceles. There is no analytic solution to the summation of arctangents of absolute values, but many numerically attainable approximations have been offered for time series data. The most easily implemented is the median-absolute-slope criterion [Cleveland, 1988], which seems to work well in practice for sparklines. The compromise is to choose $\gamma$ so that the median absolute slope of the plotted segments equals 1. The approximation is:

$$\gamma^* = \frac{\mathrm{range}\{y_i\}}{n \cdot \mathrm{median}\{|\mathrm{dif}(y_i)|\}}$$

This is easily implemented for a sparkline of fixed point size. If the sparkline height is 12pt, then the width of the sparkline is 12pt/$\gamma^*$. In the sunspot example, $\gamma^* = 154.4/(176 \times 13.5) = 0.065$, which yields an optimal length of 185 points, and the sunspot graph is now an appropriate size for text.

CLUTTER FREE

The second principle is a clutter-free graphic. This simply means removing all extraneous lines and text that are not an integral part of the information, and choosing colors that allow the graphic to be fully integrated into the document. The first step to a clutter-free graphic is to remove all background colors, borders, and extra plot points and text. In the resulting sparkline, all borders and plot points have been removed, the background is transparent, and the line color is dark gray instead of black.

Another option is to use bars instead of plot lines. These bar sparklines are often called sparkbars.
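The median-absolute-slope approximation from the BANKING TO 45° section can be motivated with a short derivation. This is a sketch under stated assumptions rather than Cleveland's own argument: $\gamma$ denotes the aspect ratio (height over width), the names $w$, $h$ and $R$ are introduced here for the physical width, the physical height and the data range, and the horizontal extent is taken as approximately $n$ unit steps, as the paper's formula does.

```latex
\begin{align*}
\gamma &= \frac{h}{w}, \qquad R = \mathrm{range}\{y_i\}, \\
\text{physical run of one segment} &= \frac{w}{n}, \\
\text{physical rise of segment } i &= \frac{h}{R}\,\mathrm{dif}(y_i), \\
\text{physical slope of segment } i &= \frac{(h/R)\,\mathrm{dif}(y_i)}{w/n}
  = \gamma\,\frac{n\,\mathrm{dif}(y_i)}{R}, \\
\mathrm{median}_i \left| \gamma^*\,\frac{n\,\mathrm{dif}(y_i)}{R} \right| = 1
  \;&\Longrightarrow\;
  \gamma^* = \frac{R}{n \cdot \mathrm{median}\{|\mathrm{dif}(y_i)|\}}, \\
\text{sunspots: } \gamma^* &= \frac{154.4}{176 \times 13.5} \approx 0.065,
  \qquad \text{width} = \frac{12\,\text{pt}}{\gamma^*} \approx 185\,\text{pt}.
\end{align*}
```

A physical slope of magnitude 1 corresponds to a segment drawn at 45°, so setting the median absolute slope to 1 centers the typical segment on that angle.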

ADD SOME COLOR

Once the sparkline has been reduced to a simple graphic, colors can be reintroduced to emphasize certain statistics. Tufte has suggested that the starting and ending points of a sparkline should be represented by points colored green, and the high and low values by red points.

    %sparkline(dsn=sunspot, yvar=sunspots, xvar=year);
    %sparkline(dsn=sunspot, yvar=sunspots, xvar=year, anno=dots);
    %sparkbar(dsn=sunspot, yvar=sunspots, xvar=year);

HIGH RESOLUTION

The goal of creating sparklines is to include the visualization in the context of a wider analysis. The inline sparklines presented here were generated using JMP for Macintosh and simply copied and pasted, relying on the operating system's default handling of the graphical capabilities of both JMP and Microsoft Word, so the sparkline resolution matches the overall resolution of the document. These graphics can be generated using the SAS macros or JMP script snippets by setting the appropriate graphics options for the output method. The SAS/GRAPH procedures produce excellent graphics that can be used in web pages, desktop publishing documents and standard word processing documents.

CONCLUSIONS

With the appropriate preparation and attention to detail, both SAS/GRAPH and JMP provide an excellent platform to create and disseminate visual information in the form of sparklines.

REFERENCES AND LINKS

Tufte, Edward (1983). The Visual Display of Quantitative Information. Cheshire, Connecticut: Graphics Press.

Tufte, Edward (2006). Beautiful Evidence. Cheshire, Connecticut: Graphics Press. (http://www.edwardtufte.com)

Cleveland, William S. (1988). The Shape Parameter of a Two-Variable Graph. Journal of the American Statistical Association, Vol. 83, No. 402 (Jun., 1988), pp. 289-300.

Cleveland, William S. (1993). A Model for Studying Display Methods of Statistical Graphics. Journal of Computational and Graphical Statistics, Vol. 2, No. 4 (Dec., 1993), pp. 323-343.

Cleveland, William S. (1994). Visualizing Information. Summit, New Jersey: Hobart Press. (http://cm.bell-labs.com/cm/ms/departments/sia/wsc/)

Robbins, Naomi B. (2005). Creating More Effective Graphs. Hoboken: Wiley Interscience. (http://www.nbr-graphs.com)

ACKNOWLEDGMENTS

SAS and JMP are registered trademarks of the SAS Institute, Inc. of Cary, North Carolina. Thank you to Drs. Edward Tufte, William Cleveland and Naomi B. Robbins for their continued efforts to rid the published world of chartjunk.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Kate Davis
International Center for Finance
Yale School of Management
46 Hillhouse Avenue
New Haven, CT 06511
Email: Kate@Belisle.org
Web: icf.som.yale.edu

CODE AND SCRIPTS

JMP SCRIPT SNIPPETS

Default Plot

    // Overlay plot of the sunspot series with JMP's default frame size
    Overlay Plot(
        X( :Year ),
        Y( :Sunspot Number ),
        Sort X( 0 ),
        Y Axis[1] << {{Scale( Linear ), Format( "Best" ), Min( -5 ), Max( 160 ), Inc( 50 ), Show Minor Ticks( 0 )}},
        Separate Axes( 1 ),
        X Axis << {{Scale( Linear ), Format( "Best" ), Min( 1745 ), Max( 1930 ), Inc( 50 )}},
        Connect Points( 1 ),
        SendToReport(
            // Axis settings for the Y ("106") and X ("101") scale boxes
            Dispatch( {}, "106", ScaleBox, {Scale( Linear ), Format( "Best" ), Min( -5 ), Max( 160 ), Inc( 50 ), Show Minor Ticks( 0 )} ),
            Dispatch( {}, "101", ScaleBox, {Scale( Linear ), Format( "Best" ), Min( 1745 ), Max( 1930 ), Inc( 50 )} )
        )
    );

Banking to 45° Difference Formula

    // Column formula for the new column "Dif(Sunspot Number)":
    // absolute first difference of the series
    Abs( Dif( :Sunspot Number, 1 ) )

Create Summary Table

    // N, range and median absolute difference: the inputs to the aspect ratio
    Data Table( "Subset of Sunspots Data" ) << Summary(
        Group,
        N( :Sunspot Number ),
        Range( :Sunspot Number ),
        Median( :Name( "Dif(Sunspot Number)" ) )
    );

Plot with Correct Aspect Ratio

    Overlay Plot(
        X( :Year ),
        Y( :Sunspot Number ),
        Sort X( 0 ),
        Y Axis[1] << {{Scale( Linear ), Format( "Best" ), Min( -5 ), Max( 160 ), Inc( 50 ), Show Minor Ticks( 0 )}},
        Separate Axes( 1 ),
        X Axis << {{Scale( Linear ), Format( "Best" ), Min( 1745 ), Max( 1930 ), Inc( 50 )}},
        Connect Points( 1 ),
        SendToReport(
            // 185 x 12: banked width and text-sized height
            Dispatch( {}, "Overlay Plot", FrameBox, {Frame Size( 185, 12 )} )
        )
    );

Clutter Free

    Overlay Plot(
        X( :Year ),
        Y( :Sunspot Number ),
        Sort X( 0 ),
        Y Axis[1] << {{Scale( Linear ), Format( "Best" ), Min( -5 ), Max( 160 ), Inc( 150 ), Show Minor Ticks( 0 )}},
        Separate Axes( 1 ),
        Connect Thru Missing( 1 ),
        X Axis << {{Scale( Linear ), Format( "Best" ), Min( 1745 ), Max( 1930 ), Inc( 50 ), Show Minor Ticks( 0 )}},
        Connect Points( 1 ),
        Show Points( 0 ),  // hide the point markers, keep only the line
        :Sunspot Number( Connect Color( 1 ) ),
        SendToReport(
            Dispatch( {}, "106", ScaleBox, {Scale( Linear ), Format( "Best" ), Min( -5 ), Max( 160 ), Inc( 150 ), Show Minor Ticks( 0 )} ),
            Dispatch( {}, "101", ScaleBox, {Scale( Linear ), Format( "Best" ), Min( 1745 ), Max( 1930 ), Inc( 50 ), Show Minor Ticks( 0 )} ),
            // Banked frame size and a gray line in place of black
            Dispatch( {}, "Overlay Plot", FrameBox, {Frame Size( 185, 12 ), DispatchSeg( LineSeg( 1 ), {Line Color( "Gray" )} )} )
        )
    );

SAS MACROS

    %macro gamma(dsn=_last_, y=y, x=x, dsnout=dsnout);
      /* Creates a data set with the summary variables needed to calculate the aspect ratio */
      proc sort data=&dsn. out=raw;
        by &x.;
      run;

      data aspect;
        set raw;
        by &x.;
        vdot = dif(&y.);        /* first difference of the series */
        absvdot = abs(vdot);
      run;

      proc summary data=aspect nway noprint;
        var absvdot &y.;
        output out=bar min= max= median= / autoname;
      run;

      data &dsnout.;
        set bar;
        vrange = (&y._max - &y._min);
        gammastar = vrange / (_freq_ * absvdot_median);
      run;
    %mend gamma;

    %macro sparkline(dsn=, yvar=, xvar=, height=12, anno=NONE);
      %gamma(dsn=&dsn., y=&yvar., x=&xvar., dsnout=stats);

      /* Pass the aspect ratio, the y extremes and the banked width to macro variables */
      data _null_;
        set stats;
        call symputx('gamma', gammastar);
        call symputx('miny', &yvar._min);
        call symputx('maxy', &yvar._max);
        width = round(&height. / gammastar);
        call symputx('width', width);
      run;

      /* Default annotate data set, replaced below when dots are requested */
      data anno;
      run;

      %if %upcase(&anno.) = DOTS %then %do;
        data anno;   /* Create dots */
          set raw end=last;
          by &xvar.;
          retain function "SYMBOL" text "DOT" when "A" size &height.
                 xsys ysys '2' hsys '3';
          /* Green dots at the first and last points */
          if _n_ = 1 or last then do;
            x = &xvar.; y = &yvar.; color = "green"; output;
          end;
          /* Red dots at the high and low values */
          if &yvar. = &maxy. or &yvar. = &miny. then do;
            x = &xvar.; y = &yvar.; color = "red"; output;
          end;
        run;
      %end;

      goptions reset=all noborder vsize=&height.pt hsize=&width.pt;
      axis1 length=95 pct;
      symbol1 interpol=join value=none width=0.5 color=gray;

      proc gplot data=raw;
        plot &yvar.*&xvar. / noaxis noframe overlay annotate=anno haxis=axis1;
      run;
      quit;
    %mend sparkline;

    %macro sparkbar(dsn=, yvar=, xvar=, height=12);
      %gamma(dsn=&dsn., y=&yvar., x=&xvar., dsnout=stats);

      data _null_;
        set stats;
        call symputx('gamma', gammastar);
        width = round(&height. / gammastar);
        call symputx('width', width);
      run;

      goptions reset=all noborder vsize=&height.pt hsize=&width.pt;
      pattern value=solid color=gray;

      proc gchart data=&dsn.;
        vbar &xvar. / sumvar=&yvar. discrete noaxis noframe;
      run;
      quit;
    %mend sparkbar;