Graphics #1. R Graphics Fundamentals & Scatter Plots

In this lab, you will learn how to generate customized, publication-quality graphs in R. Working with R graphics is a stepwise process. Rather than customizing a default graph, you start with a blank canvas and then add the elements of the graph that you want. You may layer multiple graphs on top of each other (e.g. with different axis scales), or place them next to each other.

We will focus on scatter plots in this lab because they are well suited to presenting multivariate data (see the second part of this lab). They are also the essential plot type for all kinds of multivariate ordinations and bi-plots. Rather than relying on the default output of multivariate procedures, you will learn to select the elements that you want to display and to customize the graph with symbols, colors, and sizes.

G1.1. Import and view scatter plot data

Set yourself up as usual: (1) copy an empty workspace (e.g. StartR.Rdata) from a previous lab into your working directory (e.g. C:/Lab3/), (2) open R by double-clicking StartR.Rdata, (3) in R, from the menu, create and save a new script file (e.g. Lab5.r).

Enter the dataset below in Excel and save it as a CSV file (e.g. Scatter.csv), or download the dataset from the course website.

    ID  SPEC   ECOSYS   DBH   VOL   DENSITY  AGE
    1   Spec1  Ecosys3  11.5  1.09  0.55     23
    2   Spec2  Ecosys1  5.5   0.52  0.74     24
    3   Spec1  Ecosys3  11    1.05  0.56     27
    4   Spec2  Ecosys2  7.6   0.71  0.71     23
    5   Spec2  Ecosys3  10    0.95  0.63     22
    6   Spec1  Ecosys2  8.4   0.78  0.63     29
    7   Spec2  Ecosys2  8.4   0.77  0.64     21
    8   Spec1  Ecosys2  9     0.87  0.6      27

Now let's import the data with this familiar code.

    dat1=read.csv("Scatter.csv")  # import data
    head(dat1)                    # check if it imported correctly
    fix(dat1)                     # view the entire dataset if you like, then close the table
    attach(dat1)                  # sets this dataset as default

As a reminder, the basic plot command has the syntax plot(x,y) or plot(y~x), which will give you the same output. The ~ always means "as a function of". R is case-sensitive, so the column names must be typed in capitals, exactly as they appear in the table.
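The two call forms can be compared side by side. The following is a minimal sketch that types two columns of the table directly into a data frame, so it runs without the CSV file. The name dat is chosen for this example only, and the data= argument is used here in place of attach().

```r
# Two columns of the lab dataset, entered by hand so the sketch is self-contained
dat <- data.frame(
  VOL     = c(1.09, 0.52, 1.05, 0.71, 0.95, 0.78, 0.77, 0.87),
  DENSITY = c(0.55, 0.74, 0.56, 0.71, 0.63, 0.63, 0.64, 0.60)
)

plot(dat$VOL, dat$DENSITY)        # plot(x, y): x first, then y
plot(DENSITY ~ VOL, data = dat)   # plot(y ~ x): y as a function of x
```

Both calls should place the same points in the same positions; only the default axis labels differ.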
Let's try:

    plot(DENSITY~VOL)

Right-click on the graph and choose Copy as metafile, then paste the graph into Word. Then click on the graph in Word, grab it by a corner, and reduce it to ½ of the page width, an appropriate size for the information it contains.
Does it look like the graph on the right? Font sizes and graph elements are far too small for a thesis- or publication-quality graph. We have to customize font sizes, symbol sizes, etc.

G1.2. Setting the global graphics parameters

An easy way to control font and symbol sizes and the overall look of your graphs is to set a number of graphics parameters that apply to all subsequent plots. First, we'll create a clean slate with the graphics.off() command. This closes all previous graphics windows and resets all graphics parameters to the default. Next, you specify the graphics window size with the windows() command. For a simple square scatter plot, a canvas size of 5 x 5 inches is a good choice.

Then, we may want to specify a few other global parameters: cex controls the size of all graph elements (normally keep the default of 1, but try some values between 0.7 and 1.3). ps is a global control of the font size of all text elements (the default of 12 is fine, but you can try some values between 8 and 16). The parameter family selects the font type (try "sans", "serif", or "mono"), and mar specifies the size of the margins in lines of text (bottom, left, top, right). The default is c(5, 4, 4, 2) + 0.1, but try 5 for the bottom and left margins, which hold the axis labels, and 3 for the top and right margins, where we don't have labels:

    graphics.off()
    windows(width=5, height=5)
    par(cex=1, ps=12, family="sans", mar=c(5,5,3,3))
    plot(DENSITY~VOL)

Right-click the new graph and choose Copy as metafile, then paste the graph into Word. Again, reduce the graph to ½ of the page width. This looks a bit better now. Effectively, we just reduced the canvas size, and that makes all graph elements appear relatively larger. Controlling your canvas size is one of the most important tools to scale your final graphs. This global scaling is much preferred over controlling individual graph elements.

So, that is the basic set-up. It's a good idea to always run these three lines (graphics.off(), windows(), and par()) at the beginning and again before every subsequent plot.
Highlight everything from the graphics.off() function to your plot command when creating a plot. In the next sections, I do not repeat this code, but I assume that you always execute these three lines as well.
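Note that windows() exists only in R for Windows. On other systems, a comparable set-up can be sketched with dev.new(), which opens the platform's default graphics device and passes width and height (in inches) on to it. This is an assumption about your set-up rather than part of the lab instructions, and some front ends may handle the requested size differently. The two data columns are typed in directly so the sketch runs on its own.

```r
# Two columns of the lab dataset, entered by hand for a self-contained example
VOL     <- c(1.09, 0.52, 1.05, 0.71, 0.95, 0.78, 0.77, 0.87)
DENSITY <- c(0.55, 0.74, 0.56, 0.71, 0.63, 0.63, 0.64, 0.60)

graphics.off()                    # close all open graphics devices
dev.new(width = 5, height = 5)    # 5 x 5 inch canvas on the default device
par(cex = 1, ps = 12, family = "sans", mar = c(5, 5, 3, 3))
plot(DENSITY ~ VOL)
```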
G1.3. Building the graph with individual elements

Graphics customization in R works by first removing the items that you want to change, and then adding them back one at a time. We want to customize the data points (gone with type="n"), the axes (gone with axes=FALSE), and the axis labels (gone with ann=FALSE). Instead of removing both axes, you can also get rid of the x and y axes individually (xaxt="n" or yaxt="n"). You can define the exact extent of your coordinate system with xlim and ylim (otherwise the scale is determined by the range of each data series). The following statement leaves us with a completely blank canvas. You would only do this if you want to customize absolutely everything, which we will do for the purpose of this exercise:

    plot(DENSITY~VOL, type="n", axes=FALSE, ann=FALSE, xlim=c(0.4,1.2), ylim=c(0.4,0.8))

First, let's bring the axes back in. Axis 1 is the x-axis, axis 2 is the y-axis, and you can add axes 3 and 4 at the top and right if you need to. at specifies where you want your tick marks. You can create this vector with a seq command (from, to, interval) or simply list the numbers inside a vector c(). tcl specifies the tick mark length. Negative numbers give you tick marks to the outside. Duplicate the axis command with positive and negative tick mark lengths to have the tick marks cross the axis. las (values between 0 and 3) rotates your tick mark labels, which we do for the y-axis.

    axis(1, at=seq(0.4, 1.2, 0.2), tcl=-0.3)
    axis(2, at=c(0.4, 0.6, 0.8), tcl=-0.3, las=1)

You can modify your tick mark labels, which may come in handy if you want to replace, say, 1, 2, 3 with Jan, Feb, Mar, and you can place the labels between tick marks. Try these two options instead of the axis 1 command above:

    axis(1, at=c(0.4, 0.8, 1.2), labels=c("Jan","Feb","Mar"), tcl=-0.3)

or:

    axis(1, at=c(0.4, 0.8, 1.2), labels=c("","",""), tcl=-0.3)
    axis(1, at=c(0.6, 1), labels=c("Jan","Feb"), tcl=0)

Let's add a title and axis labels.
You can get special characters by holding down the Alt key on the keyboard, entering the character code on the numeric keypad, and then releasing the Alt key. Alt+0179 gets you ³. Search online for the three-digit ASCII code or the four-digit Windows Alt code of any other symbol or special character. Alternatively, you can use expressions for symbols, math, super- and subscripts, etc.

    title(ylab="Density (g/cm³)", xlab="Volume (m³)", main="Wood")

or:

    title(ylab=expression(italic(d)[wood]~(g/cm^3)), xlab=expression(sqrt(Volume~(m^3))), main="Wood")

You can control the font type and size of the labels and titles with font.lab, font.main, cex.lab, and cex.main. Font options are: 1=Regular, 2=Bold, 3=Italic, 4=Bold Italic, and cex typically has a useful range from 0.7 to 1.3 (+/- 30%). Try adding cex.main=0.8 to the title statement above.

Finally, let's get the data points back with a points() command. pch specifies the symbol and cex controls the size of the symbol. Try: "o", "O", "*", "+", numbers 15-20 for solid symbols, and numbers 21-25 for symbols with a background (see the reference sheet on the next page).

    points(DENSITY~VOL, pch=16, cex=1)

You can add data with more points() commands. The data may come from the same table or from different tables that you import. The code below creates some more data and adds it to the graph.

    VOL2=c(0.5,0.7,0.8,0.9)
    DENSITY2=c(0.62,0.58,0.54,0.52)
    points(DENSITY2~VOL2, pch=21)
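The steps of this section can be collected into one script. The following is a minimal sketch that types the data in directly so it runs without the CSV file. Logical values are written out as TRUE and FALSE, and the axis labels use expression() so that the superscripts do not depend on the keyboard or file encoding.

```r
# Data from the lab table, plus the second series created above
VOL      <- c(1.09, 0.52, 1.05, 0.71, 0.95, 0.78, 0.77, 0.87)
DENSITY  <- c(0.55, 0.74, 0.56, 0.71, 0.63, 0.63, 0.64, 0.60)
VOL2     <- c(0.5, 0.7, 0.8, 0.9)
DENSITY2 <- c(0.62, 0.58, 0.54, 0.52)

# 1. Blank canvas: no points, no axes, no annotation
plot(DENSITY ~ VOL, type = "n", axes = FALSE, ann = FALSE,
     xlim = c(0.4, 1.2), ylim = c(0.4, 0.8))

# 2. Axes with outward tick marks; horizontal labels on the y-axis
axis(1, at = seq(0.4, 1.2, 0.2), tcl = -0.3)
axis(2, at = c(0.4, 0.6, 0.8), tcl = -0.3, las = 1)

# 3. Title and axis labels
title(ylab = expression(Density ~ (g/cm^3)),
      xlab = expression(Volume ~ (m^3)),
      main = "Wood", cex.main = 0.8)

# 4. Data points: solid circles for series 1, open circles for series 2
points(DENSITY ~ VOL, pch = 16, cex = 1)
points(DENSITY2 ~ VOL2, pch = 21)
```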
We can also add a legend, now that we have two data series. The first two numbers specify the x and y coordinates where we want to place the top-left corner of the legend. Then, we repeat our symbol types (pch), set the symbol and text size (cex), and add a description (legend).

    legend(0.9, 0.8, pch=c(16,21), cex=0.8, legend=c("Site 1","Site 2"))

Voila, this is a nicely customized scientific graph! Just as a note, you can also keep the box and/or add grid lines with abline at particular horizontal (h) and vertical (v) positions. You can control all line elements with line type, lty (1 to 6), and line weight, lwd (try 1 to 4). Try this code (which does make this particular plot a little too busy):

    box()                                      # outer box
    abline(v=0.8, h=0.6, lty=2)                # one vertical and one horizontal line
    abline(v=c(0.6, 1), h=c(0.5, 0.7), lty=2)  # multiple lines

OK, this is the complete toolkit for building publication-quality black-and-white graphs in R. The same principles apply to all other graph types. Keep this customization cheat-sheet handy:

    las = 0, 1, 2, 3 (orientation of the axis tick labels: 0 = parallel to the axis, 1 = always horizontal, 2 = perpendicular to the axis, 3 = always vertical)
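A reference sheet for symbols and line types can also be drawn in R itself. The following is a minimal sketch using only base graphics. The layout coordinates and the names pch_codes and lty_codes are arbitrary choices for this example.

```r
# Empty canvas; the coordinate ranges are layout choices for this sketch only
plot(1, type = "n", axes = FALSE, ann = FALSE,
     xlim = c(0, 26), ylim = c(0, 9))

# Row of plotting symbols pch = 0 to 25, each labelled with its number.
# bg sets the fill color, which only affects symbols 21 to 25.
pch_codes <- 0:25
points(pch_codes, rep(8, 26), pch = pch_codes, bg = "grey", cex = 1.2)
text(pch_codes, rep(7.2, 26), labels = pch_codes, cex = 0.6)

# Line types lty = 1 to 6, one horizontal line each
lty_codes <- 1:6
segments(x0 = 5, y0 = lty_codes, x1 = 25, y1 = lty_codes, lty = lty_codes)
text(0, lty_codes, labels = paste("lty =", lty_codes), adj = 0, cex = 0.8)
```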
G1.4. Adding functions, text and arrows

You can add any function with the curve command (but you have to develop the functions first, which we'll learn later). There are arrow commands (with the syntax: from x, from y, to x, to y) to point things out. Similarly, text commands with the syntax (x, y, "text") let you add notes or equations. Below is some sample code to try out.

    plot(DENSITY~VOL)

1.  curve(-0.32*x+0.91, add=TRUE, lty=2)
    text(0.9, 0.7, "y=-0.32*x+0.91")

2.  curve(0.585*x^-0.4, add=TRUE, lty=2)
    text(0.9, 0.7, "y=0.58x^-0.4")
    arrows(0.85,0.69, 0.8,0.65, length=0.1)

3.  arrows(0.85,0.69, 0.8,0.65, length=0)

4.  plot(DENSITY~VOL, log="x")
    curve(-0.32*x+0.91, add=TRUE, lty=2)
    text(0.9, 0.7, "y=-0.32*x+0.91")
    arrows(0.85,0.69, 0.8,0.66, length=0.1)

G1.5. Log scales

Below is a quick example for exploring log scales a bit more without importing a new dataset (1:100 simply makes a graph of 100 x/y points from 1/1 to 100/100):

1.  plot(1:100, xlab="x", ylab="y")

2.  plot(1:100, log="y", xlab="x", ylab="y")

3.  plot(1:100, log="y", yaxt="n", xlab="x", ylab="y")
    axis(2, at=c(1,10,100))

4.  plot(1:64, log="x", xaxt="n", xlab="x", ylab="y")
    axis(1, at=c(1,2,4,8,16,32,64))

5.  plot(1:100, log="xy", axes=FALSE, xlab="x", ylab="y")
    axis(1, at=c(1,10,100))
    axis(2, at=c(1,10,100))
    box()
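The tick positions in the log-scale examples can also be generated rather than typed. The following is a minimal sketch that extends example 5. The unlabelled minor ticks are an addition for illustration, not part of the lab instructions, and the names ticks and minor are chosen for this example.

```r
x <- 1:100
plot(x, x, log = "xy", axes = FALSE, xlab = "x", ylab = "y")

# Labelled major ticks at powers of ten: 1, 10, 100
ticks <- 10^(0:2)
axis(1, at = ticks)
axis(2, at = ticks, las = 1)

# Shorter, unlabelled minor ticks at 2 to 9 and 20 to 90
minor <- c(2:9, seq(20, 90, 10))
axis(1, at = minor, labels = FALSE, tcl = -0.2)
axis(2, at = minor, labels = FALSE, tcl = -0.2)
box()
```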
G1.6. Exporting data for presentations, documents or publications

There are several options to save your graphs:

If you want a quick record of your graph, R allows you to save a low-resolution image for informal record keeping: save the file in the lossless PNG format (not in JPG format, which is more suitable for photos than for graphics). Those graphs are also good enough for presentations and websites. Click on the header bar of your graph, then choose File > Save as > PNG. Or you can run the command savePlot("myplot.png", type="png") while the plot is open.

A better option to generate high-quality vector graphics for Word and PowerPoint documents is via Windows metafiles. Right-click your graph, choose Copy as Metafile, then paste into Word. Alternatively, you can save Enhanced Windows Metafiles with savePlot("myplot.emf", type="emf") and drag the files into Word and PowerPoint. Because vector graphics are a set of instructions rather than an image, the graph is sometimes not rendered correctly and may show odd fonts or wrong symbols. If that's the case, see the alternative for working with high-resolution image files below.

Another good vector graphics format, often preferred by publishing companies and compatible with professional layout programs, is PDF. This format is also preferred by scientific journals for publications. Save through the menu, or execute savePlot("myplot.pdf", type="pdf"). Publishers will use the file to transfer vector graphics directly into the paper. Vector graphics combine maximal quality with the smallest possible file size.

If you want better-quality images for presentations and websites, it helps to first create a PDF and subsequently create a screenshot from the PDF. Display your graph in Adobe Reader at the size that you would like the final image to have, hit the PrtScn key, open Microsoft Paint, hit Ctrl-V to paste the screenshot, mark the area of the graph with the Select tool, then hit Crop, and finally save your image in PNG format.
This takes advantage of R's superb PDF rendering engine. You will get nicer graphics for your presentations and websites with this method (compare this with the first option side by side).

Finally, if you have Adobe Acrobat Professional, you can generate high-resolution image files for Word documents by saving PNG files at high resolution: File > Save as > Image > PNG, which you can embed into a Word document. Don't forget to turn off automatic compression in Word first: File > Save as > Tools > Compress Pictures > Options > uncheck "Automatically perform compression". That may be an alternative if the vector graphics don't work as desired.

G1.7. Touching-up graphics in external programs

You can touch up your Windows Enhanced Metafile graphs in PowerPoint. Import the graph into PowerPoint, right-click, choose Ungroup > Yes, repeat the Ungroup a second time, and delete the outer frame to get access to the graph elements below. You can now edit the text and move, color, or delete any element.

The professional option is to save your R graphics as PDF files and then edit them in Adobe Illustrator, which is installed on the lab computers. Again, before you can do anything you have to ungroup, which is done with <CTRL><A> to select all elements, then choosing Object > Clipping Mask > Release from the menu. After that it's much like PowerPoint, but with more options.

Finally, for low-resolution web and presentation purposes, you can use Microsoft Paint (or more advanced image editors like Adobe Photoshop), but these only work with images, not vector files. Open the low- or high-resolution PNGs, and add, copy, paste, or move elements such as legends, labels, titles, etc. The fill tool is great for re-coloring elements, and the eraser is handy for removing superfluous labels or elements, which would be hard to program in R.
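As a scripted alternative to the menu and savePlot() routes described in G1.6, base R can also write a plot straight to a file: open a file device, draw, and close the device with dev.off(). The following is a minimal sketch. The file name is a placeholder, the canvas size repeats the 5 x 5 inch set-up from G1.2, and the data columns are typed in so the example runs on its own.

```r
# Two columns of the lab dataset, entered by hand for a self-contained example
VOL     <- c(1.09, 0.52, 1.05, 0.71, 0.95, 0.78, 0.77, 0.87)
DENSITY <- c(0.55, 0.74, 0.56, 0.71, 0.63, 0.63, 0.64, 0.60)

pdf("myplot.pdf", width = 5, height = 5)   # open a PDF file device, size in inches
par(cex = 1, ps = 12, family = "sans", mar = c(5, 5, 3, 3))
plot(DENSITY ~ VOL, pch = 16)
dev.off()                                  # close the device to finish writing the file
```

The file is only complete after dev.off() has been called. The png() device works the same way for image files, with width and height given in pixels by default.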