Dr Raffaella Calabrese, Essex Business School

1. GETTING STARTED

Introduction to R

R is a powerful environment for statistical computing which runs on several platforms and is available free of charge. It is under active development by a group of statisticians called 'the R core team', with a home page at www.r-project.org. The current version is 3.1.2 and was released in October 2014. New releases occur every six months or so, often around April and October. You will find binaries for Windows, Mac OS X and Linux in the Comprehensive R Archive Network.

R users can program their own code. Hence, R is useful if you need to implement a new statistical method that is not available in SPSS or Excel. R is interactive, so it gives the user immediate feedback. It is highly versatile and ideal for statistical simulations such as Monte Carlo, jackknifing and bootstrapping. Finally, the R source code is available for both inspection and modification where need be. On the other hand, only a few tasks can be accomplished through the menu system, so it requires some effort to learn to use R.

1.1 R installation

The first step to install R is to go to the site called the Comprehensive R Archive Network (CRAN). It is a collection of sites which carry identical material, consisting of the R distribution(s), the contributed extensions, documentation for R, and binaries. Type its full address, http://cran.r-project.org/, or just type CRAN into Google and you will be taken to the site.

Under CRAN select Mirrors -> UK (http://star-www.st-andrews.ac.uk/cran/); more generally, find the mirror nearest you (http://cran.r-project.org/mirrors.html) and follow the links. Under Download and Install R click Windows and then base -> R-3.1.2-win32.exe (this will change depending on the latest version).

The Windows installer is fairly easy to use and, after agreeing to the license terms, lets you choose which components you want to install.
Additional packages can always be installed directly from R at a later time.

1.2 Loading and installation of R packages

Every function in R is in a package, and packages come with documentation. In addition to the packages included in the initial installation, many useful contributed packages are available from CRAN. You can view the packages that are installed on your computer by entering the command library(). Type

>library()

To check the functions available in a certain package, for example stats, type

>library(help="stats")

This will open a help window containing one-line descriptions of all functions in the stats package.

When you want to do some specific task, for example the analysis of dose-response curves, you will need to install the package that does the intended task. In this case you install the drc package. The easiest way to install an additional package (e.g. drc) is through the Packages menu of the R graphical user interface (RGUI). From the menu select Packages -> Install package(s). A window opens asking you to pick a CRAN mirror site; choose the one nearest you (in our case, UK (St Andrews)). After this, a window will open with the various packages that are available on CRAN. Select the package that you want to install, click the OK button and the package will be installed (see Figure below).

It is not enough to just install the package. You have to load it in memory to make it available for use. You accomplish this by entering the command library("nameofpackage") at the R console, where nameofpackage is replaced with drc in our case. Type in

>library(drc)

or from the menu bar select Packages -> Load package -> drc
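The menu steps above also have a console equivalent. The following is a minimal sketch and not part of the original course material: install.packages() downloads a package from CRAN (by default together with the packages it depends on), and requireNamespace() checks whether a package is installed before library() loads it. The installation line is commented out because it needs an internet connection.

```r
# Console equivalent of Packages -> Install package(s).
# Uncomment to run; it needs internet access and a CRAN mirror.
# install.packages("drc")

# Check whether the package is installed before trying to load it.
pkg <- "drc"
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  # character.only = TRUE lets library() take the name from a variable
  library(pkg, character.only = TRUE)
} else {
  message("Package '", pkg, "' is not installed yet; install it first.")
}
```

Checking first avoids stopping a script with an error when the package is missing.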
The figure below shows what happens when drc is loaded. Notice that there is an error message (highlighted). The reason is that drc requires some other packages to have been installed for it to run. Here, plotrix is one of them, in addition to lattice, MASS and nlme. You will therefore need to install plotrix, using the installation procedure described above. In general, many packages do not work on their own and require other packages to have been installed first. This will be clear from the messages shown whenever you attempt to load a package, so watch out for them. To see the packages currently loaded into memory, type in

>search()

1.3 Running the Software

To run R click on Start -> All programs -> R-3.1.2. The R icon was created by the setup program during installation. What appears when R is started is called the R graphical user interface (RGUI). R issues a prompt where it expects the user to issue a command. The default prompt is > (see Figure below).
When R starts you will see a window called the R Console. This is where you type your commands and see the text results. (Graphs appear in a separate window.) The first thing that appears is the version number of R and the date of the version. Warning: some packages work only with recent versions of R.

1.4 Changing your workspace

The first thing you probably want to check is your working directory. You can find this from R by issuing the command getwd() at the R prompt. To change the directory type in

>setwd("c:/rcourse")

(this changes the working directory to the folder rcourse on drive c) or from the menu click on File -> Change dir
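The same steps can be tried at the console. This is only an illustrative sketch: tempdir() is used as a stand-in for a folder that is guaranteed to exist, whereas in the course you would use your own folder, for example "c:/rcourse".

```r
# Remember where we are so that we can come back later.
old_dir <- getwd()

# tempdir() is a stand-in for your own folder, e.g. "c:/rcourse".
new_dir <- tempdir()
setwd(new_dir)

# Confirm the change and list the files R can now see directly.
print(getwd())
print(list.files())

# Restore the original working directory.
setwd(old_dir)
```

On Windows, write paths with forward slashes ("c:/rcourse") or doubled backslashes ("c:\\rcourse"), because a single backslash is an escape character inside an R string.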
1.5 Getting help in R and also online help

Online documentation for most of the functions and variables in R exists and can be printed on screen. The easiest way of getting help in R is to click on the Help button on the toolbar of the R graphical user interface (RGUI). R comes with several official manuals. The following manuals for R are downloadable as PDF files or can be directly browsed as HTML: An Introduction to R, The R language definition, Writing R Extensions, R Data Import/Export, R Installation and Administration, R Internals and The R Reference Index and FAQs. These should be your primary source of information.

You should also know about the help function, which opens a help window. This function can be called with arguments to obtain help about specific features of R, for example

>help(write.table)

A shortcut for help on a topic is a question mark followed by the topic, as in

>?plot

You obtain the screen below
At other times, you only know the subject for which you want help and not the function (e.g. anova). In this case use the help.search function and enclose your query in double quotes as follows:

>help.search("anova")

or from the menu bar select Help -> Search help (enter anova in the Question dialogue box).

The find function tells us which package contains the object or item you are looking for. For example, if you want to find plot, you type

>find("plot")

(remember the double quotes around plot). The find function only searches the packages that are currently loaded into memory, so the item that you are looking for must be in one of those packages.

If you know the name of a package and you need further information concerning the package, you type help(nameofpackage). For example, I know there is a package called stats and I want more information about it. At the R console type

>help(stats)

and select index to get further details on the package (see Figures below).
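The help commands of this section can be collected in a short script. This is a sketch under two assumptions: the calls that open a help window are wrapped in interactive() so that the script also runs outside the R Console, and apropos() is an extra function not mentioned above (it lists loaded objects whose names contain a given string).

```r
# These calls open help windows, so run them only in an interactive session.
if (interactive()) {
  help("write.table")     # help page for one function (same as ?write.table)
  help.search("anova")    # search the installed help pages for a subject
}

# Which loaded package contains anova()?
where_anova <- find("anova")
print(where_anova)

# Which loaded objects have "anova" somewhere in their name?
matches <- apropos("anova")
print(matches)
```

find() and apropos() return ordinary character vectors, so their results can be stored and inspected like any other R object.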
1.6 Importing data

We usually measure or observe more than one attribute from a study unit, also called a sampling unit. Therefore we organise data sets into collections of variables (vectors). In R, such collections are called data frames. You generate a data frame by combining variables, whereby each variable becomes a separate column. For a data frame to represent the data properly, the sequence in which observations appear in the vectors (variables) must be the same for each vector, and each vector should have the same number of observations. For example, the first observations from each of the vectors to be included in the data frame must represent observations collected from the same sampling unit.

First, recall that R uses the working directory (or the startup directory). Your data file must be located in the working directory for it to be accessible by name alone. Find out what your working directory is and change it if necessary, using the getwd() and setwd() commands. If the file you want to read is not in the working directory, you will need to specify its complete path.

In Excel, you can create a comma separated values (csv) file. To read a csv file, use the command read.csv. Suppose we have a csv file named Spatial located in the c:\rworkshop\ directory. To read it into R, issue the commands

>mydata<-read.csv(file="c:/rworkshop/spatial.csv", header = TRUE)
>mydata # List the data for visualization

The data will appear as a spreadsheet with rows as cases and columns as variables. The columns will have names equal to those in the header of the text data file. The number of cases will equal the total number of lines excluding the header.

If you want to import an SPSS file, you need the package foreign

>library(foreign)
>data<-read.spss("spatial.sav", to.data.frame=TRUE)

If you want to export your R dataset into Excel for viewing/editing, you use the write.table (for a text file) or write.csv (for a csv file) commands

>write.csv(mydata, file="c:/rworkshop/csvmdata.csv")

2. IS SPATIAL MEMORY AGE-RELATED?

There is a commonly held belief that the younger you are, the more easily you can absorb new material. In fact we are used to parents or grandparents being forgetful, having a 'senior moment'. In order to investigate this for a particular kind of memory, spatial memory, an investigation has been carried out to test the spatial memory of two groups of people, designated as Elderly (over 65) and Young (20-25). In the test, 18 objects were arranged randomly on a 10 x 10 grid. Participants were allowed to examine the positions of the objects for as long as they liked and were subsequently asked to recall these positions by replacing the objects on the grid. Their study times and two measures of performance (the number positioned correctly and the error in positioning) were recorded. We analyse the dataset Spatial.
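Before working with the real file, the import and export commands of Section 1.6 can be rehearsed on a small made-up data frame. The sketch below is purely illustrative: the column names and values are invented and are not taken from the Spatial dataset, and a temporary file stands in for a path such as "c:/rworkshop/spatial.csv".

```r
# Made-up data (NOT the Spatial dataset); column names are hypothetical.
toy <- data.frame(
  group     = c("Elderly", "Young", "Elderly", "Young"),
  studytime = c(120, 95, 150, 80),
  correct   = c(10, 15, 9, 16)
)

# Export to a csv file; a temporary file stands in for a real path.
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, file = csv_path, row.names = FALSE)

# Import it again, treating the first line as column names.
mydata <- read.csv(file = csv_path, header = TRUE)

# Inspect the result: structure, first rows and number of cases.
str(mydata)
print(head(mydata))
print(nrow(mydata))
```

Setting row.names = FALSE stops write.csv from adding an extra first column of row numbers, which would otherwise reappear as an unnamed variable when the file is opened in Excel or read back into R.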