R Commander Tutorial


Introduction

R is a powerful, freely available software package for analyzing and graphing data. However, for somebody who does not use statistical software often, the big drawback of R is that it is command-line based and thus not very intuitive to use. For such users, R Commander might be a good alternative. R Commander is a software package that lets you run R from a graphical user interface, which makes analyzing and graphing your data in R a lot easier.

Objective

The objective of this tutorial is to give you a basic introduction to R Commander and to show how to use it to run basic statistics and create graphs.

1. Start the R Commander

Open R by either clicking on the R icon on your desktop or by navigating to R in your programs folder. Once you have opened R, go to Packages/Load Packages on the R menu bar and find Rcmdr in the R packages list (R packages are similar to software programs that have been written by different contributors for R). Highlight Rcmdr by clicking on it and click OK. R might give you a warning message. If so, just ignore it and click No. The R Commander console should now appear on your screen and you are ready to run some statistics and make some graphs in R.

2. Reading your data into R

After you come back from the field, your notebook contains a table of data recordings (location, cover, and soil moisture). Now you want to create a digital copy of your data. To do this, type the data table into Excel. Very important: make sure your column headings do not contain any spaces (e.g., write soil_moisture instead of Soil Moisture), since spaces confuse most statistics programs, including R.

To read your dataset into R Commander for statistical analysis, you have to save the data table on your hard drive as either a comma-delimited file (.csv, see figure below) or a tab-delimited file (.txt, see figure below). To save the file as a .csv or .txt file in Excel, go to Save As/Other Formats and select CSV (Comma delimited) (*.csv) or Text (Tab delimited) (*.txt) from the Save as type pull-down menu. Make sure you remember where you save the data on your computer so you can navigate to the dataset later on.

Now we are ready to read our data into R Commander. On the R Commander menu bar, go to Data/Import data and select from text file, clipboard, or URL, which should bring up the window below. Make the same selections as shown in the window: name your data set cover_moisture and select either Commas (if you saved your file as a comma-delimited (*.csv) file) or Tabs (if you saved your file as a tab-delimited (*.txt) file). Click OK and a window appears that allows you to navigate to your data file. Once you have navigated to your data file, highlight it by clicking on it and click Open. You can now view your data by clicking on View data set on the R Commander menu bar.
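For readers curious about what the import dialog does behind the scenes, the sketch below reads a comma-delimited file with base R's read.table(). It is only an illustration: the file is a temporary one created on the spot, and the four rows of values are made up rather than taken from the field notebook.

```r
# Hypothetical example values; the tutorial's actual field data are not reproduced here.
example_file <- tempfile(fileext = ".csv")
writeLines(c("location,cover,soil_moisture",
             "1,10,12.5",
             "2,35,18.0",
             "3,60,24.3",
             "4,85,31.1"), example_file)

# Comma-delimited file: header = TRUE treats the first row as column names.
cover_moisture <- read.table(example_file, header = TRUE, sep = ",")

# For a tab-delimited .txt file, use sep = "\t" instead.
str(cover_moisture)
```

The sep argument plays the same role as the Commas/Tabs choice in the dialog, which is why you need to know how the file was saved.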

You can also enter your data directly into R by selecting Data from the R Commander menu bar and clicking on New dataset. This brings up the Data Editor window, which allows you to type your data directly into R. By clicking on a column header, you can change the variable name of each column (e.g., change var1 to location, var2 to cover, and var3 to soil.moisture). The variable editor also lets you select the type of the variable you are entering. Since you are entering numeric values, select numeric under variable type. Type in your data as shown below.

3. Summary statistics

To get some summary statistics of your data, go to Statistics/Summaries and select Numerical summaries. Now you should see the following window:

Pick cover and soil.moisture (Note: to select more than one variable, hold down the Ctrl key) and click OK. A summary table will appear that shows the mean, standard deviation, and the 0, 0.25, 0.50, 0.75, and 1 quantiles of the cover and soil.moisture data.

4. Scatterplot

To see if there is a relationship between cover and soil moisture, it is a good idea to first look at a scatterplot of the data. To create a scatterplot, go to Graphs on the R Commander menu bar and select Scatterplot. This will bring up a dialog window. Select cover as your x-variable and soil.moisture as your y-variable. Under x-axis label and y-axis label, label your x- and y-axis Cover (%) and Soil Moisture (%), respectively. Under Options, deselect Marginal boxplots, Smooth line, and Show spread. Next, click OK and a scatterplot will appear (Important: make sure you highlight the R Console by clicking on it to be able to see the scatterplot).

You can save the scatterplot (or any other plot you create) by clicking on the plot (Important: if you do not select the plot, you won't be able to save it) and then, on the R menu bar (Note: the R menu bar, not the R Commander menu bar), going to File/Save as/Jpeg and clicking on 100% quality. This will bring up a window that allows you to specify the location on your computer where you want to save the plot as a JPEG image.

5. Fitting a linear regression model

The scatterplot above shows us that there is a positive relationship between soil moisture and cover. However, the scatterplot does not tell us how strong the relationship is or whether it is statistically significant. To get this information, we have to fit a linear regression model. To fit a linear regression model, go to Statistics/Fit models on the R Commander menu bar and select Linear model. Select soil.moisture as your response variable (aka y-variable or dependent variable) and cover as your explanatory variable (aka x-variable or independent variable) and click OK. The following output will appear in the Output Window of the R Commander:

We will talk in class about how to interpret the output table (e.g., what those numbers mean). To check the basic model diagnostics for the linear model you just fit, go to Models/Graphs on the R Commander menu bar and select Basic diagnostic plots. This brings up the following window (we will discuss in class how to interpret the model diagnostic plots):
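The menu steps in sections 3 to 5 correspond roughly to a handful of base R commands. The sketch below is an approximation, not the exact code R Commander writes to its script window, and the six rows of data are hypothetical stand-ins for the field recordings.

```r
# Hypothetical values standing in for the tutorial's field data.
cover_moisture <- data.frame(
  location      = 1:6,
  cover         = c(10, 25, 40, 55, 70, 85),
  soil.moisture = c(12.1, 15.8, 19.5, 22.9, 27.4, 30.2)
)

# Section 3: mean, standard deviation, and quantiles for each variable.
vars <- cover_moisture[, c("cover", "soil.moisture")]
sapply(vars, mean)
sapply(vars, sd)
sapply(vars, quantile, probs = c(0, 0.25, 0.5, 0.75, 1))

# Section 4: scatterplot with labelled axes.
plot(soil.moisture ~ cover, data = cover_moisture,
     xlab = "Cover (%)", ylab = "Soil Moisture (%)")

# Section 5: simple linear regression and its basic diagnostic plots.
fit <- lm(soil.moisture ~ cover, data = cover_moisture)
summary(fit)
op <- par(mfrow = c(2, 2))  # show the four diagnostic plots in one window
plot(fit)
par(op)                     # restore the previous plot layout
```

In the formula soil.moisture ~ cover, the response variable goes on the left of the tilde and the explanatory variable on the right, mirroring the two boxes in the Linear model dialog.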

6. Fitting multiple regression models

In this part of the tutorial, you will learn how to fit a multiple regression model. Your hypothesis is that air temperature, solar radiation, and wind speed are significant predictors of ozone. To test this hypothesis, you collected the data called airquality, which are available for download from our class website (http://ecosensing.org/teaching/css-560/digital-library/data) (Note: the data were taken from Dalgaard, 2002). Let's import the data into R Commander and call the dataset airquality (if you can't remember how to import data, please refer to section 2 of this document). Let's take a look at the data to familiarize ourselves with them by selecting airquality from the Data set dropdown menu.

Next, let's plot the relationships between the different variables in the dataset. To do this, make the R Console active by clicking on it and type the following command at the R Console prompt:

    pairs(airquality)
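If you do not have the class file at hand, you can try the same command on the data frame named airquality that ships with base R. Treat this as a stand-in: that it matches the class file is an assumption. The sketch also restricts the plot to the four variables of interest and adds a correlation table, neither of which is part of the original tutorial.

```r
# Assumption: base R's built-in 'airquality' data frame stands in for the class file.
data(airquality)

# Keep only the response and the three candidate predictors.
ozone_vars <- airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]

# Scatterplot matrix: one panel for every pair of variables.
pairs(ozone_vars)

# Pairwise correlations put numbers on what the scatterplot matrix shows.
round(cor(ozone_vars, use = "pairwise.complete.obs"), 2)
```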

Now you should see the following figure:

This is how you read the figure: It looks like there is some sort of relationship between ozone and temperature and between ozone and wind. However, there seems to be no relationship between ozone and solar radiation.

OK, let's now fit a multiple regression model to test if solar radiation, wind, and temperature are significant predictors of ozone. To fit a multiple regression model, go to Statistics/Fit models... on the R Commander menu bar and select Linear model.... A window appears that should be somewhat familiar to you from section 5 of this tutorial. The model you want to fit basically says that ozone is a function of solar radiation, air temperature, and wind. We can write this model as follows:

    Ozone ~ Solar.R + Temp + Wind    [1]

After typing model [1] in the appropriate section of the linear model window (see above), click OK. You should now see the following output:
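Model [1] can also be fitted directly from the R Console with lm(). The sketch below again assumes that base R's built-in airquality data frame is an acceptable stand-in for the class file, so the numbers it prints may differ from the output shown in class.

```r
# Assumption: base R's built-in 'airquality' data frame stands in for the class file.
data(airquality)

# Model [1]: ozone as a function of solar radiation, temperature, and wind.
ozone_fit <- lm(Ozone ~ Solar.R + Temp + Wind, data = airquality)
summary(ozone_fit)  # coefficient table with a t-test for each predictor

# lm() drops rows with missing values by default, so check how many rows were used.
nobs(ozone_fit)

# Basic diagnostic plots, four to a window.
op <- par(mfrow = c(2, 2))
plot(ozone_fit)
par(op)
```

Each additional predictor is simply joined to the right-hand side of the formula with a plus sign.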

Let's also take a look at the model diagnostics:

We will discuss the interpretation of the model output as well as the interpretation of the model diagnostics in more detail in class.

7. Paired t-test

Next, we will conduct a paired t-test to see if there is a statistically significant difference in soil moisture before and after a rain event. The data for the paired t-test, called paired_t_test, are available for download from our class website (http://ecosensing.org/teaching/css-560/digital-library/data). Import the data into R by following the steps you learned at the beginning of this tutorial and name the dataset soil_moisture (Hint: open the paired_t_test.txt file in a text editor. You will see that it is a tab-delimited file, not a comma-delimited file. You need that information to import the data into R properly).

Before conducting a paired t-test (or any other t-test), it is a good idea to look at a boxplot of the data first. To do this, you have to stack your data first (you are just rearranging the data so they are in a format that the computer can use to create a boxplot; for more detail on stacking, please refer to the Appendix of this tutorial). Go to Data/Active data set on the R Commander menu bar and click on Stack variables in active data set.

You should now see the Stack Variables window shown below. Select both the soil.moisture.after and soil.moisture.before variables and name the stacked dataset stacked_soil_moisture. Keep the rest of the default settings as shown below and click OK.

Next, go to Graphs/Boxplots on the R Commander menu bar. In the window that pops up, select Plot by groups, group your variables by factor, and click OK. Now you should see the following boxplot:

Based on the boxplot, do you think the soil moisture changed significantly after the rain event?

After visually inspecting the data, we are ready to run a paired t-test. To do this, let's go back to our original, unstacked dataset by going to Data set on the R Commander menu bar and selecting soil_moisture. Click OK.
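To make the stacking step less mysterious, here is a minimal sketch of the same idea using base R's stack() function. The measurements are hypothetical, and renaming the columns to variable and factor is done here only to match the names used in the boxplot code later in this tutorial; R Commander's own generated code may differ.

```r
# Hypothetical measurements; the class file paired_t_test.txt is not reproduced here.
soil_moisture <- data.frame(
  soil.moisture.before = c(14.2, 16.5, 13.8, 15.1, 17.0),
  soil.moisture.after  = c(21.7, 24.1, 20.3, 22.8, 25.2)
)

# stack() puts all values in one column and records their source column in a factor.
stacked_soil_moisture <- stack(soil_moisture)

# Rename to the 'variable' / 'factor' column names used in the tutorial's boxplot code.
names(stacked_soil_moisture) <- c("variable", "factor")
stacked_soil_moisture

# One box per level of the grouping factor.
boxplot(variable ~ factor, data = stacked_soil_moisture,
        ylab = "Soil moisture", xlab = "")
```

The unstacked data frame has five rows and two columns; the stacked one has ten rows, one per measurement.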

Next, go to Statistics/Means on the R Commander menu bar and select Paired t-test. Select soil.moisture.before as your first variable and soil.moisture.after as your second variable. Keep the rest at the default settings as shown below. After clicking OK, you should get the following output. We will discuss in class how to interpret the output.

8. Two-sample t-test

In this section of the tutorial, we will learn how to conduct a two-sample t-test. We want to test the following hypothesis: soil pH of the non-treated stand in the Ponderosa State Park is statistically significantly different from the soil pH in the treated part of the park. The hypothetical dataset called ph is available for download from our class website (http://ecosensing.org/teaching/css-560/digital-library/data). Let's import the data into R Commander and create a boxplot of the data as we learned in section 7 of this tutorial (remember: you first have to stack the data in order to create the boxplot below; for more details, please refer to section 7 of this tutorial).

OK, it looks like the soil pH in the non-treated part of the forest is lower than in the treated part. Let's now do a two-sample t-test to see if the soil pH values are statistically significantly different from each other. To do this, keep your stacked ph dataset active, go to the R Commander menu bar, and select Statistics/Means and then Independent samples t-test... (if the Independent samples t-test... option is greyed out, make sure that i) you stacked the ph dataset and ii) the stacked ph dataset is the active dataset).
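For reference, the paired test from section 7 and the independent-samples test selected above both correspond roughly to calls to base R's t.test(). The sketch below uses made-up numbers for both datasets, and it assumes that the dialog's default is the unequal-variance (Welch) version of the two-sample test; check the dialog settings if your output differs.

```r
# Hypothetical values; the class files are not reproduced here.
soil_moisture <- data.frame(
  soil.moisture.before = c(14.2, 16.5, 13.8, 15.1, 17.0),
  soil.moisture.after  = c(21.7, 24.1, 20.3, 22.8, 25.2)
)

# Paired t-test (section 7): each row is one location measured twice.
paired_result <- t.test(soil_moisture$soil.moisture.before,
                        soil_moisture$soil.moisture.after,
                        paired = TRUE)
print(paired_result)

# Two-sample t-test (section 8) on stacked data: one value column, one grouping factor.
ph_stacked <- data.frame(
  variable = c(5.1, 5.3, 5.0, 5.4, 5.2, 6.0, 6.2, 5.9, 6.3, 6.1),
  factor   = factor(rep(c("untreated", "treated"), each = 5))
)

# var.equal = FALSE requests the Welch test, which does not assume equal variances.
two_sample_result <- t.test(variable ~ factor, data = ph_stacked,
                            var.equal = FALSE)
print(two_sample_result)
```

The paired test works on the unstacked data because it needs to know which two values belong together, whereas the two-sample test works on the stacked data because it only needs to know which group each value belongs to.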

The window that now appears should look similar to the one below: Keep the default settings and click OK. Now you should see the following output: We will discuss in class how to interpret the output.

9. Customize your graphs

If you want to customize your figures, you have to do a little bit of programming. For example, the boxplot you created in section 8 of this tutorial is associated with the following line of code in your R Commander script window:

    boxplot(variable ~ factor, ylab = "ph", xlab = "factor", data = ph_stacked)

We can now modify this line of code to make the boxplot a little nicer. For example, we could type the following into the R Console:

    boxplot(variable ~ factor, ylab = "Soil pH", xlab = "", names = c("Treated Forest", "Untreated Forest"), data = ph_stacked)

If you type the code above into the R Console and hit Enter, you should see the following boxplot:

It becomes clear that you need some R programming experience and knowledge to change the appearance of a figure beyond what R Commander allows you to do. If you want to learn more about programming in R, the R website (http://www.r-project.org/) is a good starting point, as is Peter Dalgaard's book "Introductory Statistics with R".

10. Closing R Commander and R

To close the R Commander and R, go to File/Exit and select From Commander and R.

Next, the R Commander will ask you if you want to exit the program. Click OK. It will then ask you if you want to save the script file and the output file. Click No in both cases. Congratulations, you have successfully finished the R Commander tutorial.

Other resources

Getting started with the R Commander. You can find a PDF of this tutorial on our class website (http://ecosensing.org/teaching/css-560/digital-library/tutorials). If you want to learn more about the R Commander, I recommend working through this tutorial.

Literature cited

Dalgaard, Peter. 2002. Introductory Statistics with R. Springer Science and Business Media, Inc.

Important: If you used a MOSS computer for this tutorial, please make sure you delete all the files you created from the computer after you are done with the tutorial. Thanks!

Disclaimer

Always consult a trained statistician to validate the correctness of the statistical approach you are taking. Please e-mail any suggestions on how to improve this document to Jan Eitel (jeitel@uidaho.edu). Use of trade names does not constitute an official endorsement by the McCall Outdoor Science School.

Appendix

A) Data stacking: what's that?

When you stack the data in R Commander, you are simply rearranging the data so they can be read properly by R. So what happens when you stack the data? You rearrange the data (in the example below, dissolved oxygen; see Figure I) so that all your values are within a single column (see Figure II), and you create a second column with factors (e.g., 1, 2, etc.) that let R know where each observation originated (in the example below, all data associated with 1 originated from plot 1, and all data associated with 2 originated from plot 2).

Figure I. Unstacked data. Column one (entitled "DO_oxygen_plot1") shows dissolved oxygen values collected at plot 1 and column two (entitled "DO_oxygen_plot2") shows dissolved oxygen values collected at plot 2.

Figure II. Stacked data. Column one (entitled "Data") shows all the collected dissolved oxygen data (in this example, from plots 1 and 2). Column two (entitled "Factor") identifies the plot from which each value originated.