Chapter 1 Introduction to using R with Mind on Statistics


Manual overview

The purpose of this manual is to help you learn how to use the free R software to produce the graphs, simulations, and data analyses presented in Mind on Statistics. This chapter describes how to obtain R and gives a brief introduction to the software. The following chapters provide specific commands for the book's examples.

Line-by-line R code is provided for many of the examples demonstrated in the text. Two methods are used to show the R code. The first method uses screen copies. A screen copy shows exactly what you will see when running R. The red text in a screen copy is what you type; you do not need to type the initial > prompt at the start of each line. The blue text is output produced by R. A simple example is the first bullet under Basic commands in this chapter. The second method is simply a display of the text you would type. You type the command after the > prompt, and R responds after you press the [Enter] or [Return] key. An example is the third bullet under Basic commands in this chapter.

What is R?

R is a computer language and environment that was developed with statistical graphics and analysis in mind. Consequently, it is commonly thought of as a statistical software package, like the proprietary Minitab and SPSS packages. In the growing atmosphere of free software, scientists are constantly making available new packages that enable R to perform very advanced modern statistics. This manual, however, focuses on the more elementary aspects of R needed to learn statistical concepts and to perform the statistics you are likely to encounter in your future career.

There are several consequences of R being free software developed by scientists for scientists. First, it is very powerful. If you continue in a career that depends heavily on statistics, such as economics, biology, medicine, or marketing, R will allow you to develop your own statistical functions specific to your own needs. Second, it was created as a tool for scientists rather than as a mass-market commercial product. Thus it is driven by typed command lines and lacks features such as pull-down menus and point-and-click commands. The result is software with a high "nerd factor," as you will notice when looking at the help commands and manuals. S-Plus, a proprietary software package, is almost identical to R with respect to line commands, but includes pull-down menus and some point-and-click commands.

Obtaining R

R is freely available from the website www.r-project.org, as are its online manuals. To install R on your computer, follow the links that originate from the CRAN link on the left-hand side of the R project homepage. This link leads to a web page where you select the mirror site closest to your location; e.g., University of California at Berkeley, California, USA. If you are using Microsoft Windows, for example, you would then click on the Download R for Windows link, followed by the base link, and then the Download R 2.13.1 for Windows link to install R.

Basic commands

The most essential features and commands to keep in mind when using R are:

- R is driven by command lines: commands are typed at the > prompt, followed by the [Enter] or [Return] key.

- Values are assigned using the two-key arrow <-, which is created by typing < followed by a dash. To see the value of a variable, simply type its name and press [Enter]. The following screen copy shows the basic command of assigning the value 10 to the variable n. (Note: The bracketed [1] is used to count the values in the output, as you will notice when working with larger datasets.)

  > n <- 10
  > n
  [1] 10

- The # sign is used to add comments to a command line. R ignores everything typed after # on the command line. For example, at the prompt, type the following command to assign 4, 6, and 3 to the variable heights.

  > heights <- c(4,6,3)   # c() assigns a string of numbers

  Although comments are provided throughout this manual to give extra instruction, you do not need to type the comments to run the commands.

- R commands are followed by parentheses, with variables and options placed within the parentheses. Example:

  > sort( heights )
  [1] 3 4 6

  Typing the command without the parentheses will cause the function's code to be printed on the screen, but does no harm.

- R is case sensitive, which means that R distinguishes between the variables Heights and heights.

- To list the objects you have available, use the ls() command. Example:

  > ls( )
  [1] "heights" "Heights" "n"

- To delete an object, use the remove() command. Example:

  > remove( heights )
  > ls()
  [1] "Heights" "n"

- To change a previous command, press the up arrow key and edit your old command.

- If you press [Enter] before a command is complete, R goes to the next line and responds with a + to denote that the command is not finished and you can continue typing. To cancel the command, press the [Escape] or [Esc] key.

- Help can be obtained from the pull-down Help menu, with the function help( function ), or with help.search( "key word" ). For example, help(boxplot) provides information about the function boxplot. To get a list of commands related to regression, try help.search("regression").

- To quit R, type q().

Importing data from the Mind on Statistics CD into R

R is best designed to import and export text data. The Mind on Statistics CD provides R data. First you need to install the datasets from the CD onto your computer. Go to Install Datasets and, at the next level, click on Install Datasets again. Select R to load the data onto your computer. For this manual, the data were saved to the C drive in the location C:\Mind on Statistics\R. The load() command is used to import data. R uses forward slashes rather than backslashes to denote a folder location. Try importing the Temperature dataset.

> load("C:/Mind on Statistics/R/temperature.RData")

Storing data: objects, vectors, matrices, and data frames

Anything stored by R is called an object. Thus the function sort() is an object, as were the variables heights and n in the previous examples. We will focus primarily on objects that store data, as heights and n did. Data will most commonly be stored in either vectors or data frames.

A vector is simply a string of numbers: n is a vector of length 1 and heights is a vector of length 3. If you are entering a variable with only a few data values, you will often simply type the values into a vector. R allows mathematical operations to be carried out on an entire vector. Example:

> x <- c( 5,2,4 )
> x + 6
[1] 11  8 10

Elements of a vector can be specified using square brackets [ ]. Either a single element or several can be specified. Examples:

> x[3]
[1] 4
> x[2:3]
[1] 2 4
> x[c(1,3)]
[1] 5 4

A matrix is a set of vectors of the same mode (for example, all numeric or all character) and of the same length, bound together. The dimension of a matrix is typically described by its number of rows and columns. Analogous to a vector, the elements within a matrix are described by their row and column position in the matrix; e.g., X[ row, column ]. Leaving either the row or the column unspecified is the same as specifying all of them. Examples:

> X <- c(2,4,6,8,10,12,14,16,18,20,22,24)   # vector
> X <- matrix( X, nrow=3, byrow=FALSE )     # turn into matrix
> X
     [,1] [,2] [,3] [,4]
[1,]    2    8   14   20
[2,]    4   10   16   22
[3,]    6   12   18   24
> X[2,3]       # row 2, column 3
[1] 16
> X[1:2, 3:4]  # rows 1 through 2 and columns 3 and 4
     [,1] [,2]
[1,]   14   20
[2,]   16   22
> X[1,]        # row 1, all columns
[1]  2  8 14 20
> X[,3]        # all rows, column 3
[1] 14 16 18
> X+3          # add 3 to all values of X
     [,1] [,2] [,3] [,4]
[1,]    5   11   17   23
[2,]    7   13   19   25
[3,]    9   15   21   27

A data frame is like a matrix, but one column of the data frame may consist of numbers and another column of words. The load() function automatically puts the data into a data frame.
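As a hedged recap of the storage types above, the sketch below fills a matrix both column-wise and row-wise and builds a small data frame that mixes numbers and words. The object names scores, M_col, M_row, and pets are invented for illustration and do not come from the Mind on Statistics datasets.

```r
# Hypothetical objects for illustration only; none come from the book's data.
scores <- c(2, 4, 6, 8, 10, 12)        # a numeric vector of length 6

# byrow = FALSE (the default) fills the matrix down each column first.
M_col <- matrix(scores, nrow = 2, byrow = FALSE)
# byrow = TRUE fills the matrix across each row first.
M_row <- matrix(scores, nrow = 2, byrow = TRUE)

M_col[1, 2]    # row 1, column 2 of the column-filled matrix
M_row[1, 2]    # same position in the row-filled matrix
M_col[, 1]     # all rows, column 1

# A data frame may hold a numeric column next to a character column.
pets <- data.frame(name = c("cat", "dog", "fish"),
                   legs = c(4, 4, 0),
                   stringsAsFactors = FALSE)
str(pets)      # shows the type of each column
```

Comparing M_col[1, 2] with M_row[1, 2] shows how the byrow option changes where each value lands, which is worth checking whenever a vector is typed in row order.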

The first column (1, 2, ..., 20) simply holds the row labels, while the data are in the columns listed under c, l, etc. Columns within a data frame can be accessed directly using the $ symbol.

The variables within a data frame can be accessed more directly by attaching the data frame. For example, suppose you typed j. R would say that it could not find the variable. That is because R looks in the workspace (displayed by ls()) and sees the data frame temperature, but does not look inside temperature for the variable j. The attach(temperature) command tells R to also look inside the data frame temperature if it cannot find j in the workspace. Once you have finished using the dataset, detach the data frame. Examples:

> j
Error: object 'j' not found
> attach( temperature )
> j
[1] 67 50 50 43 54 58 40 39 49 32 28 31 25 25 32 29 22 12 40  7
> detach( "temperature" )
> j
Error: object 'j' not found
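The temperature data frame from the CD is not reproduced here, so the following sketch uses a small made-up data frame, weather, with hypothetical columns city and j, to illustrate the $ symbol. It also shows with(), a base R function that can serve as a scoped alternative to attach(); using it is a suggestion, not the manual's own method.

```r
# Made-up stand-in for a data frame such as temperature; values are invented.
weather <- data.frame(city = c("A", "B", "C"),
                      j    = c(67, 50, 43),
                      stringsAsFactors = FALSE)

weather$j                  # extract column j with the $ symbol
mean(weather$j)            # use the column directly in a calculation

# with() evaluates an expression inside the data frame, so no
# attach()/detach() pair is needed and nothing lingers on the search path.
j_range <- with(weather, max(j) - min(j))
j_range
```

Because with() only looks inside the data frame for the duration of one expression, there is no risk of forgetting to detach the data afterwards.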

Editing data

Objects in R can be edited using the edit() function, where the output is assigned either to a new object or to the original object. A spreadsheet editor appears for data frames and a simple text editor for vectors. To end the editing session, simply click on the X icon in the upper right-hand corner of the window. Examples:

> newtemp <- edit( temperature )
> x <- edit( x )

Exporting data

The function save() is used to export a data frame to an R file on your computer. The command has many advanced optional features, but you must at least provide an R object to export and a file destination for the object. Example:

> save(temperature,file="c:/statdata/temperature.rdata")

Learning more about R

In this manual, you will continue to learn more about R as you progress through the examples corresponding to each chapter of Mind on Statistics. Although this manual focuses on introductory statistics, the potential for using R in statistical analysis is almost endless. No matter how introductory or advanced the user, there always seems to be more to learn about this software and about statistics. Besides the R manuals available through the Help menu at the top of the R window, there are a number of books, written at introductory and advanced levels, that describe how to use R and the similar S-Plus package.
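The save() and load() commands described earlier in this chapter can be tried without the CD or a particular folder. The sketch below is one possible round trip, assuming a made-up data frame mydata and a throwaway file created with tempfile(); neither name comes from the manual.

```r
# Hypothetical data frame; tempfile() gives a throwaway path so that
# no folder such as c:/statdata has to exist.
mydata <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))
path <- tempfile(fileext = ".RData")

save(mydata, file = path)    # export the object to an R data file
remove(mydata)               # delete it from the workspace

loaded <- load(path)         # load() restores objects under their saved names
loaded                       # the names of the restored objects
mydata                       # the data frame is available again
```

Note that load() returns the names of the objects it restored rather than the data itself, so the data frame reappears under its original name.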