Short Introduction to R

Paulino Pérez (1) and José Crossa (2)
(1) ColPos-México, (2) CIMMyT-México
June 2015. CIMMYT, México-SAGPDB

Contents

1. Introduction
     What is R?
     R and other software
     Manuals and help
     Sample R session
2. Simple objects
     Vectors
     Matrices
     data.frame
3. User defined functions
4. Graphs
     Plotting user defined functions
     Plot function
     More functions for graphs
5. Importing data
6. Installing packages
7. Questions

Introduction: What is R?

R is a software environment for statistical analysis and graphics. It was originally developed by Ross Ihaka and Robert Gentleman. R was designed specifically for data analysis, but it can also be used as a general programming language. R is distributed freely under the GNU General Public License (GPL). Development is carried out by the R Development Core Team. R is available as source code and as precompiled binaries for Windows and Mac. The R language provides loops and other control structures, which allow the user to carry out complex analyses.
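As a minimal sketch of the loop support mentioned above, the following computes a mean with an explicit for loop and compares it with the built-in mean(). The data values and the names x, total and loop_mean are illustrative assumptions, not taken from the slides.

```r
# Hypothetical data: five measurements
x <- c(2.5, 3.1, 4.0, 1.8, 3.6)

# Accumulate the sum with an explicit for loop
total <- 0
for (value in x) {
  total <- total + value
}
loop_mean <- total / length(x)

# The built-in function should give the same answer
builtin_mean <- mean(x)
print(loop_mean)
print(builtin_mean)
```

In practice the vectorized mean(x) is preferred; the loop only illustrates the control structure.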

R and other software

Why R?
  R is free software that runs on Windows, Linux and Mac.
  It has excellent documentation and graphical capabilities.
  Most programs written in S-PLUS can be run in R.
  It is powerful and easy to learn.
  It can be extended through packages.

Disadvantages
  The graphical user interface is not as good as in other software.
  Lack of commercial support (only partially true).

Installation in a Windows environment

1. Go to http://www.r-project.org to download the Windows binary (Figure 1: R web site).
2. Go to the Downloads section and select CRAN (Figure 2: CRAN mirrors).
3. Select the software for your OS (Figure 3: Executables for various platforms).
4. Download the base software (Figure 4: Download R-base).
5. Double-click the installer (Figure 5: Installing R).

Manuals and help

Once R is installed, you have access to the following manuals in PDF format:
  An Introduction to R
  R Reference Manual
  R Data Import/Export
  R Language Definition
  Writing R Extensions
  R Internals
  R Installation and Administration

Further resources:
  Contributed documentation (http://cran.r-project.org/other-docs.html).
  R-help mailing list archives (http://cran.r-project.org/search.html).
  Mailing list.
  Reference card: a summary of the most useful R commands (http://www.rpad.org/rpad/rpad-refcard.pdf).
  S Programming, by W. Venables and B. Ripley. See http://www.stats.ox.ac.uk/pub/mass3/sprog

Sample R session

Go to Start->Programs->R->R-3.x.y. The working environment looks like the one shown in Figure 6.

Figure 6: R ready to process commands.

The symbol > is the command prompt, where commands are typed. For example, to show the help page about matrices, type ?matrix and press Enter.

The help system can be accessed from the command line using the following functions:

    ?text
    help.start()
    help.search("text to search")
    apropos("search for something similar to...")
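A small sketch of how these help functions can be used without opening a browser or pager. It relies only on apropos() and args() from the standard R distribution; the search term "mean" and the name hits are arbitrary choices made for illustration.

```r
# List objects on the search path whose names contain "mean"
hits <- apropos("mean")
print(head(hits))

# Show the argument list of matrix() without opening its help page
print(args(matrix))
```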

Code editors

A set of R commands is usually known as a "script". There are several text editors for writing scripts. The one included by default in Windows installations is not as sophisticated as others that offer syntax highlighting, for example Tinn-R or RStudio. The standard text editor in R can be opened from the File menu: File->New Script.

Figure 7: Code editor in R.

Example (performing basic calculations):

    7+4
    2*3*(1+2)

The commands are written in the text editor. To run them, select the commands and open the pop-up menu; one of its entries executes the selected code. The result appears in the R console.

Figure 8: Executing R commands from the text editor.

Commands written in the text editor can be saved and reopened for editing later. There are many text editors for writing R scripts, for example:
  WinEdit (shareware)
  SciViews (freeware)
  Tinn-R (freeware)
  Emacs (free)
  RStudio (free)

Figure 9: RStudio.

Simple objects

R works by manipulating objects. Objects are manipulated using functions and operators. The most basic objects are:
  Vectors (of type numeric or character)
  Matrices
  data.frame
  Lists
  Functions

Some useful functions:
  General purpose: sqrt(), log(), exp(), sin(), cos(), etc.
  Related to statistics: mean(), sd(), var(), quantile(), etc.

The assignment operator is = in R >= 1.4.0, or <- in any R version.
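The following sketch illustrates the two assignment operators and a few of the functions listed above. The values and object names (x, y, roots, avg, spread) are assumptions chosen only for illustration.

```r
# Two equivalent ways to assign a vector to a name
x <- c(4, 9, 16)
y = c(4, 9, 16)
print(identical(x, y))   # TRUE

# General purpose and statistical functions applied to a vector
roots  <- sqrt(x)        # 2 3 4
avg    <- mean(x)
spread <- sd(x)
print(roots)
print(avg)
print(spread)
```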

Notes:
  R distinguishes between uppercase and lowercase letters.
  The symbol "#" is used to comment the code.
  Object names can combine letters, digits, "." and "_", but cannot contain spaces or special symbols such as "$", "%" or "#".
  Missing data are represented with the special symbol "NA" (Not Available). Undefined or infinite results of computations, for example from dividing by 0, are represented with the special symbols "NaN" (Not a Number) or "Inf".
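These notes can be checked directly at the prompt. The sketch below uses made-up values; the names value, Value, x, pos_inf, neg_inf and not_num are hypothetical.

```r
# R is case sensitive: these are two different objects
value <- 1
Value <- 2

# NA marks a missing observation
x <- c(10, NA, 30)
print(mean(x))                 # NA, because one value is missing
print(mean(x, na.rm = TRUE))   # 20, the missing value is dropped
print(is.na(x))                # FALSE TRUE FALSE

# Division by zero does not stop with an error
pos_inf <- 1 / 0               # Inf
neg_inf <- -1 / 0              # -Inf
not_num <- 0 / 0               # NaN
print(is.nan(not_num))         # TRUE
```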

Vectors

Vectors are created using the functions c(), seq() and rep(), and the operator :. Examples:

    a=c(1,2,3,4,5)
    a
    b=c("a","b","c")
    b
    d=1:10
    d
    e=seq(1,10,by=0.5)
    e
    f=seq(1,10,length.out=20)
    f
    g=rep(10,3)
    g
    h=c(e,f)
    h

Vector operations

We can perform the most common operations using vectors of the same length. Operations are performed element-wise. Examples:

    a=c(1,2,3)
    b=c(2,3,5)
    a+b        #sum of a and b
    a-b        #difference of a and b
    a*b        #element-wise product
    a^b        #power function
    3*a+2*b    #product and sum
    a/b        #element-wise quotient
    a^2        #takes the square of each element

It is also possible to apply a function to a vector, for example:

    exp(a)               #exponential function
    log(a)               #logarithm function
    d=sqrt(a)+log(b)     #square root and logarithm
    d                    #shows d

To extract some elements from a vector we can use the [] operator, for example:

    a=c(1,2,5,7,9)
    b=a[1:3]     #first 3 elements in a, assign the result to b
    b            #shows b
    c=a[-1]      #take all elements in a except the first one
                 #and create a new object
    a[c(1,5)]    #first and last component in a

Many other operations can be performed on vectors, for example:

    w=c(1,2,3,NA,-1,2)
    which(is.na(w))            #positions of the missing values
    which.max(w)               #position of the maximum
    which.min(w)               #position of the minimum
    w>2                        #is each element bigger than 2?
    which(w>2)                 #positions of the elements bigger than 2
    sort(w)                    #sort in ascending order
    sort(w,decreasing=TRUE)    #sort in decreasing order
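For reference, a short sketch of what these calls return for the vector w used above, with one extra step (logical indexing with is.na()) that is not in the slides. The names na_pos, max_pos, min_pos, big_pos, sorted and kept are hypothetical.

```r
w <- c(1, 2, 3, NA, -1, 2)

na_pos  <- which(is.na(w))   # 4
max_pos <- which.max(w)      # 3, missing values are ignored
min_pos <- which.min(w)      # 5
big_pos <- which(w > 2)      # 3
sorted  <- sort(w)           # -1 1 2 2 3, NA is dropped by default

# Logical indexing: non-missing elements bigger than 1
kept <- w[!is.na(w) & w > 1] # 2 3 2

print(sorted)
print(kept)
```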

Creating matrices

Matrices can contain numbers or characters. The creation of matrices is shown in the examples below.

    #a)Identity matrix, nxn
    #Identity matrix of order 4x4
    Identity=diag(c(1,1,1,1))
    Identity
    #Alternatively...
    Identity=diag(rep(1,4))
    Identity

    #b)J matrix
    #J matrix, order 3x3
    J=matrix(1,nrow=3,ncol=3)
    J

    #c)In general
    #matrix(data = NA, nrow = 1, ncol = 1,byrow = FALSE, dimnames = NULL)
    A=matrix(nrow=3,ncol=3)
    A[1,]=c(1,2,3)
    A[2,]=c(4,5,6)
    A[3,]=c(7,8,9)

    #Alternatively, fill by row...
    A=matrix(c(1:9),nrow=3,ncol=3,byrow=TRUE)
    A
    #Alternatively, fill by column
    #(note: this call uses 3,4,8 in the third column, not 3,6,9)
    A=matrix(c(1,4,7,2,5,8,3,4,8),nrow=3,ncol=3,byrow=FALSE)
    A
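The effect of the byrow argument is easiest to see on a small non-square matrix. This is a sketch with made-up data; by_col and by_row are hypothetical names.

```r
# Same data, filled by column (the default) and by row
by_col <- matrix(1:6, nrow = 2)
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)

print(dim(by_col))   # 2 3
print(by_col[1, ])   # 1 3 5
print(by_row[1, ])   # 1 2 3

# Filling by row equals transposing the column-filled 3x2 matrix
print(identical(by_row, t(matrix(1:6, nrow = 3))))   # TRUE
```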

Matrix operations

    #Sum and subtraction
    C=A+J
    C
    D=A-J
    D
    #Matrix product (%*%)
    Dsq=D%*%D
    Dsq
    DsqA=D%*%D%*%(A)
    DsqA
    #Transpose, use t()
    t(D)
    #Determinant, use det()
    det(D)
    #Inverse, use the function solve()
    InvI=solve(Identity)
    InvI

    #Rank, use the QR decomposition
    qr(Identity)
    qr(Identity)$rank
    Noninvertible=matrix(c(1,2,3,1,2,1,2,4,6),nrow=3,ncol=3,byrow=TRUE)
    det(Noninvertible)
    qr(Noninvertible)$rank
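A minimal sketch of how the inverse and the rank relate, using two small matrices that are not from the slides (M is invertible, S is singular). The names M, M_inv, check, S and S_inv are hypothetical, and tryCatch() is used here only so that the failed inversion does not stop the script.

```r
# An invertible 2x2 matrix: multiplying by its inverse gives the identity
M <- matrix(c(2, 1, 1, 3), nrow = 2)
M_inv <- solve(M)
check <- M %*% M_inv
print(all.equal(check, diag(2)))   # TRUE up to rounding error

# A singular matrix: the second column is twice the first
S <- matrix(c(1, 2, 2, 4), nrow = 2)
print(qr(S)$rank)                  # 1, less than the dimension 2

# solve() signals an error for a singular matrix; catch it and return NULL
S_inv <- tryCatch(solve(S), error = function(e) NULL)
print(is.null(S_inv))
```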

We can also extract some elements of a matrix, for example:

    A             #Shows A
    A[1,1]        #Element in row 1, column 1 of A
    A[1,]         #First row of A
    A[c(1,2),]    #First and second rows of A
    A[-3,]        #All rows except the third one
    A[,1]         #First column of A
    A[,c(1,3)]    #First and third columns of A
    A[,-3]        #All columns except the third one

Note: When we extract a single row or column, it is automatically converted to a vector.
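The conversion to a vector can be avoided with the drop argument of the [] operator, which goes slightly beyond the slide. A sketch, assuming a 3x3 matrix filled with 1 to 9 by row; row_vec and row_mat are hypothetical names.

```r
A <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)

# Default behaviour: a single row becomes a plain vector
row_vec <- A[1, ]
print(is.null(dim(row_vec)))   # TRUE, no dimensions left

# With drop = FALSE the result stays a 1x3 matrix
row_mat <- A[1, , drop = FALSE]
print(dim(row_mat))            # 1 3
```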

data.frame

Tables are created using the function data.frame(v1,...,vn), where v1 is the first vector and vn is the n-th vector. Rows usually represent individuals and columns represent covariates. Examples:

    ID=c("genO","genB","genZ")
    subj1=c(10,25,33)
    subj2=c(NA,34,15)
    oncogen=c(TRUE,TRUE,FALSE)
    loc=c(1,30,125)
    data1=data.frame(ID,subj1,subj2,oncogen,loc)
    data1
    #If you want to display the column names in a
    #data.frame, use the function names
    names(data1)
    #To show or extract a column use the operator $
    data1$subj2
    data1$oncogen
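A short sketch of a few further operations on the same table: checking its size and structure, selecting rows with a logical column, and summarising a column that contains a missing value. The object name onco is hypothetical; the functions used (dim(), str(), mean()) are part of the standard R distribution.

```r
ID      <- c("genO", "genB", "genZ")
subj1   <- c(10, 25, 33)
subj2   <- c(NA, 34, 15)
oncogen <- c(TRUE, TRUE, FALSE)
loc     <- c(1, 30, 125)
data1   <- data.frame(ID, subj1, subj2, oncogen, loc)

print(dim(data1))    # 3 5: three rows, five columns
str(data1)           # compact description of each column

# Keep only the rows where oncogen is TRUE
onco <- data1[data1$oncogen, ]
print(onco)

# Mean of a column with a missing value
print(mean(data1$subj2, na.rm = TRUE))   # 24.5
```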

User defined functions

R implements many statistical methods as functions. Functions are organized in packages (libraries). The packages loaded by default, such as base and stats, contain all the functions that we have used so far. Additional packages can be downloaded freely from the internet. We can also create our own functions for data analysis. The syntax for creating a new function is as follows:

    function_name=function(arg1,...,argn)
    {
        function body;
        return the value;
    }

Examples:

A function to compute $f(x) = x^2$:

    f=function(x)
    {
        x^2
    }
    f(2)
    f(c(1,2,3))

A function to compute $\sum_{i=1}^{n} i$:

    my_sum=function(n)
    {
        tmp=c(1:n)
        sum(tmp)
    }
    #The result should be identical to n(n+1)/2
    n=100
    my_sum(n)
    n*(n+1)/2
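Functions can also take several arguments, some with default values, and return their result explicitly with return(). The sketch below generalises the sum above to $\sum_{i=1}^{n} i^p$; the name power_sum and the argument p are hypothetical and do not appear in the slides.

```r
# Sum of the p-th powers of 1..n; p defaults to 1
power_sum <- function(n, p = 1) {
  i <- seq_len(n)
  return(sum(i^p))
}

print(power_sum(100))          # 5050, same as n(n+1)/2 for n = 100
print(power_sum(10, p = 2))    # 385
print(10 * 11 * 21 / 6)        # 385, closed form n(n+1)(2n+1)/6
```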

Graphs

R includes many functions to produce high quality graphics ready for publication. We can explore the graphical capabilities of the software through the included demos. Type demo() at the command prompt and the software will display a list of demos that we can execute, for example graphics, image, persp and plotmath:

    demo()
    demo(graphics)    #Some graphical capabilities
    demo(image)       #Working with images
    demo(persp)       #Perspective plots of surfaces
    demo(plotmath)    #Mathematical symbols in graphs

Plotting user defined functions

To plot user defined functions we can use the functions curve() or plot(). More details about the latter are given in the Plot function section. Examples:

    #1: Plotting f(x)=x^2, -4<=x<=4
    curve(x^2,-4,4)
    #2: Plotting f(x)=-x^3, -4<=x<=4
    curve(-x^3,-4,4)
    #3: Several plots in a 2x2 layout
    op=par(mfrow=c(2,2))
    curve(x^3-3*x, -2, 2)
    curve(x^2-2, add = TRUE, col = "violet")
    plot(cos, xlim =c(-pi,3*pi), n = 1001, col = "blue")
    chippy=function(x) sin(cos(x)*exp(-x/2))
    curve(chippy, -8, 7, n=2001)
    curve(chippy,-8, -5)
    par(op)    #restore the previous graphical settings
    #4: Standard normal density
    curve(1/sqrt(2*pi)*exp(-1/2*x^2),-3,3)

Plot function

One of the most useful functions for plotting is plot(). With this function we can plot points (scatterplots), lines (time series) or functions. Examples:

    y<-c(1,2,3,4,5)
    x<-c(1,4,9,16,25)
    plot(x,y,main="",ylab="", xlab="")
    plot(x,y,type="l")
    plot(dnorm, -3,3,col = "blue")
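A sketch that combines these pieces: a labelled scatterplot of the same x and y, with the curve y = sqrt(x) added on top. Writing to a temporary PDF file is an assumption made here so that the example also runs without a graphics window; the title, labels and the name out_file are illustrative.

```r
y <- c(1, 2, 3, 4, 5)
x <- c(1, 4, 9, 16, 25)

# Send the plot to a temporary PDF file instead of the screen
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

# Scatterplot with a title and axis labels
plot(x, y, main = "y = sqrt(x)", xlab = "x", ylab = "y")

# Add the function sqrt over the same range as a blue line
curve(sqrt, from = 1, to = 25, add = TRUE, col = "blue")

# Close the device so the file is written
invisible(dev.off())
print(file.exists(out_file))
```

Omitting the pdf() and dev.off() calls draws the same plot in the interactive graphics window.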

More functions for graphs

  barplot(): bar plots
  pie(): pie charts
  hist(): histograms
  boxplot(): box plots
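A minimal sketch that draws one plot of each kind in a 2x2 layout. The data are simulated and the names yield, env, counts and out_file are hypothetical; the output goes to a temporary PDF file only so that the example runs without a graphics window.

```r
# Simulated (hypothetical) data: 50 yields in two environments
set.seed(1)
yield  <- rnorm(50, mean = 5, sd = 1)
env    <- rep(c("E1", "E2"), each = 25)
counts <- table(env)

out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
op <- par(mfrow = c(2, 2))                # four panels on one page

barplot(counts, main = "barplot()")       # number of records per environment
pie(counts, main = "pie()")               # same counts as a pie chart
hist(yield, main = "hist()")              # distribution of yield
boxplot(yield ~ env, main = "boxplot()")  # yield by environment

par(op)                                   # restore the previous layout
invisible(dev.off())
```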

Importing data

There are several routines to import data into the R environment. For ASCII (plain text) files we can use:
  1. read.table
  2. read.csv

The function setwd sets the working directory, so that we do not have to write the entire path of a file each time we want to read it.

R can save and load objects in a native binary format, using the functions save and load.
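The sketch below walks through these functions on a small made-up table. It writes the files to R's temporary directory, so the file names example.csv and example.RData and the objects toy, dat and dat2 are assumptions for illustration only.

```r
# A small hypothetical table written to a temporary CSV file
tmp_dir  <- tempdir()
csv_file <- file.path(tmp_dir, "example.csv")
toy <- data.frame(gen = c("g1", "g2", "g3"), yield = c(4.2, 5.1, 3.8))
write.csv(toy, csv_file, row.names = FALSE)

# Read it back using the full path
dat <- read.csv(csv_file)

# Or set the working directory first and use only the file name
old_wd <- setwd(tmp_dir)
dat2 <- read.csv("example.csv")
setwd(old_wd)                    # go back to the previous directory

# Save an object in R's binary format, remove it, and load it again
rda_file <- file.path(tmp_dir, "example.RData")
save(toy, file = rda_file)
rm(toy)
loaded <- load(rda_file)         # returns the names of the restored objects
print(loaded)
```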

Examples

This data set is from the CIMMYT global wheat breeding program and comprises phenotypic, genotypic and pedigree information on n = 599 wheat lines. The data set was made publicly available by Crossa et al. (2010). Lines were evaluated for grain yield in four different environments. Each line was genotyped for p = 1,279 Diversity Array Technology (DArT) markers. At each marker two homozygous genotypes were possible, and these were coded as 0/1. Marker genotypes are given in the object X. Finally, a matrix A provides the relationships between lines computed from the pedigree.

    rm(list=ls())
    library(doBy)
    setwd("~/0. R-Intro/examples")
    #Load genotypic data
    load("pedigree_markers.rdata")
    #Load phenotypic data
    pheno=read.table(file="599_yield_raw-1.prn",header=TRUE)
    colnames(pheno)
    pheno=pheno[,c(2,5,6)]
    #Mean grain yield for each combination of environment and line
    out=summaryBy(GY~env+gen1,data=pheno,FUN=mean)
    Y=data.frame(yield=out$GY.mean,VAR=out$gen1,ENV=out$env)
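The grouped mean computed by summaryBy() can also be sketched with aggregate() from the standard R distribution. This is an alternative shown on a tiny made-up table, not the slides' method and not the wheat data; the column names only mirror those used above. Note that aggregate() names the result column GY, not GY.mean.

```r
# Hypothetical phenotypes: two environments, two lines, repeated records
pheno <- data.frame(
  env  = c(1, 1, 1, 1, 2, 2),
  gen1 = c("a", "a", "b", "b", "a", "a"),
  GY   = c(1, 3, 2, 4, 5, 7)
)

# Mean of GY for each combination of env and gen1
out <- aggregate(GY ~ env + gen1, data = pheno, FUN = mean)
print(out)

# Same layout as the data.frame Y built above
Y <- data.frame(yield = out$GY, VAR = out$gen1, ENV = out$env)
print(Y)
```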

Installing packages

Many users all over the world create software packages for R.

Figure 10: Packages.

A package is a collection of routines for performing particular calculations. These routines are usually made available to the user through well-documented functions. The function install.packages() installs software from the CRAN website, for example:

    install.packages("BLR")
    install.packages("BGLR")
    install.packages("doBy")

Once a package is installed, it must be loaded with the function library():

    library(doBy)
    library(BGLR)
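A common pattern is to install a package only when it is missing and then load it. The sketch below uses requireNamespace() for the check. As an assumption made so that the example runs without internet access, pkg is set to "stats", which ships with R; in practice it would be a CRAN package name such as "doBy".

```r
# Assumption: "stats" stands in for a CRAN package such as "doBy"
pkg <- "stats"

# Install only if the package is not already available
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load the package; character.only = TRUE because the name is in a variable
library(pkg, character.only = TRUE)
print(paste0("package:", pkg) %in% search())
```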

Questions?