An Introduction to Statistical Computing in R
- Georgia Harris
K2I Data Science Boot Camp - Day 1 AM Session
May 15, 2017

AM Session Outline
- Intro to R Basics
- Plotting in R
- Data Manipulation

R Basics
Here we will give a quick overview of the R language and the RStudio IDE. Our emphasis will be on exploring the most used features of R, especially those used in later courses. This won't cover all the details, but it will cover the most important parts.

Working with RStudio
Before beginning with R, let's orient ourselves with RStudio.

Our initial view of RStudio is:
[Screenshot of the initial RStudio window]

Go to: File -> New File -> R Script. This gives:
[Screenshot of RStudio after opening a new R script]

Try It Out
Type the following into the console:
    ?lm
    ??linear
    plot(1:20, 1:20)

There are several useful shortcut keys in RStudio. A few popular ones:
- Ctrl+Enter: when pressed in the editor, sends the current line to the console
- Ctrl+1, Ctrl+2: switch between editor and console
- Ctrl+Shift+Enter: run the entire script in the console
- Tab completion: this is perhaps the most used feature
For vim/emacs users, Tools -> Global Options -> Code -> Keybindings will give you your preferred bindings.

It's important to know our working directory.
- Given a file name, R will assume it is located in your current working directory.
- R will also save output to the working directory by default.
- It is important to set your working directory to the correct location or to specify full path names.

Try out the following in the console window:
    getwd()
    list.files()
To change your working directory go to: Session -> Set Working Directory -> Choose Directory
Alternatively:
    setwd("/path/to/directory")

Reading, Writing, Saving, and Loading
- Here we'll look at bringing data into R and getting it out.
- We'll also see how to save R objects and environments.

Reading In Data
- read.table
- read.csv
- read.fwf
Check out the options for each:
    ?read.table

Syntax
    ?read.table
    ?read.csv
    read.table("/path/to/your/file.ext", header=TRUE, sep=",", stringsAsFactors = FALSE)

Most Common Options
- sep tells how fields/variables are separated. Common values are "," (comma), " " (single space), and "\t" (tab escape character).
- stringsAsFactors tells whether to treat non-numeric values as factor/categorical variables.
- header tells whether the first line of the file has variable names.
- na.strings tells how missing values are encoded in the file.

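As a minimal sketch of how these options work together, the snippet below writes a small made-up file to a temporary location and reads it back. The file contents, the column names, and the use of the word "missing" as the missing-value code are assumptions invented for this illustration; they are not part of the boot camp files.

```r
# Hypothetical example data: semicolon-separated, header row,
# and the word "missing" standing in for a missing value
tmp <- tempfile(fileext = ".txt")
writeLines(c("name;score",
             "ann;3.5",
             "bob;missing",
             "cat;4.0"), tmp)

dat <- read.table(tmp,
                  header = TRUE,             # first line holds variable names
                  sep = ";",                 # fields are separated by semicolons
                  na.strings = "missing",    # this code marks a missing value
                  stringsAsFactors = FALSE)  # keep text columns as character

str(dat)     # name is character, score is numeric with one NA
unlink(tmp)  # clean up the temporary file
```

Because "missing" is declared in na.strings, the score column is read as numeric rather than as text.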
Standard Procedure
- Open the file in a text editor.
- Check items relevant to the options. Header? Separator type?
- For big files, Linux tools are helpful:
    head -n10 BigFile.txt > OpenMe

Try It Out
Let's read the ReadMeInX.txt files into R. Try it on your own before looking at the answers on the next slides.
Example workflow:
1. Set your working directory to the directory containing the files.
2. Examine the files in a text editor to check for common options (header, separator, etc.).

    # read.table's default separator is ok for this one
    set0 <- read.table("ReadMeIn0.txt", header=TRUE)
    # specify a new separator
    set1 <- read.table("ReadMeIn1.txt", header=TRUE, sep=',')
    # or use read.csv
    set1 <- read.csv("ReadMeIn1.txt", header=TRUE)

    # another change of separator
    set2 <- read.table("ReadMeIn2.txt", header=TRUE, sep=';')
    # check for missing values
    set3 <- read.table("ReadMeIn3.txt", header=FALSE, sep=',', na.strings = '')

Writing Data
- write.table
- write.csv

Syntax and Common Options
    ?write.csv
    write.csv(myrobject, file="/path/to/save/spot/file.csv", row.names=FALSE)
- Options are largely the same as their read counterparts.
- row.names = FALSE is helpful to avoid having 1,2,3,... as a variable/column.

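To see the effect of row.names, here is a small sketch that writes the same data frame twice and compares the columns that come back. The data frame df and the temporary file names are hypothetical and exist only for this illustration.

```r
# Hypothetical data frame used only for this illustration
df <- data.frame(id = c(10, 20, 30), label = c("a", "b", "c"))

with.names    <- tempfile(fileext = ".csv")
without.names <- tempfile(fileext = ".csv")

write.csv(df, file = with.names)                        # default: row.names = TRUE
write.csv(df, file = without.names, row.names = FALSE)  # row names left out

# Read both files back and compare the column names
cols.with    <- names(read.csv(with.names))
cols.without <- names(read.csv(without.names))
cols.with
cols.without

unlink(c(with.names, without.names))  # clean up the temporary files
```

The file written with the default setting gains an extra leading column holding 1, 2, 3, which read.csv names X; the second file round-trips with only id and label.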
Try It Out
Write out one of the files you imported. Try varying options like sep and quote.

Saving Objects
saveRDS/readRDS are used to save (a compressed version of) individual R objects.
    # save our data set
    saveRDS(set1, file="tstobj.rds")
    # get it back
    newtst <- readRDS("tstobj.rds")
    # can save any R object. Try a vector
    my.vector <- c(1,8,-100)
    saveRDS(my.vector, file="justavector.rds")

Saving Environment
- We can save all variables in the current R workspace with save.image.
- We can load a saved workspace with load.
- R will ask you to save your work when you exit.
    # save all our work
    save.image("allmywork.rdata")
    # reload it
    load("allmywork.rdata")
    # name given to the default save
    load(".RData")

The Basics of R
- Let's do a whirlwind tour of R: its syntax and data structures.
- This won't cover all the details, but it will cover the most important parts.

Basic R Data Types
    # numeric types: integer, double
    348
    # character
    "my string"
    # logical
    TRUE
    FALSE
    # arithmetic as you'd expect
    * 2^4
    # so too logical operators/comparison
    TRUE | FALSE
    1 + 7 != 7
    # Other logical operators:
    # &, |, !
    # <, >, <=, >=, ==, !=

Data Types Cont.
    # variable assignment is done with the <- operator
    my.number <- 483
    # the '.' above does nothing. we could have done:
    # mynumber <- 483
    # instead
    # it's an R-ism to use .'s in variable names.
    # typeof() tells us the type
    typeof(my.number)
    ## [1] "double"
    # we can convert between types
    my.int <- as.integer(my.number)
    typeof(my.int)
    ## [1] "integer"

R Data Structures - Vectors
    # the vector is the most important data structure
    # create it with c()
    my.vec <- c(1,2,67,-98)
    # get some properties
    str(my.vec)
    ## num [1:4] 1 2 67 -98
    length(my.vec)
    ## [1] 4
    # access elements with []
    my.vec[3]
    ## [1] 67
    my.vec[c(3,4)]
    ## [1]  67 -98
    # can do assignment too
    my.vec[5] <- ...

Vectors - Cont.
    # other ways to create vectors
    x <- 1:6
    y <- seq(7,12,by=1)
    # operations get recycled through the whole vector
    x + 1
    ## [1] 2 3 4 5 6 7
    x > 3
    ## [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE
    # can do component-wise operations between vectors
    x * y
    ## [1]  7 16 27 40 55 72
    x / y
    ## [1] 0.1428571 0.2500000 0.3333333 0.4000000 0.4545455 0.5000000
    y %/% x
    ## [1] 7 4 3 2 2 2

Try It Out
    # Try to guess what the following lines will do
    # Will it run at all? If so, what will it give?
    # Think about it and run to confirm
    7 -> w
    w <- z < TRUE 0
    15 & 3
    my.vec[2:4]
    my.vec[-2]
    my.vec[c(TRUE,FALSE,FALSE,TRUE,FALSE)]
    my.vec[ sum( c(TRUE,FALSE,FALSE,TRUE,TRUE) ) ] <- TRUE
    my.vec[3] <- "I'm a string"
    as.numeric(my.vec)
    x[x>3]
    x + c(1,2)

Matrices
    # matrices are 2d vectors.
    # create using matrix()
    my.matrix <- matrix(rnorm(20),nrow=4,ncol=5)
    # rnorm() draws 20 random samples from a N(0,1) distribution
    my.matrix
    ## (a 4 x 5 matrix of random draws; the values differ from run to run)
    # note that matrices are filled by column
    # get details
    dim(my.matrix)
    ## [1] 4 5
    nrow(my.matrix)
    ## [1] 4
    ncol(my.matrix)
    ## [1] 5

Matrices - Cont.
    # indexing is similar to vectors but with 2 dimensions
    # get the second row
    my.matrix[2,]
    # get the first and last columns of row three
    my.matrix[3,c(1,5)]
    # transposing is done with t()

Lists
    # lists are similar to vectors but can contain different types
    # create with list()
    my.list <- list("just a string", 44, my.matrix, c(TRUE,TRUE,FALSE))
    # access items via double brackets [[]]
    my.list[[4]]
    ## [1]  TRUE  TRUE FALSE
    # access multiple items
    my.list[1:2]
    ## [[1]]
    ## [1] "just a string"
    ##
    ## [[2]]
    ## [1] 44
    # list items can be named too
    named.list <- list(item1="my string", Item2=my.list)
    # access of a named item is via the dollar sign operator
    # [[]] also works
    c(named.list$item1,named.list[[1]])
    ## [1] "my string" "my string"

Putting it together
Let's practice with R data types by doing PCA on the iris data.
    data("iris")
    head(iris)
    str(iris)
Note that iris is a data.frame; this is simply a list.

PCA outline
- Save the numeric columns of iris as a matrix. (Hint: ?as.matrix)
- Center and scale the matrix. (Hint: ?scale)
- Compute the correlation matrix
    $R = \frac{1}{n-1} X^{T} X$
  Here $X$ is our (centered and scaled) data matrix, $n$ is the number of rows/observations in our data, and $X^{T}$ is the transpose of $X$. (Hint: t(X) is the transpose operator and A%*%B performs matrix multiplication on the matrices A and B)

PCA outline cont.
- Obtain the two leading eigenvectors of the correlation matrix $R$. Denote these as $v_1$, $v_2$. (Hint: ?eigen)
- Compute the first and second principal components via
    $z_1 = X v_1$
    $z_2 = X v_2$
- Produce a scatter plot of $z_1$ vs $z_2$. (Hint: ?plot)
Take a few moments to try it yourself before looking at the answers on the next slides.

PCA from scratch
    data("iris")
    # get the numeric portions of the list and make a matrix
    X <- as.matrix(iris[1:4])
    # center and scale
    X <- scale(X,center = TRUE,scale=TRUE)
    # get the number of rows
    n <- nrow(X)
    # compute the correlation matrix
    R <- (1/(n-1))*t(X)%*%X
    # perform the eigendecomposition
    Reig <- eigen(R)
    # get the eigenvectors
    Reig.vecs <- Reig$vectors
    # create the principal components
    pc1 <- X%*%Reig.vecs[,1]
    pc2 <- X%*%Reig.vecs[,2]

PCA from scratch cont.
    # compare to R's PCA function
    their.pcs <- prcomp(iris[1:4],center = TRUE,scale. = TRUE)
    head(their.pcs$x[,1:2])
    ## (first six rows of PC1 and PC2; numeric output not preserved in this transcript)
    # our result
    head(cbind(pc1,pc2))
    ## (first six rows of pc1 and pc2; numeric output not preserved in this transcript)

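Since the printed numbers are hard to compare by eye, one possible way to check the from-scratch result against prcomp programmatically is sketched below. The variable names ours and theirs and the chosen checks are assumptions of this sketch, not part of the original slides. The comparison uses absolute values because an eigenvector is only determined up to its sign, so individual components may come out flipped.

```r
data("iris")

# Repeat the from-scratch steps
X    <- scale(as.matrix(iris[1:4]), center = TRUE, scale = TRUE)
n    <- nrow(X)
R    <- (1/(n-1)) * t(X) %*% X
Reig <- eigen(R)
ours <- X %*% Reig$vectors[, 1:2]

# R's built-in PCA
theirs <- prcomp(iris[1:4], center = TRUE, scale. = TRUE)

# R should match the correlation matrix reported by cor()
all.equal(R, cor(iris[1:4]), check.attributes = FALSE)

# The eigenvalues should match the squared standard deviations from prcomp
all.equal(Reig$values, theirs$sdev^2)

# The components should agree up to sign
max(abs(abs(ours) - abs(theirs$x[, 1:2])))
```

The last line reports the largest discrepancy between the two sets of components; a value close to zero indicates agreement.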
PCA from scratch cont.
    plot(pc1,pc2,col=iris$Species)
[Figure: scatter plot of pc2 against pc1, with points colored by species]

Factors
    # Factors are like vectors, but with predefined allowed values called levels
    # Factors are used to represent categorical variables in R
    # create a factor
    factor1 <- factor(c('Good','Bad','Ugly'))
    # find its levels
    levels(factor1)
    ## [1] "Bad"  "Good" "Ugly"
    # below gives a warning, but not an error
    factor1[4] <- 17
    ## Warning in `[<-.factor`(`*tmp*`, 4, value = 17): invalid factor level, NA generated
    # see what happened
    factor1
    ## [1] Good Bad  Ugly <NA>
    ## Levels: Bad Good Ugly
    factor1[4] <- 'Bad'
    # get the breakdown
    table(factor1)
    ## factor1
    ##  Bad Good Ugly
    ##    2    1    1

- Note that in one of our previous examples, R filled in the improper factor value with NA.
- NA is R's way of specifying missing data.
- Note that missing data is handled differently than ordinary values, as we will see as we go along.

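A brief sketch of what "handled differently" can look like in practice follows. The vector v is a made-up example. The main point is that NA cannot be detected with ==, so is.na() is used instead.

```r
# Hypothetical vector with one missing entry
v <- c(4, NA, 10)

is.na(v)        # flags which entries are missing
v == NA         # comparing with NA gives NA for every entry, not TRUE/FALSE
v > 5           # the comparison is unknown where the value is missing

n.missing <- sum(is.na(v))   # count the missing entries
observed  <- v[!is.na(v)]    # keep only the observed values
n.missing
observed
```

Here is.na(v) is TRUE only in the second position, while v == NA is NA everywhere because a comparison against an unknown value is itself unknown.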
Questions
What will the following lines of code do?
    my.matrix[3:4,1:2] <- c(4,5)
    my.matrix[4,5] <- 'string'
    mf.strings <- c('f','f','m','f')
    factor2 <- as.factor(mf.strings)
    c(factor1, factor2)
    factor1 == 'Ugly'
    my.list[[3]][2,]
    sum(c(1,2,3,NA))
    sum(c(1,2,3,NA),na.rm = TRUE)

Data Frames
The data.frame is how R represents data sets. They are simply lists, with a few additional restrictions.
    # create your own
    my.df <- data.frame(
      age = c(45,27,19,59,71,13,5),
      gender = factor(c('M','M','M','F','M','F','F'))
    )
    str(my.df)
    ## 'data.frame': 7 obs. of 2 variables:
    ##  $ age   : num 45 27 19 59 71 13 5
    ##  $ gender: Factor w/ 2 levels "F","M": 2 2 2 1 2 1 1

Data Frames - Cont.
Individual variables can be accessed via the $ operator.
    my.df$age
    ## [1] 45 27 19 59 71 13  5
    summary(my.df$age)
    ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    ##    5.00   16.00   27.00   34.14   52.00   71.00
    table(my.df$gender)
    ##
    ## F M
    ## 3 4
    # data frames are really just lists
    my.df[[2]]
    ## [1] M M M F M F F
    ## Levels: F M

Data Frames - Cont.
    # data.frames can be subsetted like matrices
    my.df[1:3,c("age")]
    ## [1] 45 27 19
    # logical subsetting is especially useful for data.frames
    # get ages over 40
    age.logic <- my.df$age > 40
    # take a subset of these rows
    my.df[age.logic,]
    ##   age gender
    ## 1  45      M
    ## 4  59      F
    ## 5  71      M
    # create a new variable age.sq
    my.df$age.sq <- my.df$age^2

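Logical subsetting extends naturally to several conditions at once. The sketch below rebuilds the small my.df data frame so that it runs on its own, then combines two conditions with the & operator. The name older.men is a hypothetical choice for this example.

```r
# Rebuild the example data frame so this snippet is self-contained
my.df <- data.frame(
  age = c(45, 27, 19, 59, 71, 13, 5),
  gender = factor(c('M', 'M', 'M', 'F', 'M', 'F', 'F'))
)

# Combine two logical vectors element by element with &
older.men <- my.df[my.df$age > 40 & my.df$gender == 'M', ]
older.men

# which() turns a logical vector into the matching row numbers
over.40.rows <- which(my.df$age > 40)
over.40.rows
```

The trailing comma inside the square brackets matters: it asks for the selected rows together with all columns.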
Try It Out
Let's use R's internal iris data set to practice with data frames.
    my.iris <- iris
    my.iris
1. Create two new variables, Length.Sum and Width.Sum, which are the sums of the Sepal and Petal lengths and widths respectively.
2. Use subsetting and R's mean function to find the average Length.Sum of the setosa species.

    my.iris$Length.Sum = my.iris$Sepal.Length + my.iris$Petal.Length
    my.iris$Width.Sum = my.iris$Sepal.Width + my.iris$Petal.Width
    setosa.inds <- my.iris$Species == 'setosa'
    mean(my.iris[setosa.inds,]$Length.Sum)
    ## [1] 6.468

Control Structures
R has all the typical control structures:
- if-else statements
- for loops
- while loops

Syntax
    if(logical_expression){
      execute_code
    } else{
      execute_other_code
    }

    for(value in sequence){
      work_with_value
    }

    while(expression_is_true){
      execute_code
    }

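The templates above can be filled in with concrete values. The following is a small illustrative sketch; the variable names and numbers are made up for the example.

```r
# if-else: branch on a logical expression
temperature <- 31   # hypothetical value
if (temperature > 30) {
  label <- "hot"
} else {
  label <- "mild"
}

# for: add up the squares of 1 through 5
total <- 0
for (value in 1:5) {
  total <- total + value^2
}

# while: keep halving until the value drops below 1, counting the steps
remaining <- 20
steps <- 0
while (remaining >= 1) {
  remaining <- remaining / 2
  steps <- steps + 1
}

label
total
steps
```

Note that else must appear on the same line as the closing brace of the if block when the statement is typed at the top level of the console.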
Functions
Defining functions in R is easy.
    # use the function keyword with assignment <-
    my.mean <- function(input.vector){
      sum = 0
      for(val in input.vector) {
        sum = sum + val
      }
      # the value of the last expression gets returned
      return.me <- sum / length(input.vector)
    }
    my.mean(1:10)

Functions cont.
    my.mean <- function(input.vector){
      sum = 0
      for(val in input.vector) {
        sum = sum + val
      }
      # returns 1 now
      return.me <- sum / length(input.vector)
      1
    }
    my.mean(1:10)
    ## [1] 1

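To make the "last expression is returned" rule concrete, here is a short sketch comparing three ways of ending a function. The function names mean.last, mean.explicit, and mean.assign are hypothetical and chosen only for this illustration.

```r
# The value of the last evaluated expression is returned
mean.last <- function(v) {
  sum(v) / length(v)
}

# return() makes the returned value explicit
mean.explicit <- function(v) {
  return(sum(v) / length(v))
}

# Ending on an assignment still returns the value, but invisibly
mean.assign <- function(v) {
  result <- sum(v) / length(v)
}

mean.last(1:10)
mean.explicit(1:10)
mean.assign(1:10)          # prints nothing at the console
print(mean.assign(1:10))   # the value is still there when asked for
```

All three functions compute the same number; they differ only in whether the result is printed automatically when the function is called at the console.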
Try It Out
- Create a function my.summary which takes a vector, x, as input, calculates the mean, standard deviation, max, and min of x, and returns these in a list.
- Try out R's internal functions mean, sd, max, and min.

    my.summary <- function(x) {
      list(
        mean = mean(x),
        sd = sd(x),
        max = max(x),
        min = min(x)
      )
    }

Try It Out cont.
Loop through the variables in my.iris, evaluating my.summary on each (provided the variable is numeric) and printing the maximum.
Hint: Use is.numeric to test each variable before applying my.summary.

    for(var in my.iris) {
      if(is.numeric(var)){
        tmp <- my.summary(var)
        print(tmp$max)
      }
    }
Bioinformatics Workshop - NM-AIST Day 2 Introduction to R Thomas Girke July 24, 212 Bioinformatics Workshop - NM-AIST Slide 1/21 Introduction Look and Feel of the R Environment R Library Depositories Installation
More informationIntro to R. Fall Fall 2017 CS130 - Intro to R 1
Intro to R Fall 2017 Fall 2017 CS130 - Intro to R 1 Intro to R R is a language and environment that allows: Data management Graphs and tables Statistical analyses You will need: some basic statistics We
More informationData Mining - Data. Dr. Jean-Michel RICHER Dr. Jean-Michel RICHER Data Mining - Data 1 / 47
Data Mining - Data Dr. Jean-Michel RICHER 2018 jean-michel.richer@univ-angers.fr Dr. Jean-Michel RICHER Data Mining - Data 1 / 47 Outline 1. Introduction 2. Data preprocessing 3. CPA with R 4. Exercise
More informationExperimental epidemiology analyses with R and R commander. Lars T. Fadnes Centre for International Health University of Bergen
Experimental epidemiology analyses with R and R commander Lars T. Fadnes Centre for International Health University of Bergen 1 Click to add an outline 2 How to install R commander? - install.packages("rcmdr",
More informationExploring and Understanding Data Using R.
Exploring and Understanding Data Using R. Loading the data into an R data frame: variable
More informationGetting started with simulating data in R: some helpful functions and how to use them Ariel Muldoon August 28, 2018
Getting started with simulating data in R: some helpful functions and how to use them Ariel Muldoon August 28, 2018 Contents Overview 2 Generating random numbers 2 rnorm() to generate random numbers from
More information2013 Eric Pitman Summer Workshop in Computational Science....an introduction to R, statistics, programming, and getting to know datasets
2013 Eric Pitman Summer Workshop in Computational Science...an introduction to R, statistics, programming, and getting to know datasets Data is Everywhere For example: Science/engineering/medicine Environmental
More informationLab of COMP 406. MATLAB: Quick Start. Lab tutor : Gene Yu Zhao Mailbox: or Lab 1: 11th Sep, 2013
Lab of COMP 406 MATLAB: Quick Start Lab tutor : Gene Yu Zhao Mailbox: csyuzhao@comp.polyu.edu.hk or genexinvivian@gmail.com Lab 1: 11th Sep, 2013 1 Where is Matlab? Find the Matlab under the folder 1.
More informationLab 1: Introduction, Plotting, Data manipulation
Linear Statistical Models, R-tutorial Fall 2009 Lab 1: Introduction, Plotting, Data manipulation If you have never used Splus or R before, check out these texts and help pages; http://cran.r-project.org/doc/manuals/r-intro.html,
More informationStat 579: Objects in R Vectors
Stat 579: Objects in R Vectors Ranjan Maitra 2220 Snedecor Hall Department of Statistics Iowa State University. Phone: 515-294-7757 maitra@iastate.edu, 1/23 Logical Vectors I R allows manipulation of logical
More informationModule 1: Introduction RStudio
Module 1: Introduction RStudio Contents Page(s) Installing R and RStudio Software for Social Network Analysis 1-2 Introduction to R Language/ Syntax 3 Welcome to RStudio 4-14 A. The 4 Panes 5 B. Calculator
More informationDesktop Command window
Chapter 1 Matlab Overview EGR1302 Desktop Command window Current Directory window Tb Tabs to toggle between Current Directory & Workspace Windows Command History window 1 Desktop Default appearance Command
More informationthe star lab introduction to R Day 2 Open R and RWinEdt should follow: we ll need that today.
R-WinEdt Open R and RWinEdt should follow: we ll need that today. Cleaning the memory At any one time, R is storing objects in its memory. The fact that everything is an object in R is generally a good
More informationIntroduction to R Jason Huff, QB3 CGRL UC Berkeley April 15, 2016
Introduction to R Jason Huff, QB3 CGRL UC Berkeley April 15, 2016 Installing R R is constantly updated and you should download a recent version; the version when this workshop was written was 3.2.4 I also
More informationReading and writing data
An introduction to WS 2017/2018 Reading and writing data Dr. Noémie Becker Dr. Sonja Grath Special thanks to: Prof. Dr. Martin Hutzenthaler and Dr. Benedikt Holtmann for significant contributions to course
More informationIntroduction to R, Github and Gitlab
Introduction to R, Github and Gitlab 27/11/2018 Pierpaolo Maisano Delser mail: maisanop@tcd.ie ; pm604@cam.ac.uk Outline: Why R? What can R do? Basic commands and operations Data analysis in R Github and
More information6 Subscripting. 6.1 Basics of Subscripting. 6.2 Numeric Subscripts. 6.3 Character Subscripts
6 Subscripting 6.1 Basics of Subscripting For objects that contain more than one element (vectors, matrices, arrays, data frames, and lists), subscripting is used to access some or all of those elements.
More informationData Import and Export
Data Import and Export Eugen Buehler October 17, 2018 Importing Data to R from a file CSV (comma separated value) tab delimited files Excel formats (xls, xlsx) SPSS/SAS/Stata RStudio will tell you if you
More informationAuthor: Leonore Findsen, Qi Wang, Sarah H. Sellke, Jeremy Troisi
0. Downloading Data from the Book Website 1. Go to http://bcs.whfreeman.com/ips8e 2. Click on Data Sets 3. Click on Data Sets: PC Text 4. Click on Click here to download. 5. Right Click PC Text and choose
More informationIntroduction to scientific programming in R
Introduction to scientific programming in R John M. Drake & Pejman Rohani 1 Introduction This course will use the R language programming environment for computer modeling. The purpose of this exercise
More informationComputing With R Handout 1
Computing With R Handout 1 Getting Into R To access the R language (free software), go to a computing lab that has R installed, or a computer on which you have downloaded R from one of the distribution
More informationMATLAB: The greatest thing ever. Why is MATLAB so great? Nobody s perfect, not even MATLAB. Prof. Dionne Aleman. Excellent matrix/vector handling
MATLAB: The greatest thing ever Prof. Dionne Aleman MIE250: Fundamentals of object-oriented programming University of Toronto MIE250: Fundamentals of object-oriented programming (Aleman) MATLAB 1 / 1 Why
More informationStatistics 13, Lab 1. Getting Started. The Mac. Launching RStudio and loading data
Statistics 13, Lab 1 Getting Started This first lab session is nothing more than an introduction: We will help you navigate the Statistics Department s (all Mac) computing facility and we will get you
More informationData types and structures
An introduc+on to Data types and structures Noémie Becker & Benedikt Holtmann Winter Semester 16/17 Course outline Day 3 Review GeFng started with R Crea+ng Objects Data types in R Data structures in R
More informationIntroduction to R Benedikt Brors Dept. Intelligent Bioinformatics Systems German Cancer Research Center
Introduction to R Benedikt Brors Dept. Intelligent Bioinformatics Systems German Cancer Research Center What is R? R is a statistical computing environment with graphics capabilites It is fully scriptable
More informationWhat is KNIME? workflows nodes standard data mining, data analysis data manipulation
KNIME TUTORIAL What is KNIME? KNIME = Konstanz Information Miner Developed at University of Konstanz in Germany Desktop version available free of charge (Open Source) Modular platform for building and
More informationIntroduction to the R Language
Introduction to the R Language Data Types and Basic Operations Starting Up Windows: Double-click on R Mac OS X: Click on R Unix: Type R Objects R has five basic or atomic classes of objects: character
More informationReferences R's single biggest strenght is it online community. There are tons of free tutorials on R.
Introduction to R Syllabus Instructor Grant Cavanaugh Department of Agricultural Economics University of Kentucky E-mail: gcavanugh@uky.edu Course description Introduction to R is a short course intended
More informationAdvanced Econometric Methods EMET3011/8014
Advanced Econometric Methods EMET3011/8014 Lecture 2 John Stachurski Semester 1, 2011 Announcements Missed first lecture? See www.johnstachurski.net/emet Weekly download of course notes First computer
More informationChapter 7. The Data Frame
Chapter 7. The Data Frame The R equivalent of the spreadsheet. I. Introduction Most analytical work involves importing data from outside of R and carrying out various manipulations, tests, and visualizations.
More informationIntro to Programming. Unit 7. What is Programming? What is Programming? Intro to Programming
Intro to Programming Unit 7 Intro to Programming 1 What is Programming? 1. Programming Languages 2. Markup vs. Programming 1. Introduction 2. Print Statement 3. Strings 4. Types and Values 5. Math Externals
More information