Statistical Programming with R

1 Statistical Programming with R Dan Mazur, McGill HPC daniel.mazur@mcgill.ca guillimin@calculquebec.ca May 14,

2 Outline R syntax and basic data types Documentation and help pages Reading in data Data exploration Programming in R Probability, distributions, and pseudorandom numbers Linear Regression Confidence Intervals Statistical Hypothesis Testing Parallel Programming in R

3 What is R R is... A programming language Designed for interactive, statistical programming and plotting Free software. You may... Run the software for any purpose Inspect the source code Share copies with others Modify and improve R Share your improvements with the whole community

4 Exercise 0: Login and Setup
Login to Guillimin
$ ssh -X class##@guillimin.hpc.mcgill.ca
Load software modules
$ module add R/3.1.2
$ module add Rstudio (optional)
Copy workshop files
$ cp -R /software/workshop/r/* ~/. (on Guillimin)
or, use git
$ git clone

5 Exercise 1: Invitation to R
To launch R, type 'R' in the terminal
Let's skip hello world and start analyzing data
In the prompt, we will load a data table, get a summary of each column, and make scatterplots of every pair of variables
> library(datasets)    # Import a library full of data sets
> ?airquality          # Access the help tool for one of the datasets
> data(airquality)     # Load one of the datasets
> summary(airquality)  # Print summary of dataset
> pairs(airquality)    # Make scatterplots of all pairs of variables
In a few commands, we just learned a lot about our data!

6 The R Assignment Operator
In R, the assignment operator is '<-' (it looks like a left-pointing arrow)
> a <- 3
In most languages, assignment is done using the '=' symbol. R supports both:
> a = 3
> a <- 3
Google's R style guide forbids '=' for assignment
Consider this ambiguity:
f(a = 3) - Call f() with the named argument a set to 3
f(a <- 3) - Assign 3 to a, then call f() with its first positional argument set to the value of a
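
The difference between the two call forms can be checked directly. The following is a minimal sketch in base R; the function square() and the variable n are hypothetical names chosen for illustration and do not come from the slides.

```r
# Hypothetical helper, used only for illustration
square <- function(n) n^2

r1 <- square(n = 3)    # named argument: n is bound inside the call only
r2 <- square(n <- 4)   # assignment: creates n in the calling environment,
                       # then passes its value as the first positional argument

print(r1)  # 9
print(r2)  # 16
print(n)   # 4, left behind by the second call
```

The second call has a side effect on the caller's workspace, which is one reason to keep '<-' for assignment and '=' for naming arguments.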

7 R data types Atomic data types (similar to other languages) character, numeric, integer, complex, logical (TRUE and FALSE), factors Lists Vectors Created with the c() function: > a <- c(1,2,3,4,5) Character Strings Use 'single quotes' or double quotes Data Frames Collections of vectors (like a database table or spreadsheet) Tables Frequency tables Arrays and Matrices

8 Factors
Factors hold non-numeric (categorical) data
In the air quality data, Month is a label, not numeric data
By default, the summary() function will compute quantiles for numeric data, but frequencies for factors
> summary(airquality$Month)   # numeric: Min., 1st Qu., Median, Mean, 3rd Qu., Max.
> airquality$Month <- as.factor(airquality$Month)
> summary(airquality$Month)   # factor: number of rows for each month
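
As a sketch of the behaviour described above, the snippet below compares the two summaries on a copy of the built-in airquality data. The name aq is an assumption introduced here so that the original data set is left untouched.

```r
# Work on a copy (aq is a hypothetical name) of the built-in data set
aq <- datasets::airquality

numeric_summary <- summary(aq$Month)   # quantiles and mean of the month numbers
aq$Month <- as.factor(aq$Month)        # treat Month as a label
factor_summary <- summary(aq$Month)    # one count per month

print(numeric_summary)
print(factor_summary)
```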

9 Vectors
Vector - an ordered sequence of data elements of a single type
Created with the c() function: a <- c(1,2,3,4,5)
Summary functions: sort(), min(), max(), mean(), var(), median(), quantile(), sum()
The operators +, -, *, / and functions such as sqrt(), log(), exp() work element-wise
> c(1,2,3,4)/5
[1] 0.2 0.4 0.6 0.8

10 Lists
List - an ordered sequence of data elements of any type
A generalized vector
a <- list(2, "a", TRUE, c(3,4))

11 Indexing in R
Vectors, lists, and arrays are indexed using square brackets
> a <- c(2,4,6)
> a[2]
[1] 4
We can give list elements names, and index by names
> a <- list(alice=2, bob=6, carol=12)
> a[['carol']]
[1] 12
> a$bob
[1] 6
Use double bracket [[ ]] notation to select individual items from a list, and single bracket [ ] notation to select a sublist of elements
Use the dollar sign $ notation to select named members
You can index lists with logicals
> a[ a > 3 ]
$bob
[1] 6
$carol
[1] 12
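
The three indexing forms return different kinds of objects, which the sketch below makes explicit. The variable names are illustrative, and unlist() is used here (an addition to the slide's code) so that the logical index is an ordinary logical vector.

```r
a <- list(alice = 2, bob = 6, carol = 12)

one_item <- a[["carol"]]    # the element itself: a number
sub_list <- a["carol"]      # a list of length 1 that contains the element
by_name  <- a$bob           # shorthand for a[["bob"]]
big      <- a[unlist(a) > 3]  # sublist selected by a logical vector

print(class(one_item))  # "numeric"
print(class(sub_list))  # "list"
print(names(big))       # "bob" "carol"
```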

12 Data Frames
Data frame - Collection of vectors of different types
> myframe <- data.frame(col1=c(1,2,3,4), col2=c(2,4,6,8), col3=factor(c("a","b","c","d")))
> myframe
  col1 col2 col3
1    1    2    a
2    2    4    b
3    3    6    c
4    4    8    d
Our airquality data was imported as a data frame

13 Data Tables
Frequency tables
> data(Titanic)
> class(Titanic)
[1] "table"
> Titanic
, , Age = Child, Survived = No
      Sex
Class  Male Female
  1st     0      0
  2nd     0      0
  3rd    35     17
  Crew    0      0
, , Age = Adult, Survived = No

14 Arrays and Matrices
Array - A multiply subscripted collection of data entries of a single type
Matrix - A 2-dimensional array
Vector - A 1-dimensional array
# Generate a 2 by 2 by 5 array.
> x <- array(1:20, dim=c(2,2,5))
> x[,,2]
     [,1] [,2]
[1,]    5    7
[2,]    6    8
> x[,2,]
     [,1] [,2] [,3] [,4] [,5]
[1,]    3    7   11   15   19
[2,]    4    8   12   16   20
# Generate a 3 by 2 array (matrix).
> B = matrix(
+   c(2, 4, 3, 1, 5, 7),
+   nrow=3,
+   ncol=2)
> B   # B has 3 rows and 2 columns
     [,1] [,2]
[1,]    2    1
[2,]    4    5
[3,]    3    7

15 Indexing Arrays
Indices for each dimension are separated by commas
An absent index is treated as 'everything'
> x[, 2, 2:4 ]
     [,1] [,2] [,3]
[1,]    7   11   15
[2,]    8   12   16

16 Indexing with Logicals
I want to set all elements of a vector that are less than 5 to the value 0. Which line of code achieves this?
x <- c(7, 2, 9, 5, 2, 7, 8, 1, 0)
A) x[x == 0] < 5
B) x[x < 5] <- 0
C) x[x == 5] <- 0
D) x[x < 0 ] <- 5
E) None of these
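
Logical indexing combined with assignment can be tried out on the vector from this question. The sketch below spells the comparison out as a separate step; the name mask is an assumption introduced for readability.

```r
x <- c(7, 2, 9, 5, 2, 7, 8, 1, 0)

mask <- x < 5     # logical vector: TRUE where the element is less than 5
x[mask] <- 0      # assign 0 to exactly those positions

print(x)  # 7 0 9 5 0 7 8 0 0
```

The element equal to 5 is left alone because the comparison is strict.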


18 Appending Data
cbind() and rbind() can append data to a vector, matrix, or data frame
> a <- data.frame(one=c( 0, 1, 2), two=c("a","a","b"))
> b <- data.frame(one=c(10,11,12), two=c("c","c","d"))
> a
  one two
1   0   a
2   1   a
3   2   b
> b
  one two
1  10   c
2  11   c
3  12   d
> c <- rbind(a,b)
> c
  one two
1   0   a
2   1   a
3   2   b
4  10   c
5  11   c
6  12   d
> d <- cbind(a,b)
> d
  one two one two
1   0   a  10   c
2   1   a  11   c
3   2   b  12   d

19 Documentation and Help Pages
The built-in help system can be accessed using the syntax '?function' (try: ?mean)
Official documentation is available on the R-project web site
There are many tutorials and tips on the web

20 What type of data is a?
> a <- rbind( c(2,4,6), c(3, 2, 7) )
Hint: look at '?rbind'
Rule: Do not run this command
A) A vector of length 2
B) A vector of length 6
C) A 3 by 2 matrix
D) A 2 by 3 matrix
E) Something else

22 Exercise 2: Reading in Data Use less (or any text viewer) to view the contents of titanic.csv $ less titanic.csv use 'q' to exit less Use the read.csv() function to load the titanic data set > titanic <- read.csv('titanic.csv', sep=',') Use ls() to see what variables are defined > ls() [1] "titanic" Use some of the commands we've already seen to explore the data

23 Reading in Data
There are many tools for reading data from many sources into R
read.csv(), read.table(), scan()
RMySQL, hdf5, netcdf, bmp, jpeg, png
Consider data in a table on a website. This code loads a data.frame from a website designed for human eyes:
library('XML')
url <- "..."
zz = readHTMLTable(url)
if(any(i <- sapply(zz, ncol) == 14)) {
  zz = zz[[which(i)[1]]]
}

24 Print()
The print() function can be used to produce output or to inspect variables
Each data type in R can implement its own output under print()
> print("hello world")
[1] "hello world"
> print(c(2,3,4))
[1] 2 3 4
> print(airquality)
  Ozone Solar.R Wind Temp Month Day
  ...
head() produces the first 6 (default) lines, tail() produces the last 6

25 Exercise 3: Data Exploration Find commands to perform the following tasks: Print only the first two rows of titanic Print only the last two rows of titanic Compute the total number of survivors (hint: use sum()) Compute the mean age of all passengers (hint: use mean() with the na.rm option) How many passengers have unknown ages? (hint: Missing values are labeled as NA in R. Use the is.na() function to find them)
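
The functions named in the hints can be tried first on the built-in airquality data, which also contains missing values. This is a sketch on a stand-in data set, not a solution for titanic.csv, whose column names are not shown here; aq and the other variable names are assumptions.

```r
aq <- datasets::airquality   # stand-in data set with NA values

first_two  <- head(aq, n = 2)               # first two rows
last_two   <- tail(aq, n = 2)               # last two rows
n_missing  <- sum(is.na(aq$Ozone))          # TRUE counts as 1, so this counts NAs
mean_ozone <- mean(aq$Ozone, na.rm = TRUE)  # NAs dropped before averaging

print(first_two)
print(last_two)
print(n_missing)
print(mean_ozone)
```

Without na.rm = TRUE, mean() returns NA as soon as a single value is missing.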

26 Simple Plots R is capable of making publication-quality plots and charts Disclaimer: This workshop will only cover the basic plotting commands Great for quickly peeking at your data Making publication-quality graphs is outside our scope Other tutorials will teach you how to make plots for publication

27 Exercise 4: Simple Plots
Load the airquality data set
> library(datasets)
> data(airquality)
Try the following plot commands and understand what they are doing (use ?plot):
> pairs(airquality)
> hist(airquality$Temp)
> boxplot(airquality)
> stripchart(airquality)
> plot(airquality$Solar.R, airquality$Ozone)

28 Exercise 5: Subsets In the subset of airquality data where Ozone > 31 and Temp > 90, what is the mean value of Solar.R? A) B) C) D) E) None of these

30 Exercise 5: Solution
# Load the airquality data set
library(datasets)
data(airquality)
# Compute the filter vector for the subset
filter.vector <- airquality[,"Ozone"] > 31 & airquality[,"Temp"] > 90
# Compute the mean of Solar.R for the subset
mean(airquality[filter.vector, "Solar.R"], na.rm=TRUE)
# Alternatively, we can use the subset() function
mean(subset(airquality, Ozone > 31 & Temp > 90)$Solar.R)

31 Programming in R The basic programming features in R are For loops While loops Switch statements functions anonymous functions

32 Using Loops in R
Use for() to loop over elements of lists
Use while() to loop until a condition is satisfied
> words = c("these", "are", "the", "words")
> for (word in words) { print(word) }
[1] "these"
[1] "are"
[1] "the"
[1] "words"
> for (i in 1:3) { print(i) }
[1] 1
[1] 2
[1] 3
> i <- 0
> while(i < 2) { i <- i+1; print(i) }
[1] 1
[1] 2

33 Not Using Loops in R Loops in R tend to be slow It is preferred to avoid them when possible Use vectorized operations as much as possible (more later) Reduce() function iterates a function over a list or vector R provides functions for applying a function repeatedly to sets of data apply() - apply function to sections of array, return array sapply() - apply function to elements of a list, returns a vector, matrix, or list lapply() - apply function to elements of a list, return list Use '?apply', or '?Reduce' for more details
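
A short sketch of the four functions mentioned above, applied to small inputs so that the results are easy to check by hand. The variable names are illustrative only.

```r
m <- matrix(1:6, nrow = 2)   # rows are (1, 3, 5) and (2, 4, 6)

row_sums     <- apply(m, 1, sum)                 # apply over rows (margin 1)
squares_list <- lapply(1:3, function(v) v^2)     # always returns a list
squares_vec  <- sapply(1:3, function(v) v^2)     # simplifies to a vector here
total        <- Reduce(`+`, 1:4)                 # ((1 + 2) + 3) + 4

print(row_sums)      # 9 12
print(squares_vec)   # 1 4 9
print(total)         # 10
```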

34 Functions in R
In R, functions are objects that get assigned variable names
foo <- function(bar) {
  x <- rnorm(bar)
  mean(x)
}
> foo(1:10)   # prints the mean of 10 random draws

35 Default values for arguments
You may specify default values for optional arguments in functions
foo <- function(bar, message=FALSE) {
  x <- rnorm(bar)
  if (message) {print("Computing...")}
  mean(x)
}
> foo(1:10)                 # prints only the mean
> foo(1:10, TRUE)           # prints "Computing..." and then the mean
> foo(1:10, message=TRUE)   # same, with the argument passed by name
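
Because the slide's function draws random numbers, its output changes from run to run. The variant below is a deterministic sketch of the same idea; the function name describe and the argument name verbose are hypothetical.

```r
# Hypothetical deterministic variant of the slide's function
describe <- function(x, verbose = FALSE) {
  if (verbose) print("Computing...")
  mean(x)
}

quiet   <- describe(1:10)                  # default used: no message
by_pos  <- describe(1:10, TRUE)            # optional argument given by position
by_name <- describe(1:10, verbose = TRUE)  # optional argument given by name

print(quiet)  # 5.5
```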

36 Anonymous Functions in R
You can use a function without giving it a name
> (function(x,y) x+y)(5,6)
[1] 11
Useful if a small function is only needed on one line
This is most frequently used with the apply() set of functions
> lapply(c(2,4,6), function(x) x**2)
[[1]]
[1] 4
[[2]]
[1] 16
[[3]]
[1] 36

37 What is the output?
What is the output of this R code?
foo <- function(x) {
  bar <- function(y) { y+z }
  z <- 5
  x + bar(x)
}
z <- 10
foo(2)
A) 0
B) 9
C) 14
D) 21
E) None of these

39 Lexical Scoping
foo <- function(x) {
  bar <- function(y) { y+z }
  z <- 5
  x + bar(x)
}
z <- 10
foo(2)
Which z value will R use inside bar()? 5 or 10?
The answer is z = 5. The output will be 9.
Lexical scoping specifies that a free variable is looked up in the environment where the function was defined (here, the body of foo)
The search continues recursively through the enclosing environments of the definition
If bar() were defined at the top level instead, it would use the global z = 10, and the output would be 14
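
The lookup rule can be checked by running the code from this slide next to a variant in which the inner function is defined at the top level. The names bar_top and foo_top are hypothetical and exist only for this comparison.

```r
z <- 10

# Version from the slide: bar() is defined inside foo()
foo <- function(x) {
  bar <- function(y) y + z   # z is a free variable in bar()
  z <- 5
  x + bar(x)
}
nested_result <- foo(2)      # bar() finds z = 5 in foo()'s environment

# Variant: the inner function is defined at the top level instead
bar_top <- function(y) y + z
foo_top <- function(x) {
  z <- 5                     # local to foo_top(), invisible to bar_top()
  x + bar_top(x)
}
top_result <- foo_top(2)     # bar_top() finds z = 10 where it was defined

print(nested_result)  # 9
print(top_result)     # 14
```

What matters is where a function was defined, not where it is called from.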

40 What is the output?
foo <- function(x, y=x) x+y
x <- 100
foo(2, x)
foo(2)
A) [1] 102 [1] 102
B) [1] 4 [1] 4
C) [1] 102 [1] 4
D) [1] 4 [1] 102

42 Closures Lexical Closure: Data structure that stores a function along with its environment Free variables are captured at the time the closure is defined

43 Closures: Example
With regular functions (in R, all functions are closures)
add1 <- function(x){ x + 1 }
add2 <- function(x){ x + 2 }
add3 <- function(x){ x + 3 }
add1(10)
add2(9)
add3(8)
With closures
addn <- function(n) {
  function(x) { x + n }
}
add1 <- addn(1)
add2 <- addn(2)
add3 <- addn(3)
add1(10)
add2(9)
add3(8)
Q: How does R know what value of n to use each time?
A: Lexical scoping: n gets defined when the closure is defined

44 Superassignment
The superassignment operator, <<-, can assign to a variable in an enclosing scope
Be careful with the superassignment operator
Normally, R functions have no side effects, but <<- can cause side effects
R functions are always pass-by-value

45 Exercise 6: Counter Closure
Write a closure, counter(), that:
  Sets a local variable, ctr <- 0
  Defines a function f() that increments ctr (see note below) and prints the current value of ctr
  Returns f
Test your closure
> c1 <- counter()
> c2 <- counter()
> c1()
> c2()
> c1()
> c2()
Note: We cannot increment ctr inside of f() with the <- operator because ctr is in a higher lexical scope; <- would create a new local variable instead
The superassignment operator, <<-, searches up through the enclosing scopes until it finds a variable to assign
We could also use the assign() function
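
One possible shape for such a closure is sketched below. It is an illustration under the exercise's stated requirements rather than the workshop's reference solution, and the variables first, second, and third are introduced only to capture the results.

```r
counter <- function() {
  ctr <- 0                 # local to this call of counter()
  f <- function() {
    ctr <<- ctr + 1        # update ctr in the enclosing environment
    print(ctr)             # print() also returns its argument invisibly
  }
  f                        # return the function, not a call to it
}

c1 <- counter()
c2 <- counter()
first  <- c1()   # prints 1
second <- c2()   # prints 1: c2 has its own ctr
third  <- c1()   # prints 2
```

Each call to counter() creates a fresh environment, so the two counters do not share state.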

46 Debugging in R traceback() - Prints the sequence of function calls debug(functionname) - Flag a function for debugging When functionname runs, you will be taken to the browser the browser has four basic debug commands n[ext] - Execute current line and print the next c[ontinue] - Execute the rest of the function without stopping Q[uit] - Exit the browser where - Show the call stack May also use other R commands such as print()

48 Exercise 7: lapply()
The coefficient of variation of a vector x is defined as: sd(x)/mean(x)
Use lapply() with an anonymous function to compute the coefficient of variation for the Wind and Temp columns of airquality simultaneously
Don't use any loops
Solution:
> lapply(airquality[,c("Wind", "Temp")], function(x) sd(x)/mean(x))

49 Exercise 8: Debugging Run the script debug.r source('debug.r') Use traceback() to discover which function has the error Use debug() to flag the appropriate function for debugging > debug(func#) > func1(-1) > Browse[2]> print(x) > Browse[2]> n

50 Probability and Distributions
Purpose                                                  Uniform  Gaussian  T-distribution  Binomial  Chi-squared
height of the probability density function               dunif()  dnorm()   dt()            dbinom()  dchisq()
cumulative distribution function                         punif()  pnorm()   pt()            pbinom()  pchisq()
inverse cumulative distribution function (quantiles)     qunif()  qnorm()   qt()            qbinom()  qchisq()
generates pseudorandom numbers                           runif()  rnorm()   rt()            rbinom()  rchisq()
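
The four function families can be illustrated with the standard Gaussian, where the expected values are known in closed form. This is a small sketch; the seed and the variable names are arbitrary choices made here.

```r
density_at_0 <- dnorm(0)     # height of the density at 0: 1 / sqrt(2 * pi)
prob_below_0 <- pnorm(0)     # P(X <= 0) = 0.5 by symmetry
median_q     <- qnorm(0.5)   # the quantile function inverts pnorm()

set.seed(42)                 # arbitrary seed, only for reproducibility
draws <- rnorm(5)            # five pseudorandom draws

print(density_at_0)
print(prob_below_0)
print(median_q)
print(draws)
```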

51 Probability and Distributions
> plot(1:30, dchisq(1:30, df=10), type='l')
> par(new=TRUE)
> hist(rchisq(10000, df=10), axes=FALSE, xlab="", ylab="")

52 Exercise 9: Confidence Intervals
What is the range of values such that 95% of the area is contained (i.e. the 95% confidence interval, see figure)?
Write the R commands to produce the answer (hint: use qnorm())
This is a Gaussian distribution with mean=0 and standard deviation=1
[Figure: Gaussian density with the central 95% of the area marked and 2.5% in each tail]
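
One way to approach this, sketched under the assumption that the interval is symmetric about the mean: leave 2.5% in each tail and ask qnorm() for the corresponding quantiles. The variable names are illustrative.

```r
# Quantiles that cut off 2.5% in the lower and upper tails
bounds <- qnorm(c(0.025, 0.975), mean = 0, sd = 1)

# Check: the area between the bounds should be 0.95
area <- pnorm(bounds[2]) - pnorm(bounds[1])

print(bounds)  # approximately -1.96 and 1.96
print(area)
```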

53 Linear Regressions Want to model the relationship between a dependent (target) variable, y, and one or more explanatory variables, x The lm() function is used to fit linear models in R Use Wilkinson notation to define statistical models in R [response variable] ~ [predictor variables] e.g.: Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width

54 Linear Regression
> x <- 0:100
> e <- rnorm(101, 0, 10)
> y <- ... + ... * x + e
> model <- lm(y ~ x)
> summary(model)
Call: lm(formula = y ~ x)
Residuals: Min, 1Q, Median, 3Q, Max
Coefficients: Estimate, Std. Error, t value, Pr(>|t|) for (Intercept) and x; x has Pr(>|t|) < 2e-16 ***
Residual standard error on 99 degrees of freedom
Multiple R-squared and Adjusted R-squared
F-statistic: 2468 on 1 and 99 DF, p-value: < 2.2e-16

55 Linear Regression > plot(x, y) > abline(model)

56 Linear Regression
The predict() function can make predictions about the value of the target variable, y, for new data
Use se.fit=TRUE to get standard errors
> new_data <- data.frame(x = c(40, 50, 60, 120))
> predict(model, new_data, se.fit=TRUE)
The result is a list with the components $fit, $se.fit, $df ([1] 99) and $residual.scale
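
The fit-then-predict workflow can be reproduced end to end on simulated data. In the sketch below, the intercept 5, the slope 2, and the seed are assumptions chosen for illustration; they are not the values used on the slides.

```r
set.seed(1)                          # arbitrary seed, for reproducibility
x <- 0:100
e <- rnorm(101, mean = 0, sd = 10)   # Gaussian noise
y <- 5 + 2 * x + e                   # assumed true intercept and slope

model <- lm(y ~ x)                   # fit y as a linear function of x

new_data <- data.frame(x = c(40, 50, 60, 120))
pred <- predict(model, new_data, se.fit = TRUE)

print(coef(model))    # estimates should be close to 5 and 2
print(pred$fit)       # predicted y for the four new x values
print(pred$se.fit)    # standard errors grow away from the centre of the data
```

The column name in new_data must match the predictor name used in the formula, otherwise predict() cannot find the variable.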

57 Model Formula Notation Include all variables e.g.: These formulae refer to the same model
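
Two standard features of R's formula notation are sketched below on the built-in airquality data, on the assumption that they are the kind of shorthand this slide refers to: '.' stands for all remaining columns, and a*b expands to a + b + a:b. The model names are hypothetical.

```r
aq <- datasets::airquality

# '*' is shorthand for main effects plus their interaction
m_star <- lm(Ozone ~ Temp * Wind, data = aq)
m_long <- lm(Ozone ~ Temp + Wind + Temp:Wind, data = aq)

# '.' means every other column in the data frame
m_all <- lm(Ozone ~ ., data = aq)

print(coef(m_star))
print(coef(m_long))        # same coefficients as m_star
print(names(coef(m_all)))  # intercept plus the five other columns
```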

58 Exercise 10: Linear Regression
In the airquality data, use a linear model with Ozone as the dependent variable and Temp and Wind as the predictor variables
Include interactions between Temp and Wind in your model
Predict the Ozone level on a day when Temp=80 degrees F, and Wind=15 mph
Hint: Don't use dollar signs in your model formula
  lm(Ozone ~ ..., airquality) not lm(airquality$Ozone ~ ...)
> new_data <- data.frame(Temp=80, Wind=15)

59 Linear Regression Solution
> library(datasets)
> data(airquality)
> mymodel <- lm(Ozone ~ Temp*Wind, airquality)
> new_data <- data.frame(Temp=80, Wind=15)
> predict(mymodel, new_data, se.fit=TRUE)
The result is a list with the components $fit, $se.fit, $df ([1] 112) and $residual.scale

60 Hypothesis Testing
> titanic <- read.csv('titanic.csv')
> mean(titanic$age, na.rm=TRUE)
> t.test(titanic$age, mu=30, alternative="greater")
        One Sample t-test
data: titanic$age
t = ..., df = 632, p-value = ...
alternative hypothesis: true mean is greater than 30
95 percent confidence interval: ... Inf
sample estimates: mean of x
t.test() is used for Student's t-test
Test if the true mean is greater than 30
In the output we can see:
  The sample mean (under 'sample estimates')
  The t-statistic
  The number of degrees of freedom
  The p-value of the test
  The confidence interval
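
The same one-sided test can be run on synthetic data when titanic.csv is not at hand. In this sketch the sample (50 draws with mean 32 and standard deviation 12) and the seed are made up for illustration; only the t.test() call mirrors the slide.

```r
set.seed(123)                            # arbitrary seed
ages <- rnorm(50, mean = 32, sd = 12)    # synthetic stand-in for an age column

result <- t.test(ages, mu = 30, alternative = "greater")

print(result$statistic)   # t-statistic
print(result$parameter)   # degrees of freedom: n - 1
print(result$p.value)     # p-value of the one-sided test
print(result$conf.int)    # one-sided interval: upper bound is Inf
```

The components of the returned object can be accessed by name, which is convenient when the test is run inside a script.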

61 Exercise 11: Two-sample testing
t.test() can also perform two-sample testing if given two vectors (samples) instead of one
Test whether the true mean of the ages of 1st class passengers differs from the true mean of the ages of 3rd class passengers
Hints
Prepare two vectors, one for first class and one for third class ages
Use alternative="two.sided" (or omit it, because two.sided is the default)

62 Solution: Two-sample testing
> ages.first <- titanic[titanic$pclass == "1st", "age"]
> ages.third <- titanic[titanic$pclass == "3rd", "age"]
> t.test(ages.first, ages.third, alternative="two.sided")
        Welch Two Sample t-test
data: ages.first and ages.third
t = ..., df = ..., p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval: ...
sample estimates: mean of x, mean of y

63 Vectorization in R
Consider a dot-product between two vectors of length 1,000,000
The time required for a for loop is more than 100x longer than the time required for a vectorized operation
> a <- rnorm(1e6)
> b <- rnorm(1e6)
> dotprod <- 0
> system.time(sum(a*b))
> system.time(for (i in 1:length(a)) { dotprod <- dotprod + a[i]*b[i] })
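
Before comparing timings, it is worth confirming that the loop and the vectorized expression compute the same number. The sketch below does that on shorter vectors (length 1e5, an assumption made to keep the loop quick); timings are printed but will differ between machines.

```r
set.seed(7)                  # arbitrary seed
a <- rnorm(1e5)
b <- rnorm(1e5)

# Vectorized: one element-wise multiply, one sum
t_vec <- system.time(vec_result <- sum(a * b))

# Explicit loop: one multiply-add per iteration
loop_result <- 0
t_loop <- system.time(
  for (i in seq_along(a)) loop_result <- loop_result + a[i] * b[i]
)

print(all.equal(vec_result, loop_result))  # same value up to rounding
print(t_vec)
print(t_loop)
```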

64 Vectorization in R Two main reasons for the poor performance of the dot product loop The for loop and even the colon (':'), and vector subscripts are implemented as R functions that require overhead to set up stack frames and create the environment Objects in R are pass-by-value => Data must be copied into the environments set up for those functions Many R functions are vectorized sqrt(), log(), exp(), sin(), etc

65 Parallel Programming in R Built-in package 'parallel' Included in versions >= Builds on and merges 'snow' (simple network of workstations) and 'multicore' packages

66 Setting up a parallel system
library(parallel)
# report the number of available cores
detectCores()
[1] 4
# Create a cluster with 3 cores
cl <- makeCluster(3)

67 parLapply()
The most common way to parallelize R code is to replace expensive lapply() operations with parLapply() from the parallel package
> library(parallel)
> detectCores()          # report the number of available cores
[1] 4
> cl <- makeCluster(3)   # create a cluster with 3 cores
> cl
socket cluster with 3 nodes on host localhost
> mylist <- seq(1e6)
# Time lapply
> system.time(mylist2 <- lapply(mylist, function(x) x^2))
# Time parLapply
> system.time(mylist2 <- parLapply(cl, mylist, function(x) x^2))
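
A complete round trip, including shutting the workers down afterwards, might look like the sketch below. The cluster size of 2 and the small input are assumptions made so the example runs quickly on most machines; stopCluster() is added here and does not appear on the slide.

```r
library(parallel)

cl <- makeCluster(2)   # assumed size: 2 local worker processes

# Same call shape as lapply(), with the cluster as the first argument
squares_par <- parLapply(cl, 1:10, function(v) v^2)

stopCluster(cl)        # release the worker processes when done

# Serial reference result for comparison
squares_ser <- lapply(1:10, function(v) v^2)

print(all.equal(squares_par, squares_ser))
```

For inputs this small, the cost of starting workers and shipping data outweighs any gain; the parallel version pays off only when each call does substantial work.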

68 Exercise 12: Parallel R The file lapply-serial.r contains a simple serial code where most of the time is spent on an lapply() operation. Parallelize this code using the 'parallel' package in R

69 Summary Today we learned How to launch the R interpreter How to load and use external libraries How to work with R data types such as data frames How to create and call functions How scoping works in R How to work with probability distributions How to perform linear regressions How to compute confidence intervals How to perform a hypothesis test How to parallelize your code using the parallel and snow libraries

70 The End What questions do you have?

R Programming Basics - Useful Builtin Functions for Statistics

R Programming Basics - Useful Builtin Functions for Statistics R Programming Basics - Useful Builtin Functions for Statistics Vectorized Arithmetic - most arthimetic operations in R work on vectors. Here are a few commonly used summary statistics. testvect = c(1,3,5,2,9,10,7,8,6)

More information

Functional Programming. Biostatistics

Functional Programming. Biostatistics Functional Programming Biostatistics 140.776 What is Functional Programming? Functional programming concentrates on four constructs: 1. Data (numbers, strings, etc) 2. Variables (function arguments) 3.

More information

Introduction to R, Github and Gitlab

Introduction to R, Github and Gitlab Introduction to R, Github and Gitlab 27/11/2018 Pierpaolo Maisano Delser mail: maisanop@tcd.ie ; pm604@cam.ac.uk Outline: Why R? What can R do? Basic commands and operations Data analysis in R Github and

More information

Advanced Econometric Methods EMET3011/8014

Advanced Econometric Methods EMET3011/8014 Advanced Econometric Methods EMET3011/8014 Lecture 2 John Stachurski Semester 1, 2011 Announcements Missed first lecture? See www.johnstachurski.net/emet Weekly download of course notes First computer

More information

Description/History Objects/Language Description Commonly Used Basic Functions. More Specific Functionality Further Resources

Description/History Objects/Language Description Commonly Used Basic Functions. More Specific Functionality Further Resources R Outline Description/History Objects/Language Description Commonly Used Basic Functions Basic Stats and distributions I/O Plotting Programming More Specific Functionality Further Resources www.r-project.org

More information

A (very) brief introduction to R

A (very) brief introduction to R A (very) brief introduction to R You typically start R at the command line prompt in a command line interface (CLI) mode. It is not a graphical user interface (GUI) although there are some efforts to produce

More information

R is a programming language of a higher-level Constantly increasing amount of packages (new research) Free of charge Website:

R is a programming language of a higher-level Constantly increasing amount of packages (new research) Free of charge Website: Introduction to R R R is a programming language of a higher-level Constantly increasing amount of packages (new research) Free of charge Website: http://www.r-project.org/ Code Editor: http://rstudio.org/

More information

S CHAPTER return.data S CHAPTER.Data S CHAPTER

S CHAPTER return.data S CHAPTER.Data S CHAPTER 1 S CHAPTER return.data S CHAPTER.Data MySwork S CHAPTER.Data 2 S e > return ; return + # 3 setenv S_CLEDITOR emacs 4 > 4 + 5 / 3 ## addition & divison [1] 5.666667 > (4 + 5) / 3 ## using parentheses [1]

More information

An Introductory Guide to R

An Introductory Guide to R An Introductory Guide to R By Claudia Mahler 1 Contents Installing and Operating R 2 Basics 4 Importing Data 5 Types of Data 6 Basic Operations 8 Selecting and Specifying Data 9 Matrices 11 Simple Statistics

More information

The simpleboot Package

The simpleboot Package The simpleboot Package April 1, 2005 Version 1.1-1 Date 2005-03-31 LazyLoad yes Depends R (>= 2.0.0), boot Title Simple Bootstrap Routines Author Maintainer Simple bootstrap

More information

R: BASICS. Andrea Passarella. (plus some additions by Salvatore Ruggieri)

R: BASICS. Andrea Passarella. (plus some additions by Salvatore Ruggieri) R: BASICS Andrea Passarella (plus some additions by Salvatore Ruggieri) BASIC CONCEPTS R is an interpreted scripting language Types of interactions Console based Input commands into the console Examine

More information

Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics

Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics Introduction to S-Plus 1 Input: Data files For rectangular data files (n rows,

More information

Package uclaboot. June 18, 2003

Package uclaboot. June 18, 2003 Package uclaboot June 18, 2003 Version 0.1-3 Date 2003/6/18 Depends R (>= 1.7.0), boot, modreg Title Simple Bootstrap Routines for UCLA Statistics Author Maintainer

More information

R and parallel libraries. Introduction to R for data analytics Bologna, 26/06/2017

R and parallel libraries. Introduction to R for data analytics Bologna, 26/06/2017 R and parallel libraries Introduction to R for data analytics Bologna, 26/06/2017 Outline Overview What is R R Console Input and Evaluation Data types R Objects and Attributes Vectors and Lists Matrices

More information

Exercise 2.23 Villanova MAT 8406 September 7, 2015

Exercise 2.23 Villanova MAT 8406 September 7, 2015 Exercise 2.23 Villanova MAT 8406 September 7, 2015 Step 1: Understand the Question Consider the simple linear regression model y = 50 + 10x + ε where ε is NID(0, 16). Suppose that n = 20 pairs of observations

More information

Stochastic Models. Introduction to R. Walt Pohl. February 28, Department of Business Administration

Stochastic Models. Introduction to R. Walt Pohl. February 28, Department of Business Administration Stochastic Models Introduction to R Walt Pohl Universität Zürich Department of Business Administration February 28, 2013 What is R? R is a freely-available general-purpose statistical package, developed

More information

Introduction to the R Language

Introduction to the R Language Introduction to the R Language Loop Functions Biostatistics 140.776 1 / 32 Looping on the Command Line Writing for, while loops is useful when programming but not particularly easy when working interactively

More information

S Basics. Statistics 135. Autumn Copyright c 2005 by Mark E. Irwin

S Basics. Statistics 135. Autumn Copyright c 2005 by Mark E. Irwin S Basics Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin S Basics When discussing the S environment, I will, or at least try to, make the following distinctions. S will be used when what is

More information

Logical operators: R provides an extensive list of logical operators. These include

Logical operators: R provides an extensive list of logical operators. These include meat.r: Explanation of code Goals of code: Analyzing a subset of data Creating data frames with specified X values Calculating confidence and prediction intervals Lists and matrices Only printing a few

More information

Regression on the trees data with R

Regression on the trees data with R > trees Girth Height Volume 1 8.3 70 10.3 2 8.6 65 10.3 3 8.8 63 10.2 4 10.5 72 16.4 5 10.7 81 18.8 6 10.8 83 19.7 7 11.0 66 15.6 8 11.0 75 18.2 9 11.1 80 22.6 10 11.2 75 19.9 11 11.3 79 24.2 12 11.4 76

More information

Introduction to the R Language

Introduction to the R Language Introduction to the R Language Data Types and Basic Operations Starting Up Windows: Double-click on R Mac OS X: Click on R Unix: Type R Objects R has five basic or atomic classes of objects: character

More information

STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, Steno Diabetes Center June 11, 2015

STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, Steno Diabetes Center June 11, 2015 STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, tsvv@steno.dk, Steno Diabetes Center June 11, 2015 Contents 1 Introduction 1 2 Recap: Variables 2 3 Data Containers 2 3.1 Vectors................................................

More information

Introduction to R 21/11/2016

Introduction to R 21/11/2016 Introduction to R 21/11/2016 C3BI Vincent Guillemot & Anne Biton R: presentation and installation Where? https://cran.r-project.org/ How to install and use it? Follow the steps: you don t need advanced

More information

Introduction to R. Introduction to Econometrics W

Introduction to R. Introduction to Econometrics W Introduction to R Introduction to Econometrics W3412 Begin Download R from the Comprehensive R Archive Network (CRAN) by choosing a location close to you. Students are also recommended to download RStudio,

More information

36-402/608 HW #1 Solutions 1/21/2010

36-402/608 HW #1 Solutions 1/21/2010 36-402/608 HW #1 Solutions 1/21/2010 1. t-test (20 points) Use fullbumpus.r to set up the data from fullbumpus.txt (both at Blackboard/Assignments). For this problem, analyze the full dataset together

More information

Introduction to R Benedikt Brors Dept. Intelligent Bioinformatics Systems German Cancer Research Center

Introduction to R Benedikt Brors Dept. Intelligent Bioinformatics Systems German Cancer Research Center Introduction to R Benedikt Brors Dept. Intelligent Bioinformatics Systems German Cancer Research Center What is R? R is a statistical computing environment with graphics capabilites It is fully scriptable

More information

Getting Started in R
