Part I - Getting Started & Manipulating Data with R


Gilles Lamothe
February 21, 2017

Contents
1 URL for these notes and data
2 Origins of R
3 Downloading and Installing R
4 R Console and Editor
5 RStudio
6 Working Directory
7 Importing & Exporting data with R / Data structures (Vectors and Dataframes)
8 Types, Test and Coercion / sapply / Factors
9 Functions of a numerical vector (Descriptive Statistics) / Group statistics
10 Missing values / User-built functions
11 To source an R script
12 Applying our new skills
13 Logical vectors / Table of frequencies / Convert a numerical vector into a factor
14 Operations on numerical vectors
15 Cut Functions Revisited
16 Plot of one numerical variable (Boxplot & Histogram) / Installing an R package
17 Applying our new skills - Part II
18 Saving the workspace

Intro to R Workshop - Psychology Statistics Club

1 URL for these notes and data

To download these notes, follow the link: aix1.uottawa.ca/~glamothe/psystatclub
The name of the file is Rworkshop.pdf.

To find the corresponding data, follow the link: aix1.uottawa.ca/~glamothe/psystatclub
The data is in the folder called data.

2 Origins of R

R was designed by Ross Ihaka (computer scientist) and Robert Gentleman (statistician) in the early 1990s at the University of Auckland (New Zealand). R is an implementation of the S language, which was developed at Bell Laboratories in Murray Hill, New Jersey.

R is much more than a statistical package. It is its own computer language and environment within which statistical techniques are implemented. To know more about R, follow the link:

R's strength is that it allows the user to retain full control. It is highly flexible.

3 Downloading and Installing R

To find R, go to google.ca and search for R. Follow the link for: R: The R Project for Statistical Computing. Here is the URL:

Here is a YouTube video with instructions for installing R.

4 R Console and Editor

R is a command-based programming language. We enter commands at the prompt > in the R console. Here is an example, where we compute √2 + 4. We enter the command at the prompt and press ENTER.

> sqrt(2)+4
[1] 5.414214

Alternatively, we can enter commands in an R editor window. Select File → New script in Windows or File → New document on a Mac. In an editor, we select the commands that we want to send to the prompt and press CTRL-R in Windows or CMD-Enter on a Mac.

5 RStudio

RStudio is a free and open-source integrated development environment (IDE) for R. It provides a more user-friendly interface to R. To install RStudio, download the installer for your operating system from the following webpage:

You will see many windows within the RStudio environment. On the left-hand side, there are:
- the console with the R prompt, which is waiting patiently for your commands.
- an R editor window, within which we can work and eventually submit our commands to the console.

6 Working Directory

We use the function getwd() to display the working directory. We use the function setwd() to set the working directory. Below, we set the directory to C:/Rstuff and verify that it has been properly set by displaying the current working directory.

> setwd("C:/Rstuff")
> getwd()
[1] "C:/Rstuff"
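The same steps can be tried on any machine with the minimal sketch below. It assumes R's temporary directory (tempdir()) as a stand-in for C:/Rstuff, which may not exist on your computer, and it restores the original working directory at the end. The object names old_wd and example_path are chosen only for this illustration.

```r
# setwd() returns the previous working directory invisibly, so we can keep it
old_wd <- setwd(tempdir())   # tempdir() stands in for C:/Rstuff here

# file.path() builds a path with forward slashes, even on Windows
example_path <- file.path(getwd(), "example.txt")
print(example_path)

# Go back to where we started and check that we are there
setwd(old_wd)
restored <- identical(getwd(), old_wd)
print(restored)
```

Saving the old directory before changing it is a convenient habit in scripts, since the rest of the session is then unaffected.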

Comments:
- Windows users are used to seeing the backslash (\) as the path separator. However, even on Windows, R uses the forward slash (/) as the path separator.
- Using a working directory will save you time. However, we will see later that we can bypass the working directory by using the function file.choose(). It makes R open a window and ask you for the location of your file.

7 Importing & Exporting data with R / Data structures (Vectors and Dataframes)

R does not have a good worksheet. So you will need to use some other editing software with a worksheet, e.g. Excel or OpenOffice Calc, to enter and save your data. We recommend that you save your data as a text file, preferably tab-delimited or CSV (i.e. comma-separated). We will be using tab-delimited files. This allows us to have commas in a file that are not interpreted as the start of a new column.

R has four data structures: dataframe, vector, list, and matrix. For now, we will discuss the dataframe and the vector.

Dataframe:
- A dataframe is a data table.
- The rows are the statistical units.
- The columns are the variables that are used to describe the statistical units.
- Here is an example of a dataframe:

subject ID   gender   age (in years)   measured response   complient
1            Male
2            Female
3            Female
4            Female
5            Female
6            Male

- We saved the above table in the file example.txt. It is a tab-delimited text file.
- Assuming that the file is in our working directory (for me it is C:/Rstuff), we can use the following command to assign the data in the file to a dataframe that we called data.

> data<-read.table("example.txt",header=TRUE,sep="\t")
> names(data)
[1] "subject.ID"        "gender"            "age..in.years."
[4] "measured.response" "complient"
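The import step can be tried without the workshop files. The sketch below is only an illustration under stated assumptions: the small table toy and its values are made up, and a temporary file stands in for example.txt. It writes a tab-delimited file with write.table() and reads it back with read.table(), showing what happens to column names that contain spaces.

```r
# A small made-up table; the values are illustrative, not the workshop data
toy <- data.frame(id = 1:3,
                  gender = c("Male", "Female", "Female"),
                  score = c(2.5, 3.1, 1.8))
names(toy) <- c("subject ID", "gender", "measured response")

# Write it as a tab-delimited text file in a temporary location
path <- tempfile(fileext = ".txt")
write.table(toy, file = path, sep = "\t", row.names = FALSE, col.names = TRUE)

# Read it back: header=TRUE uses the first row as column names,
# sep="\t" says that the columns are separated by tabs
imported <- read.table(path, header = TRUE, sep = "\t")
print(names(imported))   # spaces in the names are replaced by dots
print(dim(imported))

# check.names=FALSE keeps the original column names, spaces included
raw_names <- read.table(path, header = TRUE, sep = "\t", check.names = FALSE)
print(names(raw_names))
```

Whether to keep the original names is a matter of taste. Dotted names are easier to use with the $ operator, while the original names are nicer in printed tables.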

Comments:
1. <- is used for assignments. Using the read.table function, we assigned a dataframe to data. The name of the dataframe is chosen by the user.
2. By default, R assumes that the file has no column names. If the first row in your table has column names, you need to add the argument header=TRUE.
3. By default, R uses white space to separate columns. We used tabs, so we add the argument sep="\t". For a CSV file, use sep=",".
4. We can use the function names() to display the names of the columns of a dataframe.
5. R does not permit certain symbols in the name of a column, e.g. spaces. Each such symbol is replaced by a dot.
6. Here we display the first three rows of the dataframe data.

> head(data,n=3)
  subject.ID gender age..in.years. measured.response complient
1          1   Male
2          2 Female
3          3 Female

7. Here we display the dimensions (# of rows & # of columns) of the dataframe data.

> dim(data)
[1] 6 5
> nrow(data)
[1] 6
> ncol(data)
[1] 5

8. Instead of giving the name of the file to import, we can use the file.choose() function. R will open a window to ask you for the location of the file.

> data<-read.table(file.choose(),header=TRUE,sep="\t")

Vector:

Think of a vector as a column in a dataframe. It is a homogeneous data structure: all of its elements are of the same type. Here are some common types: integer, double (often called numeric), boolean (often called logical), categorical (often called character or factor). Basically, a vector is either a numerical (quantitative) variable or a categorical (qualitative) variable.

Reference to a vector in a dataframe:
- with the column's name:

> data$gender
[1] Male   Female Female Female Female Male
Levels: Female Male

Remark: R noticed that data$gender is categorical, so it also displayed its levels. (Since R 4.0.0, read.table keeps text columns as character vectors unless the argument stringsAsFactors=TRUE is given.)

- with the column's position:

> data[,2]
[1] Male   Female Female Female Female Male
Levels: Female Male

Remark: We use square brackets for subsetting data with R. Here [,2] refers to the second column, [2,] refers to the second row, and [2,3] refers to the 2nd row and 3rd column.

- Here we display the 2nd row in the dataframe data and then ask R if it is a vector or a dataframe. A row is not a vector.

> data[2,]
  subject.ID gender age..in.years. measured.response complient
2          2 Female
> is.data.frame(data[2,])
[1] TRUE
> is.vector(data[2,])
[1] FALSE

- Here we display the 3rd column in the dataframe data and then ask R if it is a vector.

> data[,3]
[1]
> is.vector(data[,3])
[1] TRUE

A vector does not need to be a column of a dataframe. Just think of it as a list of elements of the same type.

Examples of vectors:
1. The names of the columns of a dataframe form a character vector.

> names(data)
[1] "subject.ID"        "gender"            "age..in.years."
[4] "measured.response" "complient"
> is.vector(names(data))
[1] TRUE

2. We can construct vectors with the function c() (which stands for combine). Here is a command to assign a character vector to names(data).

> names(data)<-c("subject ID","gender","age","measured response","complient")

We now redisplay the names of the columns in the dataframe data.

> names(data)
[1] "subject ID"        "gender"            "age"
[4] "measured response" "complient"

Comment: If you have a space in the name of a column, you must quote the name (here with backticks) to refer to it by name.

> data$`measured response`
[1]

Here are a few useful ways to build vectors.

> # a numerical vector of zeros of size 6
> numeric(6)
[1] 0 0 0 0 0 0
> # build a vector by repeating a vector 4 times
> rep(c(1,2),4)
[1] 1 2 1 2 1 2 1 2
> # an empty vector
> c()
NULL
> # a numerical vector with the integers from 1 to 6
> 1:6
[1] 1 2 3 4 5 6
> # a numerical vector with the integers from 10 to 20
> 10:20
 [1] 10 11 12 13 14 15 16 17 18 19 20
> # build a sequence of numbers with an increment of 0.5
> # starting at 1 and ending at 5.
> seq(1,5,by=0.5)
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

Remark: We use the symbol # to write comments that R does not interpret.

We can easily add a vector to a dataframe. Suppose that degree is a vector with one entry per subject, e.g.

> degree<-c("yes","no","no","yes","yes","yes")
> data<-data.frame(data,degree)
> names(data)
[1] "subject.ID"        "gender"            "age"
[4] "measured.response" "complient"         "degree"

We end this section by exporting a dataframe with the function write.table(). It will save the file in the working directory, unless you give it a path.

> write.table(data, file = "SavedExample.txt", sep = "\t", row.names = FALSE, col.names = TRUE)

Comments: With the above command, we saved the dataframe data in the file 'SavedExample.txt' in the working directory. When the file is read back with read.table, R might modify the names of the columns, since some characters are not permitted, e.g. spaces.

8 Types, Test and Coercion / sapply / Factors

Here are a few test functions: is.character(), is.logical(), is.numeric(), is.factor().

Here are a few coercion functions: as.character(), as.logical(), as.numeric(), as.factor(), factor().

Here is an example of testing whether a vector is of type factor and then forcing it to be of type factor.

> is.factor(data$complient)
[1] FALSE
> is.numeric(data$complient)
[1] TRUE
> data$complient<-factor(data$complient)
> is.factor(data$complient)
[1] TRUE
> is.numeric(data$complient)
[1] FALSE

Factors:

Factors are categorical variables. They have levels, i.e. the categories.

> levels(data$complient)
[1] "1" "2"

Here is a display of the vector:

> data$complient
[1] 1 1 2 2 2 2
Levels: 1 2

We can change the labels of the levels with the function factor by using the labels argument. The labels must be a vector of the same length as the levels.

> data$complient<-factor(data$complient,labels=c("Yes","No"))
> data$complient
[1] Yes Yes No  No  No  No
Levels: Yes No
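The test and coercion functions above can be tried on a stand-alone vector. The following is a minimal sketch: the vector complient and the coding 1 = Yes, 2 = No are assumptions made for illustration. It also shows a common pitfall when converting a factor back to numbers.

```r
# Illustrative codes, assumed here: 1 = Yes, 2 = No
complient <- c(1, 1, 2, 2, 2, 2)
is.numeric(complient)          # TRUE: R sees plain numbers

# Coerce to a factor and give the levels readable labels
complient_f <- factor(complient, levels = c(1, 2), labels = c("Yes", "No"))
is.factor(complient_f)         # TRUE
levels(complient_f)            # "Yes" "No"
print(complient_f)

# Caution: as.numeric() on a factor returns the internal level codes,
# not the labels; use as.character() to recover the labels
codes <- as.numeric(complient_f)
labels_back <- as.character(complient_f)
print(codes)
print(labels_back)
```

Giving both levels and labels explicitly makes the mapping visible in the code, instead of relying on the sorted order of the values.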

sapply:

It is good practice to verify the type of all the columns in a dataframe. It can be time-consuming to verify the type one column at a time. We can use the function sapply to apply the function is.factor to all the columns of a dataframe. It returns a logical vector.

> sapply(data,is.factor)
       subject.ID            gender               age
            FALSE              TRUE             FALSE
measured.response         complient            degree
            FALSE              TRUE              TRUE

We identify which columns are numerical.

> sapply(data,is.numeric)
       subject.ID            gender               age
             TRUE             FALSE              TRUE
measured.response         complient            degree
             TRUE             FALSE             FALSE

9 Functions of a numerical vector (Descriptive Statistics) / Group statistics

Let x be a numerical vector. For example, consider the following assignment.

> x<-c(16,15,14,16,13,12,14,13,10)

By using the command summary(x), we obtain some descriptive statistics for x.

> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  10.00   13.00   14.00   13.67   15.00   16.00

Here are some descriptive statistics for the age of the subjects in the dataframe data.

> summary(data$age)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.

Here is a list of some common statistics.

sum(x)       # sum of the components
mean(x)      # mean
var(x)       # variance
sd(x)        # standard deviation
min(x)       # min
max(x)       # max
median(x)    # median

quantile(x)  # 5-number summary (min,q1,median,q3,max)
length(x)    # number of components
sort(x)      # arrange values in ascending order
rank(x)      # rank the values from 1 to length(x)

Let us use the sapply function to get the mean for each column in the dataframe data.

> sapply(data,mean)
       subject.ID            gender               age
                                 NA
measured.response         complient            degree
                                 NA                NA
Warning messages:
1: In mean.default(X[[i]], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(X[[i]], ...) :
  argument is not numeric or logical: returning NA
3: In mean.default(X[[i]], ...) :
  argument is not numeric or logical: returning NA

Comments: R is warning us that it cannot compute the mean of a non-numerical vector. If we know that a certain command will display warnings, we can suppress warnings before using the command. It is good practice to turn the warnings back on afterwards.

> # suppress warnings
> options(warn=-1)
> sapply(data,mean)
       subject.ID            gender               age
                                 NA
measured.response         complient            degree
                                 NA                NA
> # put warnings back on
> options(warn=0)

Recall that we use square brackets for subsetting. We could extract the numerical columns and only compute the mean for each of these columns.

> numericcol<-sapply(data,is.numeric)
> sapply(data[,numericcol],mean)
       subject.ID               age measured.response

Group Statistics:

We can use the aggregate function to get group statistics for a numerical vector y according to the levels of another variable x in a dataframe data. Here fun is the descriptive statistic that you want to compute. Its usage is

aggregate(y~x,data,fun)

Here is the mean age of the subjects in the dataframe data according to the levels of gender.

> Mean<-aggregate(age~gender,data,mean)
> Mean
  gender age
1 Female
2   Male

Here we build a dataframe with a few descriptive statistics for the age of the subjects in the dataframe data according to the levels of gender.

> Mean<-aggregate(age~gender,data,mean)
> SD<-aggregate(age~gender,data,sd)
> n<-aggregate(age~gender,data,length)
> AgeStats<-data.frame(Mean,SD[,2],n[,2])
> names(AgeStats)<-c("Gender","Mean","Std dev","n")
> AgeStats
  Gender Mean Std dev n
1 Female
2   Male

Remark: The discussion in this section assumes that there are no missing values. Missing values can cause some problems.

> mean(c(NA,0,1))
[1] NA

R could not compute the mean of the vector, since there was a missing value.

10 Missing values / User-built functions

Consider the data in the file OmitExample.txt. We import the data and assign it to the dataframe data1.

> data1<-read.table("OmitExample.txt",header=TRUE,sep="\t")
> data1
  subject.ID gender age..in.years. measured.response complient
1          1   Male
2          2 Female             45                NA
3          3 Female
4          4 Female             NA               1.8         2

5          5 Female
6          6   Male
  degree
1    yes
2     no
3     no
4    yes
5    yes
6    yes

There is a missing value in the third column and also in the fourth column. So we will have difficulties computing the mean for each of these columns.

> numericcol<-sapply(data1,is.numeric)
> sapply(data1[,numericcol],mean)
       subject.ID    age..in.years. measured.response
                                 NA                NA
        complient

We can omit the missing values with the function na.omit. However, it will delete all rows with a missing value.

> na.omit(data1)
  subject.ID gender age..in.years. measured.response complient
1          1   Male
3          3 Female
5          5 Female
6          6   Male
  degree
1    yes
3     no
5    yes
6    yes

User-defined function:

Another solution is to build our own function to compute the mean of a vector after omitting the missing values in this vector.

mean.na<-function(x)
{
  x<-x[!is.na(x)]  # keep value if not NA
  return(mean(x))
}

We use our user-defined function to compute the mean of the numerical vectors in the dataframe data1.

> sapply(data1[,numericcol],mean.na)
       subject.ID    age..in.years. measured.response

        complient

R does not have a sample size function. We can use the function length, but it gives the number of components in the vector (including missing values).

> length(data1$age..in.years.)
[1] 6
> data1$age..in.years.
[1]

We define functions to count the number of missing values in a vector and to count the number of non-missing values in a vector.

numna<-function(x)
{
  return(length(x[is.na(x)]))
}

SampleSize<-function(x)
{
  return(length(x)-numna(x))
}

We use our functions on the vector age..in.years. in the dataframe data1.

> SampleSize(data1$age..in.years.)
[1] 5
> numna(data1$age..in.years.)
[1] 1

11 To source an R script

We will want to save our user-defined R functions.

Steps to save your functions:
1. In the RStudio menu, select File → New File → New R script.
2. In the new editor window, enter your functions.
3. Save the current editor window. I saved it as myfunctions.r.

Two ways to access your functions:
1. In the RStudio menu, select File → Open File. Browse for myfunctions.r.
2. Alternatively, we can source the file. R will interpret all the commands in a file that is sourced. I am assuming that myfunctions.r is in the working directory.

> source("myfunctions.r")
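The steps above can be sketched end to end in a single script. This is only an illustration under stated assumptions: the script is written to R's temporary directory instead of the working directory, the helper functions are compact variants of numna and SampleSize based on sum(is.na(x)), and the ages are made up.

```r
# Write two small helper functions to a script file.
# The file location (tempdir()) is an assumption for this sketch.
script <- file.path(tempdir(), "myfunctions.r")
writeLines(c(
  "numna <- function(x) sum(is.na(x))          # number of missing values",
  "SampleSize <- function(x) sum(!is.na(x))    # number of non-missing values"
), script)

# source() runs every command in the file, which defines the functions
source(script)

age <- c(34, 45, 62, NA, 25, 41)   # made-up ages with one missing value
print(numna(age))
print(SampleSize(age))

# Many built-in statistics can also skip missing values directly
print(mean(age, na.rm = TRUE))
```

The argument na.rm=TRUE is accepted by mean, sd, var, sum, min, max and median, so a user-defined wrapper is mostly useful when the same treatment of missing values is needed in many places.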

12 Applying our new skills

Exercises to try in class.

1. Consider a company that produces items made from glass. The data is in the file glassworks.txt. (Source: Applied Linear Models, by Kutner et al.). A large company is studying the effects of the length of special training for new employees on the quality of work. Employees are randomly assigned to have either 6, 8, 10, or 12 hours of training. After the special training, each employee is given the same amount of material. The response variable is the number of acceptable pieces.
(a) Compute the number of subjects per group. Is it a balanced study (i.e. same # of observations per group)?
(b) Compute the mean response and the standard deviation of the response for each group.

2. Consider the data in the file FinalGrades.txt.
(a) For each assignment, compute the mean and the standard deviation.
(b) For the final exam, compute the mean according to the levels of the faculty.

13 Logical vectors / Table of frequencies / Convert a numerical vector into a factor

It is easy to compute the proportion of statistical units in a dataframe that satisfy a certain condition. We use logical operators.

Operator    Description
<           less than
<=          less than or equal to
>           greater than
>=          greater than or equal to
==          exactly equal to
!=          not equal to
!x          not x
x | y       x OR y
x & y       x AND y
isTRUE(x)   test if x is TRUE

What proportion of the subjects in data are over 60?

> data$age>60
[1] FALSE FALSE  TRUE FALSE FALSE FALSE

R considers FALSE as 0 and TRUE as 1. So by computing the mean, we get the proportion of TRUE values:

> mean(data$age>60)
[1] 0.1666667

So 16.7% of the subjects are over 60.

What proportion of the subjects in data are male?

> data$gender=="Male"
[1]  TRUE FALSE FALSE FALSE FALSE  TRUE
> mean(data$gender=="Male")
[1] 0.3333333

So 33.3% of the subjects are male.

What proportion of the subjects in data are female and at most 30?

> (data$gender=="Female")&(data$age<=30)
[1] FALSE FALSE FALSE  TRUE  TRUE FALSE
> mean((data$gender=="Female")&(data$age<=30))
[1] 0.3333333

So 33.3% of the subjects are female and at most 30.

Table of frequencies:

We can use the function table to obtain a table of frequencies for a categorical vector.

> table(data$gender)

Female   Male
     4      2
> # expressed as a percentage
> table(data$gender)/length(data$gender)*100

  Female     Male
66.66667 33.33333

Cut a numeric vector into intervals:

We can use the function cut to break a numerical vector into intervals. It has an argument called breaks. It can be a number for the number of intervals or a numeric vector for the values of the breaks.

> cut(data$age,breaks=3)
[1] (33.3,47.7] (33.3,47.7] (47.7,62]   (19,33.3]   (19,33.3]
[6] (33.3,47.7]
Levels: (19,33.3] (33.3,47.7] (47.7,62]
> cut(data$age,breaks=c(19,30,40,65))
[1] (30,40] (40,65] (40,65] <NA>    (19,30] (40,65]
Levels: (19,30] (30,40] (40,65]
> agecat<-cut(data$age,breaks=c(19,30,40,65))

Remark: The intervals are open on the left, so a subject aged exactly 19 falls outside (19,30] and is reported as <NA>. Use the argument include.lowest=TRUE to include the lowest break.

Here we have a two-way contingency table for the joint distribution of gender and age.

> table(data$gender,agecat)
        agecat
         (19,30] (30,40] (40,65]
  Female       1       0       2
  Male         0       1       1

Logical vector for subsetting:

We can use a logical vector to obtain a subset of a dataframe. Suppose that we only want to retain the female subjects.

> datafemale <- data[data$gender=="Female",]
> datafemale
  subject.ID gender age measured.response complient degree
2          2 Female                             Yes     no
3          3 Female                              No     no
4          4 Female                              No    yes
5          5 Female                              No    yes

Comment: [data$gender=="Female",] means that we want to retain the rows that correspond to females, but keep all of the columns.
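The ideas of this section can be combined in one short script. The sketch below uses a made-up dataframe d (its values are assumptions for illustration, not the workshop file) and the function prop.table(), which is a standard alternative to dividing a table by its total.

```r
# Made-up data for illustration (not the workshop file)
d <- data.frame(gender = c("Male", "Female", "Female", "Female", "Female", "Male"),
                age    = c(35, 45, 62, 19, 25, 41))

# Proportion satisfying a condition: TRUE counts as 1, FALSE as 0
prop_over60 <- mean(d$age > 60)
prop_young_female <- mean(d$gender == "Female" & d$age <= 30)

# Frequencies and percentages for a categorical vector
counts <- table(d$gender)
percent <- prop.table(counts) * 100

# include.lowest=TRUE keeps the smallest value (19) in the first interval
agecat <- cut(d$age, breaks = c(19, 30, 40, 65), include.lowest = TRUE)
two_way <- table(d$gender, agecat)
print(two_way)

# Keep only the rows for females, and all of the columns
d_female <- d[d$gender == "Female", ]
print(d_female)
```

With include.lowest=TRUE no subject is lost to <NA>, so the counts in the two-way table add up to the number of rows in d.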

14 Operations on numerical vectors

R has a few arithmetic operators.

Operator    Description
+           addition
-           subtraction
*           multiplication
/           division
^ or **     exponentiation
x %% y      modulus (x mod y); 5 %% 2 is 1
x %/% y     integer division; 5 %/% 2 is 2

Operations are done component-wise. Here are a few examples.

> x<-c(1,2,3)
> y<-c(-1,0,1)
> x+y
[1] 0 2 4
> x-y
[1] 2 2 2
> x*y
[1] -1  0  3
> x/y
[1]  -1 Inf   3
> 2+x
[1] 3 4 5
> 2*x
[1] 2 4 6
> x/2
[1] 0.5 1.0 1.5
> x^2
[1] 1 4 9

Here are a few other functions that might be useful.

> sqrt(x) ## square root
[1] 1.000000 1.414214 1.732051
> log(x) ## natural logarithm
[1] 0.0000000 0.6931472 1.0986123
> exp(x) ## exponential of base e
[1]  2.718282  7.389056 20.085537
> w<-c(1.2,4.5,-7.8)
> ceiling(w) ## round up
[1]  2  5 -7
> floor(w) ## round down
[1]  1  4 -8

> round(w) ## round
[1]  1  4 -8
> abs(w) ## absolute value
[1] 1.2 4.5 7.8

We can apply arithmetic operations in the presence of missing values. The result will be NA.

> z<-c(NA,1,1)
> x+z
[1] NA  3  4

Grades Example:

We import the data from the file SmallClass.txt and display the names of the columns.

> grades<-read.table("SmallClass.txt",header=TRUE,sep="\t")
> names(grades)
[1] "Student.ID"                  "Assignment1..Total.Pts..20."
[3] "Assignment2..Total.Pts..30." "Test...Total.Pts..12."
[5] "Assignment3..Total.Pts..50." "Assignment4..Total.Pts..20."
[7] "FinalExam..Total.Pts..100."

We will build a dataframe just for the assignments.

> assignments<-grades[,c(2,3,5,6)]
> names(assignments)
[1] "Assignment1..Total.Pts..20." "Assignment2..Total.Pts..30."
[3] "Assignment3..Total.Pts..50." "Assignment4..Total.Pts..20."

We will convert each assignment into a percentage.

> assignments<-grades[,c(2,3,5,6)]
> assignments[,1]<-assignments[,1]/20*100
> assignments[,2]<-assignments[,2]/30*100
> assignments[,3]<-assignments[,3]/50*100
> assignments[,4]<-assignments[,4]/20*100
> names(assignments)<-c("hw1","hw2","hw3","hw4")
> sapply(assignments,mean)
hw1 hw2 hw3 hw4

Remark: There are no NA in the means. This means that there are no missing assignments.

Say that we would like to keep the best 3 of 4 assignments. We will build a function that does the following:

- For each student (i.e. each statistical unit), we construct a numeric vector of the four marks and convert any NA to a zero.

- We sort the four assignments and keep the best 3 of the four.
- We add the row of the best 3 of 4 assignments to a dataframe and we compute the average of the three marks. We use the function rbind to add a row to a dataframe.

The result of the function will be a list containing a dataframe and a vector.

BestOf <- function(x,n)
{
  nStudents<-nrow(x)
  nAssignments<-ncol(x)
  # empty dataframe
  BestMarks<-data.frame(NULL)
  # initialize a vector for the averages
  Avg<-numeric(nStudents)
  for (i in 1:nStudents)
  {
    # sort() drops NA; indexing past the end with [1:n] brings them back as NA
    best<-sort(as.numeric(x[i,]),decreasing=TRUE)[1:n]
    best[is.na(best)]<-0
    Avg[i]<-mean(best)
    BestMarks<-rbind(BestMarks,best)
  }
  names(BestMarks)<-paste0("Best",1:n)
  return(list(BestMarks=BestMarks,Mean=Avg))
}

We are ready to get the best 3 assignments for each student.

> BestOf(assignments,3)
$BestMarks
  Best1 Best2 Best3

$Mean
[1]

We will add these means to the original dataframe.

> grades<-data.frame(grades,BestOf(assignments,3)$Mean)

> names(grades)
[1] "Student.ID"                  "Assignment1..Total.Pts..20."
[3] "Assignment2..Total.Pts..30." "Test...Total.Pts..12."
[5] "Assignment3..Total.Pts..50." "Assignment4..Total.Pts..20."
[7] "FinalExam..Total.Pts..100."  "BestOf.assignments..3..Mean"

We are ready to compute the final grades with the following scheme.

Best 3 of 4 assignments   20%
Max(Test,Final)           25%
Final Exam                55%

If the student did not write the final exam, it should remain an NA. We use pmax, rather than max, so that the maximum is taken student by student.

> test<-grades[,4]/12*100
> test[is.na(test)]<-0
> exam<-grades[,7]
> exam
[1]
> hw<-grades[,8]
> FinalGrade<-.2*hw+.25*pmax(exam,test)+.55*exam
> FinalGrade
[1]

15 Cut Functions Revisited

We are now ready to compute the alpha grades, i.e. convert a numeric vector into a factor. We use the cut function as follows.

## letter grades:
## 90-100=A+; 85-89=A; 80-84=A-;
## 75-79=B; 70-74=B-;
## 65-69=C; 60-64=C-;
## 55-59=D; 50-54=D-;
## 40-49=E; 0-39=F;
letters=c("F","E","D-","D","C-","C","B-","B","A-","A","A+")
# right-hand limits are not included
alpha.grade<-cut(FinalGrade,
                 c(0,40,50,55,60,65,70,75,80,85,90,Inf),
                 right=FALSE,
                 labels=letters)

Here are the alpha grades for the 5 students and a corresponding table of frequencies.

> alpha.grade
[1] B- B- C  A  A-
Levels: F E D- D C- C B- B A- A A+
> table(alpha.grade)
alpha.grade
 F  E D-  D C-  C B-  B A-  A A+
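The difference between max and pmax, and the effect of right=FALSE in cut, can be checked on a small example. In the sketch below, the marks are made up and the four-letter scale is a deliberately coarser scale than the one above; both are assumptions chosen to keep the illustration short.

```r
# Made-up marks (in percent) for three students
hw   <- c(80, 65, 90)
test <- c(70, NA, 85)      # the second student missed the test
exam <- c(60, 75, NA)      # the third student did not write the final

test[is.na(test)] <- 0     # a missed test counts as zero

# max() collapses everything to a single number;
# pmax() takes the maximum student by student
best_of_two <- pmax(exam, test)
final <- 0.20 * hw + 0.25 * best_of_two + 0.55 * exam

# right=FALSE makes the intervals closed on the left: [0,50), [50,65), ...
# The four-letter scale is a simplified, hypothetical one
grade <- cut(final,
             breaks = c(0, 50, 65, 80, Inf),
             labels = c("F", "C", "B", "A"),
             right = FALSE)
print(data.frame(final, grade))
```

Because the exam of the third student is NA, both the final grade and the letter grade of that student remain NA, as required by the grading scheme.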

16 Plot of one numerical variable (Boxplot & Histogram) / Installing an R package

To produce a boxplot of x and a histogram of x, we use boxplot(x) and hist(x), respectively.

> par(mfrow=c(1,2)) # graphics window of 1 row, 2 columns
> hist(FinalGrade)
> boxplot(FinalGrade)

The result is

[Figure: histogram and boxplot of FinalGrade, side by side]

Installing an R package:

R has a very large community of contributors who are developing packages that are freely available through CRAN (The Comprehensive R Archive Network).

We will install the package packHV with the following command:

install.packages("packHV")

Comments:
- You only need to install the package once on your computer.
- If you do not have administrator privileges on the computer, then you might not be able to install a package in the default location for packages. If that is the case, then create a new directory on the computer, say C:/myPackages. You should be able to install in this location:

install.packages("packHV",lib="C:/myPackages")

To use a package during a session, we must load it by using the function library. For example:

> library(packHV)
> # you can also give the location of the package on your computer
> library(packHV,lib.loc="C:/myPackages")

We can now use the function hist_boxplot from the package packHV.

par(mfrow=c(1,1)) # graphics window of 1 row, 1 column
hist_boxplot(FinalGrade,col="lightblue",freq=TRUE,
             ylab="Frequency",
             xlab="Final Grade (in percentage)",
             main="Distribution of the Final Grades",
             cex.lab=1.15)
# increase the font size on both axes
axis(2,cex.axis=1.05)
axis(1,cex.axis=1.15)

Now this is where R shines. We can easily modify a graph. For example, we can add text to the plot with the function text.

text(85,1.7,"mean = 76, sd = 6.7, n = 5",cex=1.15)

The result is

[Figure: histogram with boxplot of the final grades, annotated with the mean, sd and n]

17 Applying our new skills - Part II

Exercises to try in class.

1. Consider a company that produces items made from glass. The data is in the file glassworks.txt. (Source: Applied Linear Models, by Kutner et al.). A large company is studying the effects of the length of special training for new employees on the quality of work. Employees are randomly assigned to have either 6, 8, 10, or 12 hours of training. After the special training, each employee is given the same amount of material. The response variable is the number of acceptable pieces.
(a) Produce comparative boxplots to describe the number of acceptable pieces according to the amount of training. Hint: try boxplot(y~x,data)

2. Consider the data in the file FinalGrades.txt.
(a) Compute the final grades with the following distribution. Convert the final grades to alpha grades and produce a frequency table for the alpha grades.

Best 2 of 4 assignments   15%
Max(Test,Final)           15%
Final Exam                60%

(b) Produce a plot to describe the distribution of the final grades.
(c) What is the failure rate? That is, what proportion of the students have a final grade lower than 50%?

18 Saving the workspace

You can save objects to retrieve in the future. Here we save the objects Cereal and methadone.

> # save specific objects to a file
> # if you don't specify the path, the current working directory is assumed
> save(Cereal,methadone,file="myfile.rdata")

To retrieve the saved objects, use the function load.

> # load the saved objects into the current session
> # if you don't specify the path, the current working directory is assumed
> load("myfile.rdata")
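The save and load steps can be verified with a round trip. The sketch below is an illustration under stated assumptions: the objects scores and groups are made up, and a temporary file stands in for myfile.rdata.

```r
# Two made-up objects to save
scores <- c(16, 15, 14)
groups <- factor(c("a", "b", "a"))

# Save them to a file; a temporary file is used here as a stand-in
path <- tempfile(fileext = ".RData")
save(scores, groups, file = path)

# Remove them from the session, then load them back
rm(scores, groups)
restored_names <- load(path)   # load() returns the names of the restored objects
print(restored_names)
print(scores)
print(groups)
```

Note that load() restores the objects under their original names, so it silently overwrites any objects with the same names in the current session.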


An Introductory Tutorial: Learning R for Quantitative Thinking in the Life Sciences. Scott C Merrill. September 5 th, 2012 An Introductory Tutorial: Learning R for Quantitative Thinking in the Life Sciences Scott C Merrill September 5 th, 2012 Chapter 2 Additional help tools Last week you asked about getting help on packages.

More information

EPIB Four Lecture Overview of R

EPIB Four Lecture Overview of R EPIB-613 - Four Lecture Overview of R R is a package with enormous capacity for complex statistical analysis. We will see only a small proportion of what it can do. The R component of EPIB-613 is divided

More information

Introduction to R and R-Studio Toy Program #1 R Essentials. This illustration Assumes that You Have Installed R and R-Studio

Introduction to R and R-Studio Toy Program #1 R Essentials. This illustration Assumes that You Have Installed R and R-Studio Introduction to R and R-Studio 2018-19 Toy Program #1 R Essentials This illustration Assumes that You Have Installed R and R-Studio If you have not already installed R and RStudio, please see: Windows

More information

Lecture 1: Getting Started and Data Basics

Lecture 1: Getting Started and Data Basics Lecture 1: Getting Started and Data Basics The first lecture is intended to provide you the basics for running R. Outline: 1. An Introductory R Session 2. R as a Calculator 3. Import, export and manipulate

More information

Instruction: Download and Install R and RStudio

Instruction: Download and Install R and RStudio 1 Instruction: Download and Install R and RStudio We will use a free statistical package R, and a free version of RStudio. Please refer to the following two steps to download both R and RStudio on your

More information
