BIOSTAT640 R Lab1 for Spring 2016


Minming Li & Steele H. Valenzuela
Feb. 1, 2016

This is the first R lab session of BIOSTAT640 at UMass for the Spring 2016 semester. I, Minming (Matt) Li, will use the R code from some questions of the first 3 homework sets as examples. For the Amherst class on Wed (02/03/2016), I will present this document and also cover key points and personal suggestions for using RStudio, R Markdown, etc. The "First Session" material, starting on page 14 of the original handout, was written by Steele Valenzuela in 2015 and is combined into this file.

For HW1 questions after Q4, please refer to the other version of the solution by Prof. Carol Bigelow. Here we will mainly use R to solve HW1: Q1-Q4.

Q1:

    a <- c(5, 10, 6, 11, 5, 14, 30, 11, 17, 3, 9, 3, 8, 8, 5, 5, 7, 4, 3, 7, 9, 11, 11, 9, 4)
    breaks <- seq(0, 30, by=5)   # set intervals
    # breaks
    a.cut <- cut(a, breaks, right=TRUE)
    a.freq <- table(a.cut)
    # a.freq
    a.cumfreq <- cumsum(a.freq)   # cumsum means the cumulative sums
    # a.cumfreq
    a.relfreq <- a.freq/length(a)
    # a.relfreq
    a.cumrelfreq <- cumsum(a.relfreq)
    # a.cumrelfreq
    Q1table <- cbind(a.freq, a.cumfreq, a.relfreq, a.cumrelfreq)
    colnames(Q1table) <- c("Freq", "Cum Freq", "Rel Freq", "Cum Rel Freq")
    Q1table

            Freq Cum Freq Rel Freq Cum Rel Freq
    (0,5]
    (5,10]
    (10,15]
    (15,20]
    (20,25]
    (25,30]

Q2: Boxplot for cholesterol levels (mg/dl) for two groups of men

    grp1 <- c(233, 291, 312, 250, 246, 197, 268, 224, 239, 239, 254, 276, 234, 181, 248, 252, 202, 218, 212, 325)
    grp2 <- c(344, 185, 263, 246, 224, 212, 188, 250, 148, 169, 226, 175, 242, 252, 153, 183, 137, 202, 194, 213)
    cholesterol <- cbind(grp1, grp2)
    # cholesterol
    # median(grp1) #242.5
    # median(grp2) #207

    # 2a: Side-by-side box plot
    boxplot(grp1, grp2)

    # 2b. Side-by-side histograms with same definitions (starting value, ending value, tick marks, etc) of t
    # summary(grp1)
    # summary(grp2)
    par(mfrow=c(1,2))
    hist(grp1, breaks=seq(120, 360, by=20))
    hist(grp2, breaks=seq(120, 360, by=20))
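The plots can be backed up with numbers. The following is a minimal sketch, not part of the original solution, that tabulates the minimum, quartiles, and maximum of both groups side by side. It uses `quantile()` with R's default method, so the quartiles may differ slightly from the hinges that `boxplot()` draws; the name `five_num` is introduced here for illustration only.

```r
# Cholesterol levels (mg/dl), copied from the Q2 code above
grp1 <- c(233, 291, 312, 250, 246, 197, 268, 224, 239, 239,
          254, 276, 234, 181, 248, 252, 202, 218, 212, 325)
grp2 <- c(344, 185, 263, 246, 224, 212, 188, 250, 148, 169,
          226, 175, 242, 252, 153, 183, 137, 202, 194, 213)

# One column per group; rows are min, Q1, median, Q3, max
five_num <- sapply(list(grp1 = grp1, grp2 = grp2), quantile,
                   probs = c(0, 0.25, 0.5, 0.75, 1))
five_num
```

Reading across each row of `five_num` gives a direct group-to-group comparison of the same summary statistic.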

[Figure: side-by-side histograms titled "Histogram of grp1" and "Histogram of grp2"; x-axes grp1 and grp2, y-axes Frequency]

Comparing the two distributions, the first group of men has a higher median, Q1, Q3, and minimum than the second group, but a smaller maximum.

Q3: Prob(11 or more of the 64 exposed firefighters reporting breath shortness):

    1-pbinom(10, 64, 1/22)

    [1]

    # OR WE CAN USE:
    sum(dbinom(11:64, 64, 1/22))

Q4:

4a. What proportion of women have weights that are outside the ACES-II ejection seat acceptable range?

    pnorm(140, mean=143, sd=29) + (1-pnorm(211, mean=143, sd=29))

    [1]

4b. In a sample of 1000 women, how many are expected to have weights below the 140 lb threshold?

    1000*pnorm(140, mean=143, sd=29)

    [1]

So about 459 women are expected to have weights below the 140 lb threshold.

For HW2 Q1 and Q2, please refer to the other version of the solution by Prof. Carol Bigelow. Here we will mainly use R to solve HW2-Q3.

Q3: Let's download the dataset.

    library(foreign)
    url <- "
    dat <- read.dta(file = url)   # read.dta: Read Stata Binary Files
    # dat
    head(dat)

      temp boiling

    x <- dat$temp
    y1 <- dat$boiling
    m1 <- lm(y1 ~ x)
    # OR: m1 <- lm(boiling~temp, data=dat)
    # OR: m1 <- lm(dat$boiling ~ dat$temp)
    summary(m1)

    Call:
    lm(formula = y1 ~ x)

    Residuals:
        Min      1Q  Median      3Q     Max

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                             e-14 ***
    x                                       < 2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error:  on 29 degrees of freedom
    Multiple R-squared:  , Adjusted R-squared:
    F-statistic:  on 1 and 29 DF,  p-value: < 2.2e-16

    m1$coefficients   # output the intercept and slope

    (Intercept)           x

    # confint(m1)   # output the CI for the parameters in the fitted model
    anova(m1)

    Analysis of Variance Table

    Response: y1
              Df Sum Sq Mean Sq F value    Pr(>F)
    x                                   < 2.2e-16 ***
    Residuals
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    plot(x, y1, main="Simple Linear Regression of Y=boiling on X=temp",
         xlab="temp", ylab="boiling", pch=18)
    # pch means "plot character"; integer codes run from 0 to 25
    abline(m1, col="red", lty=1, lwd=2)
    # lty: "line type"; lwd: "line width"

[Figure: scatterplot with fitted line, "Simple Linear Regression of Y=boiling on X=temp"; x-axis temp, y-axis boiling]

From the above summary(m1) output, we know that:
(a): Parameter Estimates: and 0.44, so regression line: Y= *X;
(b): The ANOVA (Analysis of Variance) table is obtained above using the anova(m1) code;
(c): R-square=0.9207;
(d): The scatterplot with the fitted line is plotted above.

(2). Now, instead of Y=boiling, we want to use newy=100*log10(boiling)

    newy=100*log10(y1)
    m2 <- lm(newy ~ x)
    summary(m2)

    Call:
    lm(formula = newy ~ x)

    Residuals:
        Min      1Q  Median      3Q     Max

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                                  ***
    x                                       e-15 ***
    ---

    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error:  on 29 degrees of freedom
    Multiple R-squared:  , Adjusted R-squared:
    F-statistic:  on 1 and 29 DF,  p-value: 3.623e-15

    m2$coefficients   # output the intercept and slope

    (Intercept)           x

    # confint(m2)   # output the CI for the parameters in the fitted model
    anova(m2)

    Analysis of Variance Table

    Response: newy
              Df Sum Sq Mean Sq F value Pr(>F)
    x                                   e-15 ***
    Residuals
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    plot(x, newy, main="Simple Linear Regression of Y=100*log10(boiling) on X=temp",
         xlab="temp", ylab="100*log10(boiling)", pch=18)
    # pch means "plot character"; integer codes run from 0 to 25
    abline(m2, col="red", lwd=2)   # lwd means "line width"

[Figure: scatterplot with fitted line, "Simple Linear Regression of Y=100*log10(boiling) on X=temp"; x-axis temp, y-axis 100*log10(boiling)]
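The estimates and R-squared values that are read off the printed summaries can also be pulled out of the fitted objects directly. The sketch below is an illustration under stated assumptions: the homework data are not reproduced here, so the built-in `cars` dataset (columns `speed` and `dist`) stands in for them, and the names `fit_raw`, `fit_log`, and `r2` are hypothetical.

```r
# 'cars' is a built-in dataset used only as a stand-in for the homework data
fit_raw <- lm(dist ~ speed, data = cars)                   # untransformed response
fit_log <- lm(I(100 * log10(dist)) ~ speed, data = cars)   # 100*log10(response)

# R-squared for each model, taken from the summary object
r2 <- c(raw = summary(fit_raw)$r.squared,
        log = summary(fit_log)$r.squared)
r2

# Intercept and slope as a named vector (same values as fit_raw$coefficients)
coef(fit_raw)
```

R-squared is unit-free, which is why it can be compared across the two response scales, whereas the residual mean squares cannot.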

From the above summary(m2) output, we know that:
(a): Parameter Estimates: and 0.93, so regression line: 100log10(Y)= *X;
(b): The ANOVA (Analysis of Variance) table is obtained above using the anova(m2) code;
(c): R-square=0.8852;
(d): The scatterplot with the fitted line is plotted above.

Solution (one paragraph of text that is interpretation of analysis): Did you notice that the scatter plot of these data reveals two outlying values? Their inclusion may or may not be appropriate. If all n=31 data points are included in the analysis, then the model that explains more of the variability in boiling point is Y=boiling point modeled linearly in X=temp. It has a greater R^2 (92.07% vs 88.52%). Be careful - it would not make sense to compare the residual mean squares of the two models, because the scales of measurement involved are different.

For HW3 Q1, Q4, Q5, please refer to the other version of the solution by Prof. Carol Bigelow. Here we will mainly use R to solve Q2 and Q3.

Q3: Use R to reproduce the ANOVA tables you worked with in Q2. Let's download the dataset.

    library(foreign)
    url <- "
    dat <- read.dta(file = url)   # read.dta: Read Stata Binary Files
    # dat
    head(dat)

      id y x1 x2

    dim(dat)

    [1] 15  4

    str(dat)

    'data.frame': 15 obs. of 4 variables:
     $ id: num
     $ y : num
     $ x1: num
     $ x2: num
     - attr(*, "datalabel")= chr "PubHlth 640 Unit 2 Regression - Larvae data"
     - attr(*, "time.stamp")= chr "10 Feb :36"
     - attr(*, "formats")= chr "%9.0g" "%9.0g" "%9.0g" "%9.0g"
     - attr(*, "types")= int
     - attr(*, "val.labels")= chr "" "" "" ""
     - attr(*, "var.labels")= chr "larva id" "log10(survival)" "log10(dose)" "log10(weight)"
     - attr(*, "expansion.fields")=List of 2
      ..$ : chr "_dta" "note1" "\"Week 3 homework assignment exercises 2 and 3\""
      ..$ : chr "_dta" "note0" "1"
     - attr(*, "version")= int 12

    summary(dat)

           id              y               x1            x2
     Min.   : 1.0   Min.   :2.351   Min.   :      Min.   :
     1st Qu.: 4.5   1st Qu.:        1st Qu.:      1st Qu.:
     Median : 8.0   Median :2.452   Median :      Median :
     Mean   : 8.0   Mean   :2.567   Mean   :      Mean   :
     3rd Qu.:11.5   3rd Qu.:        3rd Qu.:      3rd Qu.:
     Max.   :15.0   Max.   :2.966   Max.   :      Max.   :

    # model regression on x1 alone
    m1 <- lm(y ~ x1, data=dat)
    summary(m1)

    Call:
    lm(formula = y ~ x1, data = dat)

    Residuals:
        Min      1Q  Median      3Q     Max

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                             e-15 ***
    x1                                      e-05 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error:  on 13 degrees of freedom
    Multiple R-squared:  , Adjusted R-squared:
    F-statistic:  on 1 and 13 DF,  p-value: 7.944e-05

    m1$coefficients   # output the intercept and slope

    (Intercept)          x1

    confint(m1)   # output the CI for the parameters in the fitted model

                2.5 % 97.5 %
    (Intercept)
    x1

    anova(m1)

    Analysis of Variance Table

    Response: y
              Df Sum Sq Mean Sq F value Pr(>F)
    x1                                  e-05 ***
    Residuals
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    # model regression on x2 alone
    m2 <- lm(y ~ x2, data=dat)
    summary(m2)

    Call:
    lm(formula = y ~ x2, data = dat)

    Residuals:
        Min      1Q  Median      3Q     Max

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                             e-13 ***
    x2                                           ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error:  on 13 degrees of freedom
    Multiple R-squared:  , Adjusted R-squared:
    F-statistic:  on 1 and 13 DF,  p-value:

    m2$coefficients   # output the intercept and slope

    (Intercept)          x2

    confint(m2)   # output the CI for the parameters in the fitted model

                2.5 % 97.5 %
    (Intercept)
    x2

    anova(m2)

    Analysis of Variance Table

    Response: y
              Df Sum Sq Mean Sq F value Pr(>F)
    x2                                       ***
    Residuals
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    # model regression on x1 and x2
    m3 <- lm(y ~ x1 + x2, data=dat)
    summary(m3)

    Call:
    lm(formula = y ~ x1 + x2, data = dat)

    Residuals:
        Min      1Q  Median      3Q     Max

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                             e-13 ***
    x1                                      e-05 ***
    x2                                           ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error:  on 12 degrees of freedom
    Multiple R-squared:  , Adjusted R-squared:
    F-statistic:  on 2 and 12 DF,  p-value: 6.085e-07

    m3$coefficients   # output the intercept and slopes

    (Intercept)          x1          x2

    confint(m3)   # output the CI for the parameters in the fitted model

                2.5 % 97.5 %
    (Intercept)
    x1
    x2

    anova(m3)

    Analysis of Variance Table

    Response: y
              Df Sum Sq Mean Sq F value Pr(>F)
    x1                                  e-07 ***
    x2                                       ***
    Residuals
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

As you can see above, we ran almost exactly the same code three times, once each for m1, m2, and m3. This is a good opportunity to write a function in R.

    model.regress <- function(x) {
      a <- summary(x)
      b <- x$coefficients
      c <- confint(x)
      d <- anova(x)
      print(a); print(b); print(c); print(d)
      # the last line typically specifies what one wants the function to print
    }

    m1 <- lm(y ~ x1, data=dat)
    m2 <- lm(y ~ x2, data=dat)
    m3 <- lm(y ~ x1 + x2, data=dat)
    model.regress(m1)

    Call:
    lm(formula = y ~ x1, data = dat)

    Residuals:
        Min      1Q  Median      3Q     Max

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                             e-15 ***
    x1                                      e-05 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error:  on 13 degrees of freedom
    Multiple R-squared:  , Adjusted R-squared:
    F-statistic:  on 1 and 13 DF,  p-value: 7.944e-05

    (Intercept)          x1

                2.5 % 97.5 %
    (Intercept)

    x1

    Analysis of Variance Table

    Response: y
              Df Sum Sq Mean Sq F value Pr(>F)
    x1                                  e-05 ***
    Residuals
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    # model.regress(m2)
    # model.regress(m3)
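One possible variation on the function above, offered as a sketch and not as the lab's method: return the four pieces in a named list instead of printing them, and use `lapply()` to run the function over all the models at once. Because the larvae dataset is not reproduced here, the built-in `mtcars` data stand in for it; `model_regress`, `fits`, and `results` are names made up for this example.

```r
# Collect the same four pieces as model.regress, but return them in a list
model_regress <- function(fit) {
  list(summary      = summary(fit),
       coefficients = coef(fit),
       confint      = confint(fit),
       anova        = anova(fit))
}

# mtcars is a built-in dataset used only as a stand-in for the homework data
fits <- list(
  m1 = lm(mpg ~ wt, data = mtcars),
  m2 = lm(mpg ~ hp, data = mtcars),
  m3 = lm(mpg ~ wt + hp, data = mtcars)
)

# Apply the function to every model; the result is a list named m1, m2, m3
results <- lapply(fits, model_regress)
results$m3$confint   # confidence intervals for the two-predictor model
```

Returning a list keeps each piece available for later use, for example `results$m1$anova`, while printing any element reproduces the output shown earlier.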

First Session
Steele H. Valenzuela
January 2015

Downloading and Installing R and RStudio (for PC/Mac)

If you have an Ubuntu operating system (OS) or any other OS aside from a PC/Mac, then you must love Linux and/or working in the terminal. That's impressive, and you most likely can skip ahead to the section titled First Session.

In order to download R, please see their website. The website appears underdeveloped but you're at the right spot. Under the heading Download and Install R, you'll see the links to download R for your respective OS.

In order to download RStudio (a fancy graphical user interface [GUI]), please see their website. This website is not underdeveloped but rather crisp. Moving on, under the heading Installers for ALL Platforms, you'll see the links to download RStudio for your respective OS. For your respective OS, it should be the first or the second link.

The best explanation as to why you need both is this: RStudio is the fancy and friendly GUI that keeps you sane and organized and provides an inviting workplace, hence the very point of a GUI. You CANNOT use RStudio without R. R is what is under the hood, the engine per se, and RStudio is the interior and exterior of your car/truck/Italian moped. I am automobile illiterate, so please don't make me continue with this metaphor.

Launching and Exiting RStudio

Launching RStudio

For PC users, please do the following:
Start > All Programs > RStudio

For Mac users, please do the following:
Applications > RStudio; or
Click on the RStudio icon shortcut on your dock; or
In Spotlight search (Cmd + Spacebar, or the magnifying glass in the upper right corner), enter RStudio and press return.

Once in a while, you will be asked to update RStudio. Please do so.

Exiting RStudio

From the toolbar at the top, at the far left: Click RStudio. From the drop-down menu, click Quit RStudio.
For PC, the shortcut is [insert shortcut here]. For Macs, the shortcut is Cmd + Q.

Toolbar and Windows

Upon starting RStudio, it will be intimidating, but do not fear, your loyal TA is here. There is a lot to grasp, but getting your feet wet is the best way to progress with R.

Toolbar

There are many options present in the toolbar, but for now, there are 4 commands we will highlight.

The first command is under the following:
File > New File
From there, one should see R Script. Similar to Stata's do-file, a script allows you to write notes, commands, and anything else one may desire for a session.

The second command is near the first:
File > New Project
From there, a box will pop up, asking you to Create a Project From. Choose the first option, New Directory, followed by Empty Directory, and give your project a name. For the sake of this class, name your project by the respective week of the class or the session number.

The third command is under the following:
Tools > Import Dataset > From Text File... / From Web URL...
Uploading the data can be painful, and if it takes you longer than 10 minutes, please consult me or any online help forums. Unless one is scraping data from Twitter or using some other complex method, it should be relatively easy. If need be, copy and paste into Excel, save as a comma separated value (CSV) file, and then import. You'll also notice the shortcut symbol for Import Dataset, a sheet and an arrow.

The fourth command is under the following:
Tools > Global Options
For now, Appearance and Pane Layout are readily available to change the appearance of RStudio, such as the size of the text, the color scheme of the script and code, and various other options.
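Returning to the third command, Import Dataset: a CSV file can also be read from a script, which keeps the import step reproducible. The sketch below is self-contained under stated assumptions: it first writes a tiny made-up table to a temporary file, so the file name `example.csv`, the data frame `toy`, and its values are all hypothetical.

```r
# Write a tiny made-up table to a temporary CSV file so the example is self-contained
csv_path <- file.path(tempdir(), "example.csv")
toy <- data.frame(id = 1:3, weight = c(61.5, 72.0, 58.3))
write.csv(toy, csv_path, row.names = FALSE)

# Read the CSV back in; this is roughly what importing a text file does
imported <- read.csv(csv_path)
str(imported)
```

For a real file, replace `csv_path` with the path to your own CSV.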

RStudio Layout

Upon starting RStudio, you will see a layout similar to this (enter image). One half of the computer monitor is devoted to the source code, scripts, and the console, which runs the code. On the other half of the monitor, you will see an environment where objects such as datasets and functions are stored, as well as your files, plots, packages, and other tabs that you may explore.

First Session

Key to Colors

The default color theme for RStudio is TextMate. If you recall, you may change the theme under Tools > Global Options > Appearance.

Green - These are comments. In R, lines that begin with a number sign/pound sign/hashtag are comments.
Blue - These are commands. Each command begins with an angle bracket ( > ).
Black - This is R output. Compare it with the output you get.

Preliminaries (working directory, start log, input data)

In order to start or track a log, there is something called git. You may have heard of GitHub. We'll talk about that at a later time. For now, simply start a new file or script to save your notes and commands.

Your working directory is the default location to save and generate files, datasets, etc. Here are two helpful commands.

    # The following command displays your current working directory.
    getwd()   # default location will be where project is specified

    # The following command changes your current working directory to whatever you set it to.
    setwd(...)   # this is a pain to write out but it'll become useful

R has the ability to import data from a web URL as well as a text file (CSV, comma delimited or tab files, etc.).

R has packages, collections of commands and datasets developed by users from all over the world. Before loading a package with library(), you must first install it: install.packages("package name").

To load the same dataset used in the Stata handout, use the following commands.

    library(foreign)   # a package that translates Stata, SAS, or SPSS data into R
    stata <- "   # Web URL, note.../ivf.dta
    ivf <- read.dta(file = stata)   # ivf is the name of the dataset whereas read.dta is the command

As I'm sure you've noticed, the symbol <- is how objects are stored in R.

View Data Structure

Here are commands to view the structure of the data.

    str(ivf)   # the structure of dataset ivf

    'data.frame': 641 obs. of 6 variables:
     $ id     : num
     $ matage : int
     $ hyp    : int
     $ gestwks: num
     $ sex    : Factor w/ 2 levels "male","female":
     $ bweight: int
     - attr(*, "datalabel")= chr "In Vitro Fertilization data"
     - attr(*, "time.stamp")= chr "27 Aug :11"
     - attr(*, "formats")= chr "%9.0g" "%8.0g" "%8.0g" "%9.0g"...
     - attr(*, "types")= int
     - attr(*, "val.labels")= chr "" "" "" ""...
     - attr(*, "var.labels")= chr "identity number" "maternal age (years)" "hypertension (1=yes, 0=no)"
     - attr(*, "version")= int 7
     - attr(*, "label.table")=List of 1
      ..$ sex: Named int
      .. ..- attr(*, "names")= chr "male" "female"

    dim(ivf)   # displays the dimensions of dataset ivf

    [1] 641   6

    names(ivf)   # displays variable/column names

    [1] "id" "matage" "hyp" "gestwks" "sex" "bweight"

    summary(ivf)   # summary of ivf

           id          matage        hyp            gestwks         sex
     Min.   :  1   Min.   :23   Min.   :0.000   Min.   :24.7   male  :326
     1st Qu.:161   1st Qu.:31   1st Qu.:        1st Qu.:38.0   female:315
     Median :321   Median :34   Median :0.000   Median :39.1
     Mean   :321   Mean   :34   Mean   :0.139   Mean   :38.7
     3rd Qu.:481   3rd Qu.:37   3rd Qu.:        3rd Qu.:40.1
     Max.   :641   Max.   :43   Max.   :1.000   Max.   :42.4
     NA's   :2
        bweight
     Min.   : 630
     1st Qu.:2850
     Median :3200
     Mean   :3129
     3rd Qu.:3550
     Max.   :4650

    summary(ivf$sex)   # factor summary of categorical variable sex

      male female
       326    315

    summary(ivf$bweight)   # numerical summary of variable bweight

       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
        630    2850    3200    3129    3550    4650

Examine Data

The simplest way to view your data is with the following command:

    View(ivf)   # will display a table in the same pane as scripts

To view snippets of your data, use the following commands:

    head(ivf)   # default display is 6 rows

      id matage hyp gestwks    sex bweight
                            female
                            female
                            female
                              male
                            female
                              male    3260

    tail(ivf, 10)   # displays tail end of ivf, more specifically, the last 10 rows

      id matage hyp gestwks    sex bweight
                            female
                            female
                            female
                              male

                              male
                            female
                              male
                            female
                            female
                              male    2920

Lastly, you may view specific rows and columns as such:

    ivf[10, 6]   # 10th row, 6th column
    ivf[325, ]   # 325th row, all columns
    ivf[, 4]     # all 641 rows, 4th column

Single Variable Description

In R, if one would like to highlight a single variable, use the following syntax:

    mean(ivf$bweight)   # dataset followed by $ followed by variable name

    [1] 3129

If one would like to explore a single variable, here are some commands:

    min(ivf$matage); max(ivf$matage)   # similar to SAS, you may use a semi-colon to separate commands

    [1] 23
    [1] 43

    range(ivf$gestwks)   # range

    [1]

    mean(ivf$gestwks); median(ivf$gestwks)   # mean and median

    [1]
    [1]

    sd(ivf$gestwks)   # standard deviation

    [1] 2.33

    table(ivf$sex)   # frequency

      male female
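The indexing and `$` syntax from this section can be tried without downloading the ivf data. The following sketch uses a small made-up data frame; `toy` and all of its values are hypothetical and only mimic three of the ivf column names.

```r
# A small made-up data frame; the values are hypothetical
toy <- data.frame(
  id      = 1:5,
  sex     = factor(c("male", "female", "female", "male", "female")),
  bweight = c(3100, 2900, 3300, 2750, 3500)
)

toy[2, 3]           # 2nd row, 3rd column
toy[4, ]            # 4th row, all columns
toy[, 3]            # all rows, 3rd column
mean(toy$bweight)   # dataset, then $, then variable name
table(toy$sex)      # frequency of each level of sex
```

Columns can also be selected by name, as in `toy[, "bweight"]`, which tends to be more robust than a column number if the column order changes.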

The base graphics in R are really just that, basic, but in the coming weeks, new graphical packages will be explained and implemented. For now, here are a few commands:

    hist(ivf$bweight)   # histogram

[Figure: "Histogram of ivf$bweight"; y-axis Frequency]

    boxplot(ivf$bweight)   # boxplot

[Figure: box plot of ivf$bweight]

    stem(ivf$bweight)   # stem and leaf plot...not sure if you'll need to ever use this

    The decimal point is 2 digit(s) to the right of the |

Two Variable Descriptives

Here are a few commands for two variable descriptives:

    table(ivf$sex, ivf$hyp)   # 2-way table

             0 1
      male
      female

    table(ivf$sex, ivf$hyp)/641   # proportions computed manually by dividing by the number of rows

             0 1
      male
      female

And some graphical commands for two variable descriptives:

    boxplot(ivf$gestwks ~ ivf$sex)   # one continuous variable by one categorical variable
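The manual division in the two-way table above has a built-in counterpart: base R's `prop.table()` converts a table of counts into proportions, either overall or within rows or columns. The sketch below uses made-up vectors, so `sex`, `hyp`, and their values are hypothetical stand-ins for the ivf columns.

```r
# Made-up data standing in for ivf$sex and ivf$hyp
sex <- factor(c("male", "male", "female", "female",
                "female", "male", "female", "male"))
hyp <- c(0, 1, 0, 0, 1, 0, 0, 0)

counts <- table(sex, hyp)                  # 2-way table of counts
cell_prop <- prop.table(counts)            # each cell divided by the grand total
row_prop <- prop.table(counts, margin = 1) # each cell divided by its row total

cell_prop
round(100 * row_prop, 1)                   # row percentages, one decimal place
```

Using `margin = 2` instead gives column proportions.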

[Figure: box plots of gestwks by sex (male, female)]

    plot(ivf$gestwks, ivf$bweight, main = "insert main title here")   # simple plot command

[Figure: scatterplot titled "insert main title here"; x-axis ivf$gestwks, y-axis ivf$bweight]

    plot(ivf$gestwks, ivf$bweight, xlab = "x-axis label here", ylab = "y-axis label here")

[Figure: scatterplot with axis labels "x-axis label here" and "y-axis label here"]

We can already see that in terms of summary statistics, Stata can be quite useful, so no hard feelings if you use Stata in the beginning.

Very Important Information

In RStudio, under one of the panes, you will see a tab for files. Depending on your working directory, you may see your own files from another project. DO NOT DELETE THESE FILES AS THEY WILL BE PERMANENTLY DELETED. This happened to a student from a class last semester...

There is a lot I have not included because there is simply too much material. If you are ever confused about a command, type the following:

    ?plot    # this will direct you to a file under the help tab
    ??plot   # a more general search containing the words after ??

Lastly, there are many resources online, such as Stack Overflow, to name just one. Professor Nicholas Reich taught a course last semester in R and his website is at this link.


More information

Advanced Econometric Methods EMET3011/8014

Advanced Econometric Methods EMET3011/8014 Advanced Econometric Methods EMET3011/8014 Lecture 2 John Stachurski Semester 1, 2011 Announcements Missed first lecture? See www.johnstachurski.net/emet Weekly download of course notes First computer

More information

Lab 1 Introduction to R

Lab 1 Introduction to R Lab 1 Introduction to R Date: August 23, 2011 Assignment and Report Due Date: August 30, 2011 Goal: The purpose of this lab is to get R running on your machines and to get you familiar with the basics

More information

An Introductory Guide to R

An Introductory Guide to R An Introductory Guide to R By Claudia Mahler 1 Contents Installing and Operating R 2 Basics 4 Importing Data 5 Types of Data 6 Basic Operations 8 Selecting and Specifying Data 9 Matrices 11 Simple Statistics

More information

TYPES OF VARIABLES, STRUCTURE OF DATASETS, AND BASIC STATA LAYOUT

TYPES OF VARIABLES, STRUCTURE OF DATASETS, AND BASIC STATA LAYOUT PRIMER FOR ACS OUTCOMES RESEARCH COURSE: TYPES OF VARIABLES, STRUCTURE OF DATASETS, AND BASIC STATA LAYOUT STEP 1: Install STATA statistical software. STEP 2: Read through this primer and complete the

More information

BIOSTATISTICS LABORATORY PART 1: INTRODUCTION TO DATA ANALYIS WITH STATA: EXPLORING AND SUMMARIZING DATA

BIOSTATISTICS LABORATORY PART 1: INTRODUCTION TO DATA ANALYIS WITH STATA: EXPLORING AND SUMMARIZING DATA BIOSTATISTICS LABORATORY PART 1: INTRODUCTION TO DATA ANALYIS WITH STATA: EXPLORING AND SUMMARIZING DATA Learning objectives: Getting data ready for analysis: 1) Learn several methods of exploring the

More information

Brief Guide on Using SPSS 10.0

Brief Guide on Using SPSS 10.0 Brief Guide on Using SPSS 10.0 (Use student data, 22 cases, studentp.dat in Dr. Chang s Data Directory Page) (Page address: http://www.cis.ysu.edu/~chang/stat/) I. Processing File and Data To open a new

More information

Exercise 2.23 Villanova MAT 8406 September 7, 2015

Exercise 2.23 Villanova MAT 8406 September 7, 2015 Exercise 2.23 Villanova MAT 8406 September 7, 2015 Step 1: Understand the Question Consider the simple linear regression model y = 50 + 10x + ε where ε is NID(0, 16). Suppose that n = 20 pairs of observations

More information

An Introduction to R- Programming

An Introduction to R- Programming An Introduction to R- Programming Hadeel Alkofide, Msc, PhD NOT a biostatistician or R expert just simply an R user Some slides were adapted from lectures by Angie Mae Rodday MSc, PhD at Tufts University

More information

Tutorial 3: Probability & Distributions Johannes Karreth RPOS 517, Day 3

Tutorial 3: Probability & Distributions Johannes Karreth RPOS 517, Day 3 Tutorial 3: Probability & Distributions Johannes Karreth RPOS 517, Day 3 This tutorial shows you: how to simulate a random process how to plot the distribution of a variable how to assess the distribution

More information

1 Introduction. 1.1 What is Statistics?

1 Introduction. 1.1 What is Statistics? 1 Introduction 1.1 What is Statistics? MATH1015 Biostatistics Week 1 Statistics is a scientific study of numerical data based on natural phenomena. It is also the science of collecting, organising, interpreting

More information

Statistical Package for the Social Sciences INTRODUCTION TO SPSS SPSS for Windows Version 16.0: Its first version in 1968 In 1975.

Statistical Package for the Social Sciences INTRODUCTION TO SPSS SPSS for Windows Version 16.0: Its first version in 1968 In 1975. Statistical Package for the Social Sciences INTRODUCTION TO SPSS SPSS for Windows Version 16.0: Its first version in 1968 In 1975. SPSS Statistics were designed INTRODUCTION TO SPSS Objective About the

More information

Coding & Data Skills for Communicators Dr. Cindy Royal Texas State University - San Marcos School of Journalism and Mass Communication

Coding & Data Skills for Communicators Dr. Cindy Royal Texas State University - San Marcos School of Journalism and Mass Communication Coding & Data Skills for Communicators Dr. Cindy Royal Texas State University - San Marcos School of Journalism and Mass Communication Spreadsheet Basics Excel is a powerful productivity tool. It s a spreadsheet

More information

Stata versions 12 & 13 Week 4 - Practice Problems

Stata versions 12 & 13 Week 4 - Practice Problems Stata versions 12 & 13 Week 4 - Practice Problems DUE: Monday February 24, 2014 Last submission date for credit: Monday March 3, 2014 1 Practice Screen Capture a Create a word document Name it using the

More information

An introduction to plotting data

An introduction to plotting data An introduction to plotting data Eric D. Black California Institute of Technology February 25, 2014 1 Introduction Plotting data is one of the essential skills every scientist must have. We use it on a

More information

Introduction to R, Github and Gitlab

Introduction to R, Github and Gitlab Introduction to R, Github and Gitlab 27/11/2018 Pierpaolo Maisano Delser mail: maisanop@tcd.ie ; pm604@cam.ac.uk Outline: Why R? What can R do? Basic commands and operations Data analysis in R Github and

More information

610 R12 Prof Colleen F. Moore Analysis of variance for Unbalanced Between Groups designs in R For Psychology 610 University of Wisconsin--Madison

610 R12 Prof Colleen F. Moore Analysis of variance for Unbalanced Between Groups designs in R For Psychology 610 University of Wisconsin--Madison 610 R12 Prof Colleen F. Moore Analysis of variance for Unbalanced Between Groups designs in R For Psychology 610 University of Wisconsin--Madison R is very touchy about unbalanced designs, partly because

More information

Doctoral Program in Epidemiology for Clinicians, April 2001 Computing notes

Doctoral Program in Epidemiology for Clinicians, April 2001 Computing notes Doctoral Program in Epidemiology for Clinicians, April 2001 Computing notes Paul Dickman, Rino Bellocco April 18, 2001 We will be using the computer teaching room located on the second floor of Norrbacka,

More information

LECTURE NOTES FOR ECO231 COMPUTER APPLICATIONS I. Part Two. Introduction to R Programming. RStudio. November Written by. N.

LECTURE NOTES FOR ECO231 COMPUTER APPLICATIONS I. Part Two. Introduction to R Programming. RStudio. November Written by. N. LECTURE NOTES FOR ECO231 COMPUTER APPLICATIONS I Part Two Introduction to R Programming RStudio November 2016 Written by N.Nilgün Çokça Introduction to R Programming 5 Installing R & RStudio 5 The R Studio

More information

Computing With R Handout 1

Computing With R Handout 1 Computing With R Handout 1 Getting Into R To access the R language (free software), go to a computing lab that has R installed, or a computer on which you have downloaded R from one of the distribution

More information

Dataset Used in This Lab (download from course website framingham_1000.rdata

Dataset Used in This Lab (download from course website   framingham_1000.rdata Introduction to R and R- Studio Sring 2019 Lab #1 Some Basics Before you begin: If you have not already installed R and RStudio, lease see Windows Users: htt://eole.umass.edu/bie540w/df/how%20to%20install%20r%20and%20r%20studio%20windows%20users%20fall%20201

More information

Stata: A Brief Introduction Biostatistics

Stata: A Brief Introduction Biostatistics Stata: A Brief Introduction Biostatistics 140.621 2005-2006 1. Statistical Packages There are many statistical packages (Stata, SPSS, SAS, Splus, etc.) Statistical packages can be used for Analysis Data

More information

= 3 + (5*4) + (1/2)*(4/2)^2.

= 3 + (5*4) + (1/2)*(4/2)^2. Physics 100 Lab 1: Use of a Spreadsheet to Analyze Data by Kenneth Hahn and Michael Goggin In this lab you will learn how to enter data into a spreadsheet and to manipulate the data in meaningful ways.

More information

STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, Steno Diabetes Center June 11, 2015

STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, Steno Diabetes Center June 11, 2015 STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, tsvv@steno.dk, Steno Diabetes Center June 11, 2015 Contents 1 Introduction 1 2 Recap: Variables 2 3 Data Containers 2 3.1 Vectors................................................

More information

Homework 1 Excel Basics

Homework 1 Excel Basics Homework 1 Excel Basics Excel is a software program that is used to organize information, perform calculations, and create visual displays of the information. When you start up Excel, you will see the

More information

LAB #1: DESCRIPTIVE STATISTICS WITH R

LAB #1: DESCRIPTIVE STATISTICS WITH R NAVAL POSTGRADUATE SCHOOL LAB #1: DESCRIPTIVE STATISTICS WITH R Statistics (OA3102) Lab #1: Descriptive Statistics with R Goal: Introduce students to various R commands for descriptive statistics. Lab

More information

IQR = number. summary: largest. = 2. Upper half: Q3 =

IQR = number. summary: largest. = 2. Upper half: Q3 = Step by step box plot Height in centimeters of players on the 003 Women s Worldd Cup soccer team. 157 1611 163 163 164 165 165 165 168 168 168 170 170 170 171 173 173 175 180 180 Determine the 5 number

More information

Statistics 13, Lab 1. Getting Started. The Mac. Launching RStudio and loading data

Statistics 13, Lab 1. Getting Started. The Mac. Launching RStudio and loading data Statistics 13, Lab 1 Getting Started This first lab session is nothing more than an introduction: We will help you navigate the Statistics Department s (all Mac) computing facility and we will get you

More information

Section 2.3: Simple Linear Regression: Predictions and Inference

Section 2.3: Simple Linear Regression: Predictions and Inference Section 2.3: Simple Linear Regression: Predictions and Inference Jared S. Murray The University of Texas at Austin McCombs School of Business Suggested reading: OpenIntro Statistics, Chapter 7.4 1 Simple

More information

9.1 Random coefficients models Constructed data Consumer preference mapping of carrots... 10

9.1 Random coefficients models Constructed data Consumer preference mapping of carrots... 10 St@tmaster 02429/MIXED LINEAR MODELS PREPARED BY THE STATISTICS GROUPS AT IMM, DTU AND KU-LIFE Module 9: R 9.1 Random coefficients models...................... 1 9.1.1 Constructed data........................

More information

1 Introduction to Using Excel Spreadsheets

1 Introduction to Using Excel Spreadsheets Survey of Math: Excel Spreadsheet Guide (for Excel 2007) Page 1 of 6 1 Introduction to Using Excel Spreadsheets This section of the guide is based on the file (a faux grade sheet created for messing with)

More information

Lab 1: Getting started with R and RStudio Questions? or

Lab 1: Getting started with R and RStudio Questions? or Lab 1: Getting started with R and RStudio Questions? david.montwe@ualberta.ca or isaacren@ualberta.ca 1. Installing R and RStudio To install R, go to https://cran.r-project.org/ and click on the Download

More information

STATA 13 INTRODUCTION

STATA 13 INTRODUCTION STATA 13 INTRODUCTION Catherine McGowan & Elaine Williamson LONDON SCHOOL OF HYGIENE & TROPICAL MEDICINE DECEMBER 2013 0 CONTENTS INTRODUCTION... 1 Versions of STATA... 1 OPENING STATA... 1 THE STATA

More information

Illustrations - Simple and Multiple Linear Regression Steele H. Valenzuela February 18, 2015

Illustrations - Simple and Multiple Linear Regression Steele H. Valenzuela February 18, 2015 Illustrations - Simple and Multiple Linear Regression Steele H. Valenzuela February 18, 2015 Illustrations for Simple and Multiple Linear Regression February 2015 Simple Linear Regression 1. Introduction

More information

No Name What it does? 1 attach Attach your data frame to your working environment. 2 boxplot Creates a boxplot.

No Name What it does? 1 attach Attach your data frame to your working environment. 2 boxplot Creates a boxplot. No Name What it does? 1 attach Attach your data frame to your working environment. 2 boxplot Creates a boxplot. 3 confint A metafor package function that gives you the confidence intervals of effect sizes.

More information

Orange Juice data. Emanuele Taufer. 4/12/2018 Orange Juice data (1)

Orange Juice data. Emanuele Taufer. 4/12/2018 Orange Juice data (1) Orange Juice data Emanuele Taufer file:///c:/users/emanuele.taufer/google%20drive/2%20corsi/5%20qmma%20-%20mim/0%20labs/l10-oj-data.html#(1) 1/31 Orange Juice Data The data contain weekly sales of refrigerated

More information

Week 1: Introduction to R, part 1

Week 1: Introduction to R, part 1 Week 1: Introduction to R, part 1 Goals Learning how to start with R and RStudio Use the command line Use functions in R Learning the Tools What is R? What is RStudio? Getting started R is a computer program

More information

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Contents 1 Introduction to Using Excel Spreadsheets 2 1.1 A Serious Note About Data Security.................................... 2 1.2

More information

BIO 360: Vertebrate Physiology Lab 9: Graphing in Excel. Lab 9: Graphing: how, why, when, and what does it mean? Due 3/26

BIO 360: Vertebrate Physiology Lab 9: Graphing in Excel. Lab 9: Graphing: how, why, when, and what does it mean? Due 3/26 Lab 9: Graphing: how, why, when, and what does it mean? Due 3/26 INTRODUCTION Graphs are one of the most important aspects of data analysis and presentation of your of data. They are visual representations

More information

> glucose = c(81, 85, 93, 93, 99, 76, 75, 84, 78, 84, 81, 82, 89, + 81, 96, 82, 74, 70, 84, 86, 80, 70, 131, 75, 88, 102, 115, + 89, 82, 79, 106)

> glucose = c(81, 85, 93, 93, 99, 76, 75, 84, 78, 84, 81, 82, 89, + 81, 96, 82, 74, 70, 84, 86, 80, 70, 131, 75, 88, 102, 115, + 89, 82, 79, 106) This document describes how to use a number of R commands for plotting one variable and for calculating one variable summary statistics Specifically, it describes how to use R to create dotplots, histograms,

More information

R Workshop Guide. 1 Some Programming Basics. 1.1 Writing and executing code in R

R Workshop Guide. 1 Some Programming Basics. 1.1 Writing and executing code in R R Workshop Guide This guide reviews the examples we will cover in today s workshop. It should be a helpful introduction to R, but for more details, you can access a more extensive user guide for R on the

More information

Stat 579: More Preliminaries, Reading from Files

Stat 579: More Preliminaries, Reading from Files Stat 579: More Preliminaries, Reading from Files Ranjan Maitra 2220 Snedecor Hall Department of Statistics Iowa State University. Phone: 515-294-7757 maitra@iastate.edu September 1, 2011, 1/10 Some more

More information

Statistics Lab #7 ANOVA Part 2 & ANCOVA

Statistics Lab #7 ANOVA Part 2 & ANCOVA Statistics Lab #7 ANOVA Part 2 & ANCOVA PSYCH 710 7 Initialize R Initialize R by entering the following commands at the prompt. You must type the commands exactly as shown. options(contrasts=c("contr.sum","contr.poly")

More information

SPSS. (Statistical Packages for the Social Sciences)

SPSS. (Statistical Packages for the Social Sciences) Inger Persson SPSS (Statistical Packages for the Social Sciences) SHORT INSTRUCTIONS This presentation contains only relatively short instructions on how to perform basic statistical calculations in SPSS.

More information

Lab 1 (fall, 2017) Introduction to R and R Studio

Lab 1 (fall, 2017) Introduction to R and R Studio Lab 1 (fall, 201) Introduction to R and R Studio Introduction: Today we will use R, as presented in the R Studio environment (or front end), in an introductory setting. We will make some calculations,

More information

Chapter 6: DESCRIPTIVE STATISTICS

Chapter 6: DESCRIPTIVE STATISTICS Chapter 6: DESCRIPTIVE STATISTICS Random Sampling Numerical Summaries Stem-n-Leaf plots Histograms, and Box plots Time Sequence Plots Normal Probability Plots Sections 6-1 to 6-5, and 6-7 Random Sampling

More information

Orientation Assignment for Statistics Software (nothing to hand in) Mary Parker,

Orientation Assignment for Statistics Software (nothing to hand in) Mary Parker, Orientation to MINITAB, Mary Parker, mparker@austincc.edu. Last updated 1/3/10. page 1 of Orientation Assignment for Statistics Software (nothing to hand in) Mary Parker, mparker@austincc.edu When you

More information

Statistical Software Camp: Introduction to R

Statistical Software Camp: Introduction to R Statistical Software Camp: Introduction to R Day 1 August 24, 2009 1 Introduction 1.1 Why Use R? ˆ Widely-used (ever-increasingly so in political science) ˆ Free ˆ Power and flexibility ˆ Graphical capabilities

More information

Computing With R Handout 1

Computing With R Handout 1 Computing With R Handout 1 The purpose of this handout is to lead you through a simple exercise using the R computing language. It is essentially an assignment, although there will be nothing to hand in.

More information

SISG/SISMID Module 3

SISG/SISMID Module 3 SISG/SISMID Module 3 Introduction to R Ken Rice Tim Thornton University of Washington Seattle, July 2018 Introduction: Course Aims This is a first course in R. We aim to cover; Reading in, summarizing

More information

R syntax guide. Richard Gonzalez Psychology 613. August 27, 2015

R syntax guide. Richard Gonzalez Psychology 613. August 27, 2015 R syntax guide Richard Gonzalez Psychology 613 August 27, 2015 This handout will help you get started with R syntax. There are obviously many details that I cannot cover in these short notes but these

More information

POL 345: Quantitative Analysis and Politics

POL 345: Quantitative Analysis and Politics POL 345: Quantitative Analysis and Politics Precept Handout 1 Week 2 (Verzani Chapter 1: Sections 1.2.4 1.4.31) Remember to complete the entire handout and submit the precept questions to the Blackboard

More information

Exploring and Understanding Data Using R.

Exploring and Understanding Data Using R. Exploring and Understanding Data Using R. Loading the data into an R data frame: variable

More information

8. MINITAB COMMANDS WEEK-BY-WEEK

8. MINITAB COMMANDS WEEK-BY-WEEK 8. MINITAB COMMANDS WEEK-BY-WEEK In this section of the Study Guide, we give brief information about the Minitab commands that are needed to apply the statistical methods in each week s study. They are

More information

Lab 1. Introduction to R & SAS. R is free, open-source software. Get it here:

Lab 1. Introduction to R & SAS. R is free, open-source software. Get it here: Lab 1. Introduction to R & SAS R is free, open-source software. Get it here: http://tinyurl.com/yfet8mj for your own computer. 1.1. Using R like a calculator Open R and type these commands into the R Console

More information

Logical operators: R provides an extensive list of logical operators. These include

Logical operators: R provides an extensive list of logical operators. These include meat.r: Explanation of code Goals of code: Analyzing a subset of data Creating data frames with specified X values Calculating confidence and prediction intervals Lists and matrices Only printing a few

More information

Intro to Stata for Political Scientists

Intro to Stata for Political Scientists Intro to Stata for Political Scientists Andrew S. Rosenberg Junior PRISM Fellow Department of Political Science Workshop Description This is an Introduction to Stata I will assume little/no prior knowledge

More information

Bernt Arne Ødegaard. 15 November 2018

Bernt Arne Ødegaard. 15 November 2018 R Bernt Arne Ødegaard 15 November 2018 To R is Human 1 R R is a computing environment specially made for doing statistics/econometrics. It is becoming the standard for advanced dealing with empirical data,

More information

Bivariate (Simple) Regression Analysis

Bivariate (Simple) Regression Analysis Revised July 2018 Bivariate (Simple) Regression Analysis This set of notes shows how to use Stata to estimate a simple (two-variable) regression equation. It assumes that you have set Stata up on your

More information

Statistical Bioinformatics (Biomedical Big Data) Notes 2: Installing and Using R

Statistical Bioinformatics (Biomedical Big Data) Notes 2: Installing and Using R Statistical Bioinformatics (Biomedical Big Data) Notes 2: Installing and Using R In this course we will be using R (for Windows) for most of our work. These notes are to help students install R and then

More information

Statistics for Biologists: Practicals

Statistics for Biologists: Practicals Statistics for Biologists: Practicals Peter Stoll University of Basel HS 2012 Peter Stoll (University of Basel) Statistics for Biologists: Practicals HS 2012 1 / 22 Outline Getting started Essentials of

More information

Your Name: Section: INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression

Your Name: Section: INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression Your Name: Section: 36-201 INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression Objectives: 1. To learn how to interpret scatterplots. Specifically you will investigate, using

More information

IN-CLASS EXERCISE: INTRODUCTION TO R

IN-CLASS EXERCISE: INTRODUCTION TO R NAVAL POSTGRADUATE SCHOOL IN-CLASS EXERCISE: INTRODUCTION TO R Survey Research Methods Short Course Marine Corps Combat Development Command Quantico, Virginia May 2013 In-class Exercise: Introduction to

More information

Some issues with R It is command-driven, and learning to use it to its full extent takes some time and effort. The documentation is comprehensive,

Some issues with R It is command-driven, and learning to use it to its full extent takes some time and effort. The documentation is comprehensive, R To R is Human R is a computing environment specially made for doing statistics/econometrics. It is becoming the standard for advanced dealing with empirical data, also in finance. Good parts It is freely

More information

STAT:5400 Computing in Statistics

STAT:5400 Computing in Statistics STAT:5400 Computing in Statistics Introduction to SAS Lecture 18 Oct 12, 2015 Kate Cowles 374 SH, 335-0727 kate-cowles@uiowaedu SAS SAS is the statistical software package most commonly used in business,

More information

Practical 2: Using Minitab (not assessed, for practice only!)

Practical 2: Using Minitab (not assessed, for practice only!) Practical 2: Using Minitab (not assessed, for practice only!) Instructions 1. Read through the instructions below for Accessing Minitab. 2. Work through all of the exercises on this handout. If you need

More information