
BIOSTATS 640 Spring 2018
Introduction to R and R-Studio
Data Description

1. Start of Session
   a. Preliminaries
   b. Install Packages
   c. Attach Packages
2. Load R Data
   a. Load R data frames
   b. Check
3. SINGLE VARIABLE - Discrete
   a. Numerical
   b. Graphical
4. SINGLE VARIABLE - Continuous
   a. Numerical
   b. Graphical
5. TWO VARIABLES - One Continuous, One Discrete
   a. Numerical
   b. Graphical
6. TWO VARIABLES - Both Continuous
   a. XY Scatter Plot

R Data Description II Spring 2018.docx

1. Start of Session

# 1. START OF SESSION

# 1a. Preliminaries
# setwd( ) to set working directory
# Tips: 1) R wants forward slashes and 2) must enclose in quotes
setwd("/users/cbigelow/desktop/")

# Clear workspace
rm(list=ls())

# Turn OFF scientific notation
options(scipen=1000)

# 1b. Install packages (one time)
# Tip - package name MUST be enclosed in quotes and is case-sensitive
# Note - You'll get lots of stuff appearing in your console window.
# Dear Reader - I have commented out these installations because I already have them.
# install.packages("mosaic")
# install.packages("DescTools")
# install.packages("psych")
# install.packages("ggplot2")

# 1c. Attach packages to be used in this session (1x per session)
library(mosaic)
library(DescTools)
library(psych)
library(ggplot2)
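The installation lines above are commented out by hand once the packages are in place. As a hedged alternative, the sketch below checks which packages are missing before anything is installed. The helper name missing_packages is hypothetical (it does not come from any package), and the install.packages() call is left commented out so that running the block has no side effects.

```r
# Hypothetical helper: return the subset of `pkgs` that is not installed.
# requireNamespace() is base R and returns FALSE (rather than stopping
# with an error) when a package cannot be found.
missing_packages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(p) requireNamespace(p, quietly = TRUE),
    logical(1)
  )
  pkgs[!installed]
}

needed <- c("mosaic", "DescTools", "psych", "ggplot2")
to_install <- missing_packages(needed)
to_install

# Uncomment to install only what is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

Attaching with library() is still needed once per session; this check only replaces the manual commenting and uncommenting of the install lines.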

2. Load R Data

# 2a. LOAD R dataframes
# Tip - Be sure you have downloaded the data and placed it on your desktop
# At right, click on the ENVIRONMENT tab to confirm all is well.
setwd("/users/cbigelow/desktop/")
load(file="larvae.rdata")
load(file="ivf.rdata")

# 2b. Check
# str(dataframe) to check structure of dataframe
str(ivf)
## 'data.frame': 641 obs. of 6 variables:
##  $ id     : num
##  $ matage : int
##  $ hyp    : int
##  $ gestwks: num
##  $ sex    : Factor w/ 2 levels "male","female":
##  $ bweight: int
##  - attr(*, "datalabel")= chr "In Vitro Fertilization data"
##  - attr(*, "time.stamp")= chr "14 Feb :55"
##  - attr(*, "formats")= chr "%9.0g" "%8.0g" "%8.0g" "%9.0g"...
##  - attr(*, "types")= int
##  - attr(*, "val.labels")= chr "" "" "" ""...
##  - attr(*, "var.labels")= chr "identity number" "maternal age (years)" "hypertension (1=yes, 0=no)" "gestational age (weeks)"...
##  - attr(*, "version")= int 12
##  - attr(*, "label.table")=List of 1
##   ..$ sex: Named int 1 2
##   .. ..- attr(*, "names")= chr "male" "female"

# head(dataframe) to display first 6 rows
head(ivf)
## id matage hyp gestwks sex bweight
## female 2410
## female 2977
## female 2100
## male 3270
## female 2620
## male 3260

# tail(dataframe) to display last 6 rows
tail(ivf)
## id matage hyp gestwks sex bweight
## male 2972
## female 2850
## male 3182
## female 3048
## female 3183
## male 2920
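load() restores objects under the names they were saved with, so it helps to know what arrived in the workspace. The sketch below illustrates this with a toy data frame and a temporary file; the object name toy and its values are made up for illustration and are not part of the course data.

```r
# Toy data frame (made-up values, not the ivf data)
toy <- data.frame(id = 1:3, bweight = c(2500, 3000, 3500))

# Save to a temporary .Rdata file, remove the object, then load it back
path <- tempfile(fileext = ".Rdata")
save(toy, file = path)
rm(toy)

# load() invisibly returns the names of the objects it restored
loaded <- load(file = path)
loaded

# Confirm the object is back before using it
stopifnot(exists("toy"), is.data.frame(toy))
str(toy)
```

Capturing the return value of load() is a scripted counterpart to checking the ENVIRONMENT tab by eye.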

# summary(dataframe) to get quick descriptives on every variable
summary(ivf)
##        id          matage          hyp          gestwks
##  Min.   :  1   Min.   :23.00   Min.   :      Min.   :24.69
##  1st Qu.:161   1st Qu.:        1st Qu.:      1st Qu.:38.01
##  Median :321   Median :34.00   Median :      Median :39.15
##  Mean   :321   Mean   :33.97   Mean   :      Mean   :38.69
##  3rd Qu.:481   3rd Qu.:        3rd Qu.:      3rd Qu.:40.15
##  Max.   :641   Max.   :43.00   Max.   :      Max.   :42.35
##                                NA's   :2
##      sex         bweight
##  male  :326   Min.   : 630
##  female:315   1st Qu.:2850
##               Median :3200
##               Mean   :3129
##               3rd Qu.:3550
##               Max.   :4650

3. SINGLE VARIABLE - DISCRETE

# 3. SINGLE VARIABLE - DISCRETE
# 3a. Numerical
# 3a.i. Frequency and relative frequency table - brute force
# Step 1: Create columns of table
n <- length(ivf$sex)
sex_freq <- table(ivf$sex)
sex_relfreq <- sex_freq/n
sex_cum <- cumsum(sex_freq)
sex_cumrel <- cumsum(sex_relfreq)

# Step 2: cbind(,,,) to combine the columns into a table
sextable <- cbind(sex_freq, sex_relfreq, sex_cum, sex_cumrel)

# Step 3: colnames(" ", " ", " ") to label columns of table
colnames(sextable) <- c("Freq", "Rel Freq", "Cum Freq", "Cum Rel Freq")

# Step 4: Display table with just 4 digits after the decimal
round(sextable,digits=4)
## Freq Rel Freq Cum Freq Cum Rel Freq
## male
## female

# 3a.ii. Frequency and relative frequency table
# Command Freq() in package=DescTools
# Tip - Turn off scientific notation first
options(scipen=1000)
Freq(ivf$sex)
## level freq perc cumfreq cumperc
## 1 male % %
## 2 female % %

# 3b. Graphical
# 3b.1. Bar Graph of counts
# Command ggplot() + geom_bar() in package=ggplot2
# Basic
ggplot(data=ivf, aes(x=factor(hyp))) + geom_bar()
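The brute-force steps in 3a.i can be wrapped into a single base R function. The sketch below is one possible arrangement, not the handout's own code: the helper name freq_table and the toy vector are hypothetical. It uses prop.table(), so the denominator is the number of non-missing values, whereas length() in Step 1 also counts missing values. The two agree for sex in ivf, which has no missing values.

```r
# Hypothetical helper: frequency table for a discrete variable.
# table() drops NA by default, so relative frequencies are computed
# over the non-missing values only.
freq_table <- function(x) {
  freq <- table(x)
  rel  <- prop.table(freq)
  out <- cbind(
    as.vector(freq),
    as.vector(rel),
    cumsum(as.vector(freq)),
    cumsum(as.vector(rel))
  )
  colnames(out) <- c("Freq", "Rel Freq", "Cum Freq", "Cum Rel Freq")
  rownames(out) <- names(freq)
  out
}

# Toy factor with one missing value (illustrative, not the ivf data)
sex <- factor(c("male", "female", "male", NA, "male"),
              levels = c("male", "female"))
round(freq_table(sex), digits = 4)
```

Setting the factor levels explicitly fixes the row order, which in turn fixes the meaning of the cumulative columns.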

# With aesthetics: I recommend building step by step, then displaying!
p <- ggplot(data=ivf, aes(x=factor(hyp)))
p1 <- p + geom_bar(color="black", fill="blue")
p2 <- p1 + xlab("Hypertension")
p3 <- p2 + ylab("Frequency")
p4 <- p3 + ggtitle("Bar Graph of Hypertension")
p5 <- p4 + theme_bw()
p5

# 3b.2. Bar Graph of percents
p6 <- ggplot(data=ivf, aes(x=hyp))
p7 <- p6 + geom_bar(aes(y = (..count..)/sum(..count..)), color="black", fill="blue")
p8 <- p7 + scale_y_continuous(labels=scales::percent)
p9 <- p8 + xlab("Hypertension")
p10 <- p9 + ylab("Relative Frequency")
p11 <- p10 + ggtitle("Bar Graph of Hypertension")
p12 <- p11 + theme_bw()
p12
## Warning: Removed 2 rows containing non-finite values (stat_count).
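The warning above comes from the missing values in hyp, which ggplot() drops before counting. As an alternative sketch (not the handout's method), the percents can be computed explicitly on the non-missing values and then plotted with geom_col(). The toy vector and the name pct_df are assumptions for illustration, and the plot is drawn only if ggplot2 is installed.

```r
# Toy 0/1 variable with one missing value (illustrative, not the ivf data)
hyp <- c(0, 0, 1, NA, 0, 1, 0, 0)

# Percent of non-missing observations at each level
counts <- table(hyp)                      # NA is dropped by default
pct_df <- data.frame(
  hyp     = names(counts),
  percent = as.vector(counts) / sum(counts)
)
pct_df

# Plot only if ggplot2 is available in this session
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data = pct_df, aes(x = hyp, y = percent)) +
    geom_col(color = "black", fill = "blue") +
    scale_y_continuous(labels = scales::percent) +
    xlab("Hypertension") +
    ylab("Relative Frequency") +
    theme_bw()
  print(p)
}
```

Computing the table first makes the denominator explicit, so it is clear that the percents are out of the non-missing observations.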

4. SINGLE VARIABLE - CONTINUOUS

# 4. SINGLE VARIABLE - CONTINUOUS
# 4a. Numerical
# 4a.1. Five number summary
# Command favstats() in package=mosaic
favstats(~bweight, data=ivf)
## min Q1 median Q3 max mean sd n missing
##

# 4a.2. Quantiles
# Command quantile() in package=mosaic
quantile(~bweight, data=ivf)
## 0% 25% 50% 75% 100%
##

# 4a.3. Basic descriptives
summary(ivf$bweight)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
##

# 4a.4. Detailed descriptives
# Command describe( ) in package=psych
describe(ivf$bweight)
## vars n mean sd median trimmed mad min max range skew
## X
## kurtosis se
## X

# 4b. Graphical
# 4b.1. Box plot
# Command ggplot() + geom_boxplot() in package=ggplot2
# Basic
ggplot(data=ivf, aes(x=1, y=matage)) + geom_boxplot()
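For readers without mosaic installed, a summary with the same column layout as the favstats() output above can be assembled from base R. This is a sketch under assumptions: the helper name fav_stats_base and the toy vector are hypothetical, and the quartiles use the default method of quantile(), which may or may not match other packages in every case.

```r
# Hypothetical helper: a favstats()-style one-row summary in base R.
# Missing values are counted first, then dropped.
fav_stats_base <- function(x) {
  n_missing <- sum(is.na(x))
  x <- x[!is.na(x)]
  q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), names = FALSE)
  data.frame(
    min = q[1], Q1 = q[2], median = q[3], Q3 = q[4], max = q[5],
    mean = mean(x), sd = sd(x), n = length(x), missing = n_missing
  )
}

# Toy birthweights in grams with one missing value (not the ivf data)
bw <- c(2000, 2500, 3000, 3500, 4000, NA)
fav_stats_base(bw)
```

Reporting n and missing side by side, as favstats() does, makes it obvious how many observations each statistic is based on.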

# With aesthetics - step by step
p13 <- ggplot(data=ivf, aes(x=1, y=matage)) + geom_boxplot(color="black", fill="blue")
p14 <- p13 + xlab(".")
p15 <- p14 + ylab("Maternal Age (years)")
p16 <- p15 + ggtitle("Box Plot of Maternal Age")
p17 <- p16 + theme_bw()
p17

# 4b.2. Histogram
# Command ggplot() + geom_histogram() in package=ggplot2
# Basic
ggplot(data=ivf, aes(x=matage)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

# With aesthetics - step by step
p18 <- ggplot(data=ivf, aes(x=matage)) + geom_histogram(color="black", fill="blue", binwidth=2)
p19 <- p18 + xlab("Maternal Age (years)")
p20 <- p19 + ylab("Frequency")
p21 <- p20 + ggtitle("Histogram of Maternal Age")
p22 <- p21 + theme_bw()
p22

5. TWO VARIABLES - One Continuous, One Discrete

# 5. TWO VARIABLES - ONE CONTINUOUS, ONE DISCRETE
# 5a. Numerical
# 5a.1. Detailed descriptives, by group
# Command describeBy() in package=psych
# Tip - Declare your discrete group variable to be factor
library(psych)
ivf$sex <- as.factor(ivf$sex)
describeBy(ivf$bweight, group = ivf$sex, digits = 4)
## $male
## vars n mean sd median trimmed mad min max range skew
## X
## kurtosis se
## X
##
## $female
## vars n mean sd median trimmed mad min max range skew
## X
## kurtosis se
## X
##
## attr(,"call")
## by.default(data = x, INDICES = group, FUN = describe, type = type)

# 5b. Graphical
# 5b.1. Side-by-side box plot
# Command ggplot() + geom_boxplot() in package=ggplot2
# Basic
ggplot(data=ivf, aes(x=as.factor(sex), y=bweight)) + geom_boxplot()
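A smaller by-group summary can also be produced in base R by splitting the continuous variable on the grouping factor. The sketch below shows the idea on toy data; the names toy_bw and by_group and the values are made up for illustration, and the output covers only n, mean and sd rather than everything psych reports.

```r
# Toy data (illustrative values, not the ivf data)
toy_bw <- data.frame(
  sex     = factor(c("male", "male", "female", "female", "female"),
                   levels = c("male", "female")),
  bweight = c(3000, 3400, 2800, 3100, 3400)
)

# split() breaks bweight into one vector per level of sex;
# sapply() then summarises each piece, giving one column per group
by_group <- sapply(
  split(toy_bw$bweight, toy_bw$sex),
  function(v) c(n = length(v), mean = mean(v), sd = sd(v))
)

# Transpose so that each group is a row
by_group <- t(by_group)
by_group
```

The same split-then-summarise pattern extends to any statistic that takes a numeric vector, such as median() or IQR().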

# With aesthetics - step by step
p13 <- ggplot(data=ivf, aes(x=as.factor(sex), y=bweight)) + geom_boxplot(color="black", fill="blue")
p14 <- p13 + xlab("Sex")
p15 <- p14 + ylab("Birthweight (g)")
p16 <- p15 + ggtitle("Side-by-Side Box Plot of Birthweight, by Sex")
p17 <- p16 + theme_bw()
p17

# 5b.2. Histogram, by group - Stacked
# Command ggplot() + geom_histogram() + facet_grid() or facet_wrap() in package=ggplot2
# Basic
ivf$sex <- as.factor(ivf$sex)
ggplot(data=ivf, aes(x=bweight)) + geom_histogram() + facet_grid(sex ~.)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

# With aesthetics - step by step
p18 <- ggplot(data=ivf, aes(x=bweight, fill=sex)) + geom_histogram(color="black", fill="blue", binwidth=200) + facet_wrap(~sex, ncol=1)
p19 <- p18 + xlab("Birthweight (g)")
p20 <- p19 + ylab("Frequency")
p21 <- p20 + ggtitle("Histogram of Birthweight (g), by Sex")
p22 <- p21 + theme_bw()
p22

# 5b.3. Histogram, by group - Overlay
# Command ggplot() + geom_histogram() in package=ggplot2
# With aesthetics - step by step
p23 <- ggplot(data=ivf, aes(x=bweight, fill=sex)) + geom_histogram(position="identity", alpha=0.4, binwidth=200)
p24 <- p23 + xlab("Birthweight (g)")
p25 <- p24 + ylab("Frequency")
p26 <- p25 + ggtitle("Histogram of Birthweight (g), over Sex")
p27 <- p26 + theme_bw()
p27

6. TWO VARIABLES - Both Continuous

# 6. TWO VARIABLES - BOTH CONTINUOUS
# 6a. XY Scatterplot
# Command ggplot() + geom_point() in package=ggplot2
# Basic
ggplot(data=ivf, aes(x=matage, y=bweight)) + geom_point()

# With aesthetics - step by step
p28 <- ggplot(data=ivf, aes(x=matage, y=bweight)) + geom_point(size=0.5, color="blue")
p29 <- p28 + xlab("Maternal Age (years)")
p30 <- p29 + ylab("Birthweight (g)")
p31 <- p30 + ggtitle("Scatterplot of Birthweight in Relationship to Maternal Age")
p32 <- p31 + theme_bw()
p32


More information

Advanced Plotting with ggplot2. Algorithm Design & Software Engineering November 13, 2016 Stefan Feuerriegel

Advanced Plotting with ggplot2. Algorithm Design & Software Engineering November 13, 2016 Stefan Feuerriegel Advanced Plotting with ggplot2 Algorithm Design & Software Engineering November 13, 2016 Stefan Feuerriegel Today s Lecture Objectives 1 Distinguishing different types of plots and their purpose 2 Learning

More information

Chapter 1 Histograms, Scatterplots, and Graphs of Functions

Chapter 1 Histograms, Scatterplots, and Graphs of Functions Chapter 1 Histograms, Scatterplots, and Graphs of Functions 1.1 Using Lists for Data Entry To enter data into the calculator you use the statistics menu. You can store data into lists labeled L1 through

More information

1 The ggplot2 workflow

1 The ggplot2 workflow ggplot2 @ statistics.com Week 2 Dope Sheet Page 1 dope, n. information especially from a reliable source [the inside dope]; v. figure out usually used with out; adj. excellent 1 This week s dope This week

More information

Introduction to Graphics with ggplot2

Introduction to Graphics with ggplot2 Introduction to Graphics with ggplot2 Reaction 2017 Flavio Santi Sept. 6, 2017 Flavio Santi Introduction to Graphics with ggplot2 Sept. 6, 2017 1 / 28 Graphics with ggplot2 ggplot2 [... ] allows you to

More information

ggplot in 3 easy steps (maybe 2 easy steps)

ggplot in 3 easy steps (maybe 2 easy steps) 1 ggplot in 3 easy steps (maybe 2 easy steps) 1.1 aesthetic: what you want to graph (e.g. x, y, z). 1.2 geom: how you want to graph it. 1.3 options: optional titles, themes, etc. 2 Background R has a number

More information

Using R to score personality scales

Using R to score personality scales Using R to score personality scales William Revelle Northwestern University February 27, 2013 Contents 1 Overview for the impatient 2 2 An example 2 2.1 Getting the data.................................

More information

Introductory Guide to SAS:

Introductory Guide to SAS: Introductory Guide to SAS: For UVM Statistics Students By Richard Single Contents 1 Introduction and Preliminaries 2 2 Reading in Data: The DATA Step 2 2.1 The DATA Statement............................................

More information

STAT 213 HW0a. R/RStudio Intro / Basic Descriptive Stats. Last Revised February 5, 2018

STAT 213 HW0a. R/RStudio Intro / Basic Descriptive Stats. Last Revised February 5, 2018 STAT 213 HW0a R/RStudio Intro / Basic Descriptive Stats Last Revised February 5, 2018 1 Starting R/RStudio There are two ways you can run the software we will be using for labs, R and RStudio. Option 1

More information

An Introduction to R. Ed D. J. Berry 9th January 2017

An Introduction to R. Ed D. J. Berry 9th January 2017 An Introduction to R Ed D. J. Berry 9th January 2017 Overview Why now? Why R? General tips Recommended packages Recommended resources 2/48 Why now? Efficiency Pointandclick software just isn't time efficient

More information

Stat 427/527: Advanced Data Analysis I

Stat 427/527: Advanced Data Analysis I Stat 427/527: Advanced Data Analysis I Chapter 3: Two-Sample Inferences September, 2017 1 / 44 Stat 427/527: Advanced Data Analysis I Chapter 3: Two-Sample Inferences September, 2017 2 / 44 Topics Suppose

More information

Introduction to R for Beginners, Level II. Jeon Lee Bio-Informatics Core Facility (BICF), UTSW

Introduction to R for Beginners, Level II. Jeon Lee Bio-Informatics Core Facility (BICF), UTSW Introduction to R for Beginners, Level II Jeon Lee Bio-Informatics Core Facility (BICF), UTSW Basics of R Powerful programming language and environment for statistical computing Useful for very basic analysis

More information

Lecture 09. Graphics::ggplot I R Teaching Team. October 1, 2018

Lecture 09. Graphics::ggplot I R Teaching Team. October 1, 2018 Lecture 09 Graphics::ggplot I 2018 R Teaching Team October 1, 2018 Acknowledgements 1. Mike Fliss & Sara Levintow! 2. stackoverflow (particularly user David for lecture styling - link) 3. R Markdown: The

More information

Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016

Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 Why R? FREE Open source Constantly updating the functions is has Constantly adding new functions Learning R will help you learn other programming

More information

ggplot2 basics Hadley Wickham Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University September 2011

ggplot2 basics Hadley Wickham Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University September 2011 ggplot2 basics Hadley Wickham Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University September 2011 1. Diving in: scatterplots & aesthetics 2. Facetting 3. Geoms

More information

Plotting with Rcell (Version 1.2-5)

Plotting with Rcell (Version 1.2-5) Plotting with Rcell (Version 1.2-) Alan Bush October 7, 13 1 Introduction Rcell uses the functions of the ggplots2 package to create the plots. This package created by Wickham implements the ideas of Wilkinson

More information

Intro to R h)p://jacobfenton.s3.amazonaws.com/r- handson.pdf. Jacob Fenton CAR Director InvesBgaBve ReporBng Workshop, American University

Intro to R h)p://jacobfenton.s3.amazonaws.com/r- handson.pdf. Jacob Fenton CAR Director InvesBgaBve ReporBng Workshop, American University Intro to R h)p://jacobfenton.s3.amazonaws.com/r- handson.pdf Jacob Fenton CAR Director InvesBgaBve ReporBng Workshop, American University Overview Import data Move around the file system, save an image

More information

Making sense of census microdata

Making sense of census microdata Making sense of census microdata Tutorial 3: Creating aggregated variables and visualisations First, open a new script in R studio and save it in your working directory, so you will be able to access this

More information

ggplot2 for beginners Maria Novosolov 1 December, 2014

ggplot2 for beginners Maria Novosolov 1 December, 2014 ggplot2 for beginners Maria Novosolov 1 December, 214 For this tutorial we will use the data of reproductive traits in lizards on different islands (found in the website) First thing is to set the working

More information

Instructions and Result Summary

Instructions and Result Summary Instructions and Result Summary VU Biostatistics and Experimental Design PLA.216 Exercise 1 Introduction to R & Biostatistics Name and Student ID MAXIMILIANE MUSTERFRAU 01330974 Name and Student ID JOHN

More information

POL 345: Quantitative Analysis and Politics

POL 345: Quantitative Analysis and Politics POL 345: Quantitative Analysis and Politics Precept Handout 1 Week 2 (Verzani Chapter 1: Sections 1.2.4 1.4.31) Remember to complete the entire handout and submit the precept questions to the Blackboard

More information

SPSS. (Statistical Packages for the Social Sciences)

SPSS. (Statistical Packages for the Social Sciences) Inger Persson SPSS (Statistical Packages for the Social Sciences) SHORT INSTRUCTIONS This presentation contains only relatively short instructions on how to perform basic statistical calculations in SPSS.

More information

Acquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data.

Acquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data. Summary Statistics Acquisition Description Exploration Examination what data is collected Characterizing properties of data. Exploring the data distribution(s). Identifying data quality problems. Selecting

More information

Graphics in R Ira Sharenow January 2, 2019

Graphics in R Ira Sharenow January 2, 2019 Graphics in R Ira Sharenow January 2, 2019 library(ggplot2) # graphing library library(rcolorbrewer) # nice colors R Markdown This is an R Markdown document. The purpose of this document is to show R users

More information

Intro to R for Epidemiologists

Intro to R for Epidemiologists Lab 9 (3/19/15) Intro to R for Epidemiologists Part 1. MPG vs. Weight in mtcars dataset The mtcars dataset in the datasets package contains fuel consumption and 10 aspects of automobile design and performance

More information

k-nn classification with R QMMA

k-nn classification with R QMMA k-nn classification with R QMMA Emanuele Taufer file:///c:/users/emanuele.taufer/google%20drive/2%20corsi/5%20qmma%20-%20mim/0%20labs/l1-knn-eng.html#(1) 1/16 HW (Height and weight) of adults Statistics

More information

Lecture 4: Data Visualization I

Lecture 4: Data Visualization I Lecture 4: Data Visualization I Data Science for Business Analytics Thibault Vatter Department of Statistics, Columbia University and HEC Lausanne, UNIL 11.03.2018 Outline 1 Overview

More information

Data visualization with ggplot2

Data visualization with ggplot2 Data visualization with ggplot2 Visualizing data in R with the ggplot2 package Authors: Mateusz Kuzak, Diana Marek, Hedi Peterson, Dmytro Fishman Disclaimer We will be using the functions in the ggplot2

More information

Outline day 4 May 30th

Outline day 4 May 30th Graphing in R: basic graphing ggplot2 package Outline day 4 May 30th 05/2017 117 Graphing in R: basic graphing 05/2017 118 basic graphing Producing graphs R-base package graphics offers funcaons for producing

More information

Session 3 Nick Hathaway;

Session 3 Nick Hathaway; Session 3 Nick Hathaway; nicholas.hathaway@umassmed.edu Contents Manipulating Data frames and matrices 1 Converting to long vs wide formats.................................... 2 Manipulating data in table........................................

More information

Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics

Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics Introduction to S-Plus 1 Input: Data files For rectangular data files (n rows,

More information

Data Visualization. Andrew Jaffe Instructor

Data Visualization. Andrew Jaffe Instructor Module 9 Data Visualization Andrew Jaffe Instructor Basic Plots We covered some basic plots previously, but we are going to expand the ability to customize these basic graphics first. 2/45 Read in Data

More information

> glucose = c(81, 85, 93, 93, 99, 76, 75, 84, 78, 84, 81, 82, 89, + 81, 96, 82, 74, 70, 84, 86, 80, 70, 131, 75, 88, 102, 115, + 89, 82, 79, 106)

> glucose = c(81, 85, 93, 93, 99, 76, 75, 84, 78, 84, 81, 82, 89, + 81, 96, 82, 74, 70, 84, 86, 80, 70, 131, 75, 88, 102, 115, + 89, 82, 79, 106) This document describes how to use a number of R commands for plotting one variable and for calculating one variable summary statistics Specifically, it describes how to use R to create dotplots, histograms,

More information