Introduction to R and R-Studio Toy Program #2 Excel to R & Basic Descriptives


Summary

The goal of this toy program is to give you a boilerplate for working with your own Excel data. So, I'm hoping you'll try! In this illustration, you will:

1. Import Excel data and save it to an .rdata dataset
2. Produce some basic numerical descriptive statistics
3. Produce some basic graphs

Packages Used in This Illustration

To install these packages (one time), at the console window, type:

    install.packages("openxlsx")
    install.packages("summarytools")
    install.packages("stargazer")
    install.packages("ggplot2")
    install.packages("psych")

Tips
1) When installing packages, don't forget: install.packages has a period between install and packages.
2) When installing packages, the package name must be enclosed in quotes.
3) If you think it might be helpful, skim the summary at the end of this illustration before starting your session.

Note: Prefacing command name with packagename::
In the illustration that follows, you will often see a command of the form packagename::command. Strictly speaking, I do not need to provide the package name; it's optional. For us beginners, however, I find this to be a helpful learning tool.

Preliminary - Before you begin
Download the Excel file depress_small.xlsx from the course website and place it in your working directory.

Preliminary - Set working directory

    setwd("/users/cbigelow/desktop")

R handout Fall 2018 Toy Program 2.docx
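Before importing, it can help to confirm that R actually sees the Excel file in the working directory. The following is a minimal base-R sketch; check_file() is a hypothetical helper written for this illustration, not part of the handout or of any package.

```r
# Hypothetical helper (an assumption, not from the handout):
# reports whether a file is visible in a given folder.
check_file <- function(filename, dir = getwd()) {
  path <- file.path(dir, filename)
  if (file.exists(path)) {
    message("Found: ", path)
    TRUE
  } else {
    message("Not found: ", path, " (check setwd() and where the download was saved)")
    FALSE
  }
}

# Look for the handout's Excel file in the current working directory
found <- check_file("depress_small.xlsx")
```

If the result is FALSE, the usual causes are a working directory that differs from the download folder or a misspelled file name.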

2. Import Excel Data Using Package openxlsx. Output/save to an R dataset ("name.rdata")

    # The following assumes that package openxlsx has been installed.
    # It also assumes depress_small.xlsx has been downloaded and is on your desktop.
    library(openxlsx)
    depress_small <- read.xlsx("depress_small.xlsx")
    save(depress_small, file="depress_small.rdata")
    str(depress_small)
    ## 'data.frame': 294 obs. of 3 variables:
    ##  $ id       : num
    ##  $ age      : num
    ##  $ depressed: num

    # Quick look using command summary(), which comes with the basic installation
    summary(depress_small)
    ##        id              age           depressed
    ##  Min.   : 1.00   Min.   :18.00   Min.   :
    ##  1st Qu.:        1st Qu.:        1st Qu.:
    ##  Median :        Median :42.50   Median :
    ##  Mean   :        Mean   :44.41   Mean   :
    ##  3rd Qu.:        3rd Qu.:        3rd Qu.:
    ##  Max.   :        Max.   :89.00   Max.   :

3. ONE VARIABLE Descriptive Statistics Using Packages summarytools and stargazer

    # The following assumes that packages summarytools and stargazer have been installed.
    library(summarytools)
    library(stargazer)

    #3a - I. Single continuous variable using package summarytools and command descr()
    summarytools::descr(depress_small$age, stats = c("n.valid", "mean", "sd", "min", "med", "max"), transpose = TRUE)
    ## Descriptive Statistics
    ## depress_small$age
    ## N: 294
    ##
    ##        N.Valid   Mean   Std.Dev   Min   Median   Max
    ## age
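To confirm that the save() step worked, the saved file can be read back with load(). This is a minimal sketch under stated assumptions: the data frame toy and its values are made up, and a temporary folder stands in for your desktop, so nothing here touches depress_small.

```r
# Made-up stand-in for depress_small (values are illustrative only)
toy <- data.frame(id = 1:3, age = c(25, 40, 61), depressed = c(0, 1, 0))

# Save to an .rdata file in a temporary folder (assumption: used instead of the desktop)
rdata_path <- file.path(tempdir(), "toy.rdata")
save(toy, file = rdata_path)

# Remove the object, then restore it from the file.
# load() returns the names of the objects it restored.
rm(toy)
loaded_names <- load(rdata_path)
print(loaded_names)
str(toy)
```

Note that load() restores the object under the name it had when saved, so there is no assignment on the left of load() for the data itself.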

    #3a - II. Single continuous variable using package stargazer and command stargazer()
    stargazer::stargazer(depress_small[c("age")], type = "text", summary.stat = c("n", "mean", "sd", "min", "p25", "median", "p75", "max"))
    ##
    ## ==============================================================
    ## Statistic N Mean St. Dev. Min Pctl(25) Median Pctl(75) Max
    ##
    ## age
    ##

    #3b. Single discrete variable using package summarytools and command freq(). Note: order = "freq" is optional.
    summarytools::freq(depress_small$depressed, order = "freq")
    ## Frequencies
    ## depress_small$depressed
    ## Type: Numeric
    ##
    ##           Freq   % Valid   % Valid Cum.   % Total   % Total Cum.
    ##
    ## <NA>
    ## Total

4. ONE VARIABLE Graphical Summaries Using Package ggplot2

    # The following assumes that package ggplot2 has been installed.
    library(ggplot2)

    #4a - I. Single continuous variable - HISTOGRAM
    # Tip - Develop plot layer by layer
    p <- ggplot(data=depress_small, aes(x=age)) + geom_histogram(color="black", fill="blue", binwidth=2)
    p <- p + ggtitle("Histogram of Age")
    p1 <- p + theme_bw()
    p1
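As a cross-check on the frequency table produced by freq() in 3b, the same kind of counts and percentages can be computed with base R's table() and prop.table(). This is a sketch only: the vector below is made up (it is not the depress_small data), and it includes one missing value to show the difference between valid and total percentages.

```r
# Made-up 0/1 values with one missing value (illustrative only)
depressed <- c(0, 0, 1, 0, NA, 1, 0)

# Counts and percentages among valid (non-missing) values
valid_counts <- table(depressed)
pct_valid <- round(100 * prop.table(valid_counts), 2)

# Counts and percentages out of all values, with NA as its own category
total_counts <- table(depressed, useNA = "ifany")
pct_total <- round(100 * prop.table(total_counts), 2)

print(valid_counts)
print(pct_valid)
print(pct_total)
```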

    #4a - II. Single continuous variable - BOXPLOT (See summary for basic command. Here I have added options)
    # Tip - Develop plot layer by layer
    p <- ggplot(data=depress_small, aes(x=1, y=age)) + geom_boxplot(color="black", fill="blue")
    p <- p + xlab(".")
    p <- p + ylab("Age (years)")
    p <- p + ggtitle("Box Plot of Age")
    p2 <- p + theme_bw()
    p2

    #4b - I. Single discrete variable - BARPLOT of frequencies (Again, see summary for basic command)
    # Tip - Develop plot layer by layer
    p <- ggplot(data=depress_small, aes(x=factor(depressed)))
    p <- p + geom_bar(color="black", fill="blue")
    p <- p + xlab("Depression Status (0=No, 1=Yes)")
    p <- p + ylab("Frequency (#)")
    p <- p + ggtitle("Depression Dataset")
    p3 <- p + theme_bw()
    p3

    #4b - II. Single discrete variable - BARPLOT of relative frequencies (See summary for basic command)
    # Tip - Develop plot layer by layer
    p <- ggplot(data=depress_small, aes(x=factor(depressed)))
    # Map y to each bar's share of the total count, so the percent labels show proportions
    # rather than raw counts (after_stat() needs ggplot2 3.3.0 or later)
    p <- p + geom_bar(aes(y = after_stat(count / sum(count))), color="black", fill="blue")
    p <- p + scale_y_continuous(labels = scales::percent_format())
    p <- p + xlab("Depression Status (0=No, 1=Yes)")
    p <- p + ylab("Relative Frequency (%)")
    p <- p + ggtitle("Depression Dataset")
    p4 <- p + theme_bw()
    p4

5. ONE CONTINUOUS & ONE GROUPING VARIABLE Descriptives Using Package psych

    # The following assumes that package psych has been installed.
    library(psych)

    #5a - I. Using package psych and command describeBy()
    depress_small$depressed <- as.factor(depress_small$depressed)
    psych::describeBy(depress_small$age, group = depress_small$depressed)
    ## $`0`
    ##    vars n mean sd median trimmed mad min max range skew kurtosis
    ## X1
    ##    se
    ## X1
    ##
    ## $`1`
    ##    vars n mean sd median trimmed mad min max range skew kurtosis
    ## X1
    ##    se
    ## X1
    ##
    ## attr(,"call")
    ## by.default(data = x, INDICES = group, FUN = describe, type = type)
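As a cross-check on describeBy(), a few of the same group-wise statistics (n, mean, sd) can be computed with base R's tapply(). This is a minimal sketch, assuming a small made-up data frame (toy) rather than depress_small; the object names are illustrative.

```r
# Made-up data: age in years, depressed coded 0/1 (illustrative only)
toy <- data.frame(age = c(20, 30, 40, 50, 60, 70),
                  depressed = c(0, 0, 0, 1, 1, 1))

# As in the handout, declare the grouping variable to be a factor
toy$depressed <- as.factor(toy$depressed)

# Apply a summary function to age within each level of depressed
group_n    <- tapply(toy$age, toy$depressed, length)
group_mean <- tapply(toy$age, toy$depressed, mean)
group_sd   <- tapply(toy$age, toy$depressed, sd)

# Collect the results in one small table
by_group <- data.frame(depressed = names(group_n),
                       n = as.vector(group_n),
                       mean = as.vector(group_mean),
                       sd = as.vector(group_sd))
print(by_group)
```

describeBy() reports many more statistics (trimmed mean, mad, skew, kurtosis, se); this sketch only reproduces the most familiar three.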

6. ONE CONTINUOUS & ONE GROUPING VARIABLE Graphical Summary Using Package ggplot2

    # The following assumes that package ggplot2 has been installed.
    # TAKE CARE - packages are installed from the CONSOLE, never from within an R Markdown document
    # install.packages("ggplot2")

    #6a. Side-by-side BOX PLOTS using command ggplot() (see summary for basic command)
    p <- ggplot(data=depress_small, aes(x=as.factor(depressed), y=age))
    p <- p + geom_boxplot(color="black", fill="blue")
    p <- p + xlab("Depression Status (0=No, 1=Yes)")
    p <- p + ylab("Age (years)")
    p <- p + ggtitle("Side-by-Side Box Plot of Age (years), by Depression Status")
    p5 <- p + theme_bw()
    p5

    #6b - I. Side-by-side HISTOGRAMS, stacked, using command ggplot(). (See summary for basic command)
    p <- ggplot(data=depress_small, aes(x=age, fill=depressed))
    p <- p + geom_histogram(color="black", fill="blue", binwidth=5)
    p <- p + facet_wrap(~depressed, ncol=1)
    p <- p + ggtitle("Histogram of Age (years), by Depression Status (0=No, 1=Yes)")
    p6 <- p + theme_bw()
    p6

    #6b - II. Side-by-side HISTOGRAMS, overlay, using command ggplot() (see summary for basic command)
    p <- ggplot(data=depress_small, aes(x=age, fill=depressed, color=depressed)) + geom_histogram(position="identity", alpha=0.4, binwidth=5)
    p <- p + ggtitle("Histogram of Age (years), over Depression Status (0=No, 1=Yes)")
    p7 <- p + theme_bw()
    p7

Toy Program #2 Basic Descriptives Summary

Notes
1) This summary describes the basics. Some of the options used on previous pages are not shown. It's always possible to add options.
2) To see options, at right, click on the Help tab and enter the command name. You'll get an expanded help page.
3) If you like, you can precede a command with the package name followed by two colons (as I did in this handout). This is OPTIONAL!
4) What you edit is shaded in yellow.

A. WORKING WITH DATA
B. NUMERICAL DESCRIPTIVES
C. DATA VISUALIZATION

A. WORKING WITH DATA

1. Import Excel Data

    library(openxlsx)
    currentdataframe <- read.xlsx("excelfilename.xlsx")

    depress_small <- read.xlsx("depress_small.xlsx")

2. Save Current Data to R Dataset

    save(currentdataframe, file="rdatasetname.rdata")

    save(depress_small, file="depress_small.rdata")

B. NUMERICAL DESCRIPTIVES

Take care!!! The statistic for the sample size goes by different names, depending on the package!! In summarytools, it is n.valid. In stargazer, it is n.

1. Single Continuous Variable Using Package summarytools

    library(summarytools)
    descr(dataframe$variable, stats = c("statistic", "statistic"))

    descr(depress_small$age, stats = c("n.valid", "mean", "sd", "min", "med", "max"))

2. Single Continuous Variable Using Package stargazer

    library(stargazer)
    stargazer(dataframe[c("variable")], type = "text", summary.stat = c("statistic", "statistic"))

    stargazer(depress_small[c("age")], type = "text", summary.stat = c("n", "mean", "sd", "min", "p25", "median", "p75", "max"))

3. Single Discrete Variable Using Package summarytools

    library(summarytools)
    freq(dataframe$variable)

    freq(depress_small$depressed, order = "freq")

4. One Continuous Variable & One Grouping Variable Using Package psych

Tip! The grouping variable must be declared to be a factor variable. So I always do this, needed or not.

    library(psych)
    dataframe$groupvariable <- as.factor(dataframe$groupvariable)
    describeBy(dataframe$continuousvariable, group = dataframe$groupvariable)

    depress_small$depressed <- as.factor(depress_small$depressed)
    describeBy(depress_small$age, group = depress_small$depressed)

C. DATA VISUALIZATION

Suggestion! I find it the least error prone if I build my graphs layer by layer. I start with a basic graph, which I echo. Then, if I like it, I add an embellishment. I echo that and edit until I'm happy. And so on. Try it!

1. Single Continuous Variable HISTOGRAM Using Package ggplot2

    p <- ggplot(data=dataframe, aes(x=variable)) + geom_histogram()

    p <- ggplot(data=depress_small, aes(x=age)) + geom_histogram(color="black", fill="blue", binwidth=2)
    p <- p + ggtitle("Histogram of Age")
    p1 <- p + theme_bw()
    p1

2. Single Continuous Variable BOXPLOT Using Package ggplot2

    p <- ggplot(data=dataframe, aes(x=1, y=variable)) + geom_boxplot()

    p <- ggplot(data=depress_small, aes(x=1, y=age)) + geom_boxplot(color="black", fill="blue")
    p <- p + xlab(".")
    p <- p + ylab("Age (years)")
    p <- p + ggtitle("Box Plot of Age")
    p2 <- p + theme_bw()
    p2

3. Single Discrete Variable BAR PLOT of frequencies (also called bar chart) Using Package ggplot2

    p <- ggplot(data=dataframe, aes(x=factor(discretevariable)))
    p <- p + geom_bar()

    p <- ggplot(data=depress_small, aes(x=factor(depressed)))
    p <- p + geom_bar(color="black", fill="blue")
    p <- p + xlab("Depression Status (0=No, 1=Yes)")
    p <- p + ylab("Frequency (#)")
    p <- p + ggtitle("Depression Dataset")
    p3 <- p + theme_bw()
    p3

4. Single Discrete Variable BAR PLOT of relative frequencies (also called bar chart) Using Package ggplot2

    # Map y to each bar's share of the total count (after_stat() needs ggplot2 3.3.0 or later)
    p <- ggplot(data=dataframe, aes(x=factor(discretevariable)))
    p <- p + geom_bar(aes(y = after_stat(count / sum(count))))
    p <- p + scale_y_continuous(labels = scales::percent_format())

    p <- ggplot(data=depress_small, aes(x=factor(depressed)))
    p <- p + geom_bar(aes(y = after_stat(count / sum(count))), color="black", fill="blue")
    p <- p + scale_y_continuous(labels = scales::percent_format())
    p <- p + xlab("Depression Status (0=No, 1=Yes)")
    p <- p + ylab("Relative Frequency (%)")
    p <- p + ggtitle("Depression Dataset")
    p4 <- p + theme_bw()
    p4

5. One Continuous Variable & One Grouping Variable SIDE-BY-SIDE BOX PLOT Using Package ggplot2

    p <- ggplot(data=dataframe, aes(x=as.factor(groupvariable), y=continuousvariable))
    p <- p + geom_boxplot()

    p <- ggplot(data=depress_small, aes(x=as.factor(depressed), y=age))
    p <- p + geom_boxplot(color="black", fill="blue")
    p <- p + xlab("Depression Status (0=No, 1=Yes)")
    p <- p + ylab("Age (years)")
    p <- p + ggtitle("Side-by-Side Box Plot of Age (years), by Depression Status")
    p5 <- p + theme_bw()
    p5

6. One Continuous Variable & One Grouping Variable SIDE-BY-SIDE HISTOGRAMS stacked Using Package ggplot2

    p <- ggplot(data=dataframe, aes(x=continuousvariable, fill=groupvariable))
    p <- p + geom_histogram()
    p <- p + facet_wrap(~groupvariable, ncol=1)

    p <- ggplot(data=depress_small, aes(x=age, fill=depressed))
    p <- p + geom_histogram(color="black", fill="blue", binwidth=5)
    p <- p + facet_wrap(~depressed, ncol=1)
    p <- p + ggtitle("Histogram of Age (years), by Depression Status (0=No, 1=Yes)")
    p6 <- p + theme_bw()
    p6

7. One Continuous Variable & One Grouping Variable SIDE-BY-SIDE HISTOGRAMS overlaid Using Package ggplot2

    # No facet_wrap() here: both groups are drawn in the same panel
    p <- ggplot(data=dataframe, aes(x=continuousvariable, fill=groupvariable))
    p <- p + geom_histogram(position="identity")

    p <- ggplot(data=depress_small, aes(x=age, fill=depressed, color=depressed))
    p <- p + geom_histogram(position="identity", alpha=0.4, binwidth=5)
    p <- p + ggtitle("Histogram of Age (years), over Depression Status (0=No, 1=Yes)")
    p7 <- p + theme_bw()
    p7

BIOSTATS 640 Spring 2018 Introduction to R Data Description. 1. Start of Session. a. Preliminaries... b. Install Packages c. Attach Packages...

BIOSTATS 640 Spring 2018 Introduction to R Data Description. 1. Start of Session. a. Preliminaries... b. Install Packages c. Attach Packages... BIOSTATS 640 Spring 2018 Introduction to R and R-Studio Data Description Page 1. Start of Session. a. Preliminaries... b. Install Packages c. Attach Packages... 2. Load R Data.. a. Load R data frames...

More information

Dataset Used in This Lab (download from course website framingham_1000.rdata

Dataset Used in This Lab (download from course website   framingham_1000.rdata Introduction to R and R- Studio Sring 2019 Lab #1 Some Basics Before you begin: If you have not already installed R and RStudio, lease see Windows Users: htt://eole.umass.edu/bie540w/df/how%20to%20install%20r%20and%20r%20studio%20windows%20users%20fall%20201

More information

Introduction to R and R-Studio Toy Program #1 R Essentials. This illustration Assumes that You Have Installed R and R-Studio

Introduction to R and R-Studio Toy Program #1 R Essentials. This illustration Assumes that You Have Installed R and R-Studio Introduction to R and R-Studio 2018-19 Toy Program #1 R Essentials This illustration Assumes that You Have Installed R and R-Studio If you have not already installed R and RStudio, please see: Windows

More information

Introduction to Stata Toy Program #1 Basic Descriptives

Introduction to Stata Toy Program #1 Basic Descriptives Introduction to Stata 2018-19 Toy Program #1 Basic Descriptives Summary The goal of this toy program is to get you in and out of a Stata session and, along the way, produce some descriptive statistics.

More information

Introduction to R and R-Studio In-Class Lab Activity The 1970 Draft Lottery

Introduction to R and R-Studio In-Class Lab Activity The 1970 Draft Lottery Introduction to R and R-Studio 2018-19 In-Class Lab Activity The 1970 Draft Lottery Summary The goal of this activity is to give you practice with R Markdown for saving your work. It s also a fun bit of

More information

Psychology 405: Psychometric Theory Homework 1: answers

Psychology 405: Psychometric Theory Homework 1: answers Psychology 405: Psychometric Theory Homework 1: answers William Revelle Department of Psychology Northwestern University Evanston, Illinois USA April, 2017 1 / 12 Outline Preliminaries Assignment Analysis

More information

Statistical transformations

Statistical transformations Statistical transformations Next, let s take a look at a bar chart. Bar charts seem simple, but they are interesting because they reveal something subtle about plots. Consider a basic bar chart, as drawn

More information

Introduction to R. Dataset Basics. March 2018

Introduction to R. Dataset Basics. March 2018 Introduction to R March 2018 1. Preliminaries.... a) Suggested packages for importing/exporting data.... b) FAQ: How to find the path of your dataset (or whatever). 2. Import/Export Data........ a) R (.Rdata)

More information

Statistics 251: Statistical Methods

Statistics 251: Statistical Methods Statistics 251: Statistical Methods Summaries and Graphs in R Module R1 2018 file:///u:/documents/classes/lectures/251301/renae/markdown/master%20versions/summary_graphs.html#1 1/14 Summary Statistics

More information

STAT:5400 Computing in Statistics

STAT:5400 Computing in Statistics STAT:5400 Computing in Statistics Introduction to SAS Lecture 18 Oct 12, 2015 Kate Cowles 374 SH, 335-0727 kate-cowles@uiowaedu SAS SAS is the statistical software package most commonly used in business,

More information

I Launching and Exiting Stata. Stata will ask you if you would like to check for updates. Update now or later, your choice.

I Launching and Exiting Stata. Stata will ask you if you would like to check for updates. Update now or later, your choice. I Launching and Exiting Stata 1. Launching Stata Stata can be launched in either of two ways: 1) in the stata program, click on the stata application; or 2) double click on the short cut that you have

More information

Rstudio GGPLOT2. Preparations. The first plot: Hello world! W2018 RENR690 Zihaohan Sang

Rstudio GGPLOT2. Preparations. The first plot: Hello world! W2018 RENR690 Zihaohan Sang Rstudio GGPLOT2 Preparations There are several different systems for creating data visualizations in R. We will introduce ggplot2, which is based on Leland Wilkinson s Grammar of Graphics. The learning

More information

Introduction to R and R-Studio Getting Data Into R. 1. Enter Data Directly into R...

Introduction to R and R-Studio Getting Data Into R. 1. Enter Data Directly into R... Introduction to R and R-Studio 2017-18 02. Getting Data Into R 1. Enter Data Directly into R...... 2. Import Excel Data (.xlsx ) into R..... 3. Import Stata Data (.dta ) into R...... a) From a folder on

More information

Introduction to R. UCLA Statistical Consulting Center R Bootcamp. Irina Kukuyeva September 20, 2010

Introduction to R. UCLA Statistical Consulting Center R Bootcamp. Irina Kukuyeva September 20, 2010 UCLA Statistical Consulting Center R Bootcamp Irina Kukuyeva ikukuyeva@stat.ucla.edu September 20, 2010 Outline 1 Introduction 2 Preliminaries 3 Working with Vectors and Matrices 4 Data Sets in R 5 Overview

More information

Demo yeast mutant analysis

Demo yeast mutant analysis Demo yeast mutant analysis Jean-Yves Sgro February 20, 2018 Contents 1 Analysis of yeast growth data 1 1.1 Set working directory........................................ 1 1.2 List all files in directory.......................................

More information

No Name What it does? 1 attach Attach your data frame to your working environment. 2 boxplot Creates a boxplot.

No Name What it does? 1 attach Attach your data frame to your working environment. 2 boxplot Creates a boxplot. No Name What it does? 1 attach Attach your data frame to your working environment. 2 boxplot Creates a boxplot. 3 confint A metafor package function that gives you the confidence intervals of effect sizes.

More information

Getting started with simulating data in R: some helpful functions and how to use them Ariel Muldoon August 28, 2018

Getting started with simulating data in R: some helpful functions and how to use them Ariel Muldoon August 28, 2018 Getting started with simulating data in R: some helpful functions and how to use them Ariel Muldoon August 28, 2018 Contents Overview 2 Generating random numbers 2 rnorm() to generate random numbers from

More information

R Workshop Guide. 1 Some Programming Basics. 1.1 Writing and executing code in R

R Workshop Guide. 1 Some Programming Basics. 1.1 Writing and executing code in R R Workshop Guide This guide reviews the examples we will cover in today s workshop. It should be a helpful introduction to R, but for more details, you can access a more extensive user guide for R on the

More information

Chapter 2 Exploring Data with Graphs and Numerical Summaries

Chapter 2 Exploring Data with Graphs and Numerical Summaries Chapter 2 Exploring Data with Graphs and Numerical Summaries Constructing a Histogram on the TI-83 Suppose we have a small class with the following scores on a quiz: 4.5, 5, 5, 6, 6, 7, 8, 8, 8, 8, 9,

More information

R Workshop Daniel Fuller

R Workshop Daniel Fuller R Workshop Daniel Fuller Welcome to the R Workshop @ Memorial HKR The R project for statistical computing is a free open source statistical programming language and project. Follow these steps to get started:

More information

Statistics Lecture 6. Looking at data one variable

Statistics Lecture 6. Looking at data one variable Statistics 111 - Lecture 6 Looking at data one variable Chapter 1.1 Moore, McCabe and Craig Probability vs. Statistics Probability 1. We know the distribution of the random variable (Normal, Binomial)

More information

050 0 N 03 BECABCDDDBDBCDBDBCDADDBACACBCCBAACEDEDBACBECCDDCEA

050 0 N 03 BECABCDDDBDBCDBDBCDADDBACACBCCBAACEDEDBACBECCDDCEA 050 0 N 03 BECABCDDDBDBCDBDBCDADDBACACBCCBAACEDEDBACBECCDDCEA 55555555555555555555555555555555555555555555555555 NYYNNYNNNYNYYYYYNNYNNNNNYNYYYYYNYNNNNYNNYNNNYNNNNN 01 CAEADDBEDEDBABBBBCBDDDBAAAECEEDCDCDBACCACEECACCCEA

More information

BIOSTATISTICS LABORATORY PART 1: INTRODUCTION TO DATA ANALYIS WITH STATA: EXPLORING AND SUMMARIZING DATA

BIOSTATISTICS LABORATORY PART 1: INTRODUCTION TO DATA ANALYIS WITH STATA: EXPLORING AND SUMMARIZING DATA BIOSTATISTICS LABORATORY PART 1: INTRODUCTION TO DATA ANALYIS WITH STATA: EXPLORING AND SUMMARIZING DATA Learning objectives: Getting data ready for analysis: 1) Learn several methods of exploring the

More information

EXPLORATORY DATA ANALYSIS. Introducing the data

EXPLORATORY DATA ANALYSIS. Introducing the data EXPLORATORY DATA ANALYSIS Introducing the data Email data set > email # A tibble: 3,921 21 spam to_multiple from cc sent_email time image 1 not-spam 0 1 0 0

More information

Package ggextra. April 4, 2018

Package ggextra. April 4, 2018 Package ggextra April 4, 2018 Title Add Marginal Histograms to 'ggplot2', and More 'ggplot2' Enhancements Version 0.8 Collection of functions and layers to enhance 'ggplot2'. The flagship function is 'ggmarginal()',

More information

Exploratory Data Analysis on NCES Data Developed by Yuqi Liao, Paul Bailey, and Ting Zhang May 10, 2018

Exploratory Data Analysis on NCES Data Developed by Yuqi Liao, Paul Bailey, and Ting Zhang May 10, 2018 Exploratory Data Analysis on NCES Data Developed by Yuqi Liao, Paul Bailey, and Ting Zhang May 1, 218 Vignette Outline This vignette provides examples of conducting exploratory data analysis (EDA) on NAEP

More information

Old Faithful Chris Parrish

Old Faithful Chris Parrish Old Faithful Chris Parrish 17-4-27 Contents Old Faithful eruptions 1 data.................................................. 1 duration................................................ 1 waiting time..............................................

More information

Install RStudio from - use the standard installation.

Install RStudio from   - use the standard installation. Session 1: Reading in Data Before you begin: Install RStudio from http://www.rstudio.com/ide/download/ - use the standard installation. Go to the course website; http://faculty.washington.edu/kenrice/rintro/

More information

Using R to score personality scales

Using R to score personality scales Using R to score personality scales William Revelle Northwestern University February 27, 2013 Contents 1 Overview for the impatient 2 2 An example 2 2.1 Getting the data.................................

More information

Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics

Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics Introduction to S-Plus 1 Input: Data files For rectangular data files (n rows,

More information

Introduction to R and the tidyverse. Paolo Crosetto

Introduction to R and the tidyverse. Paolo Crosetto Introduction to R and the tidyverse Paolo Crosetto Lecture 1: plotting Before we start: Rstudio Interactive console Object explorer Script window Plot window Before we start: R concatenate: c() assign:

More information

Data Science Essentials

Data Science Essentials Data Science Essentials Lab 2 Working with Summary Statistics Overview In this lab, you will learn how to use either R or Python to compute and understand the basics of descriptive statistics. Descriptive

More information

Exercise 1: Introduction to Stata

Exercise 1: Introduction to Stata Exercise 1: Introduction to Stata New Stata Commands use describe summarize stem graph box histogram log on, off exit New Stata Commands Downloading Data from the Web I recommend that you use Internet

More information

Assignments. Math 338 Lab 1: Introduction to R. Atoms, Vectors and Matrices

Assignments. Math 338 Lab 1: Introduction to R. Atoms, Vectors and Matrices Assignments Math 338 Lab 1: Introduction to R. Generally speaking, there are three basic forms of assigning data. Case one is the single atom or a single number. Assigning a number to an object in this

More information

Outline day 4 May 30th

Outline day 4 May 30th Graphing in R: basic graphing ggplot2 package Outline day 4 May 30th 05/2017 117 Graphing in R: basic graphing 05/2017 118 basic graphing Producing graphs R-base package graphics offers funcaons for producing

More information

Week 4: Describing data and estimation

Week 4: Describing data and estimation Week 4: Describing data and estimation Goals Investigate sampling error; see that larger samples have less sampling error. Visualize confidence intervals. Calculate basic summary statistics using R. Calculate

More information

Depending on the computer you find yourself in front of, here s what you ll need to do to open SPSS.

Depending on the computer you find yourself in front of, here s what you ll need to do to open SPSS. 1 SPSS 11.5 for Windows Introductory Assignment Material covered: Opening an existing SPSS data file, creating new data files, generating frequency distributions and descriptive statistics, obtaining printouts

More information

Facets and Continuous graphs

Facets and Continuous graphs Facets and Continuous graphs One way to add additional variables is with aesthetics. Another way, particularly useful for categorical variables, is to split your plot into facets, subplots that each display

More information

The diamonds dataset Visualizing data in R with ggplot2

The diamonds dataset Visualizing data in R with ggplot2 Lecture 2 STATS/CME 195 Matteo Sesia Stanford University Spring 2018 Contents The diamonds dataset Visualizing data in R with ggplot2 The diamonds dataset The tibble package The tibble package is part

More information

Creating elegant graphics in R with ggplot2

Creating elegant graphics in R with ggplot2 Creating elegant graphics in R with ggplot2 Lauren Steely Bren School of Environmental Science and Management University of California, Santa Barbara What is ggplot2, and why is it so great? ggplot2 is

More information

Graphics in R Ira Sharenow January 2, 2019

Graphics in R Ira Sharenow January 2, 2019 Graphics in R Ira Sharenow January 2, 2019 library(ggplot2) # graphing library library(rcolorbrewer) # nice colors R Markdown This is an R Markdown document. The purpose of this document is to show R users

More information

Solution to Tumor growth in mice

Solution to Tumor growth in mice Solution to Tumor growth in mice Exercise 1 1. Import the data to R Data is in the file tumorvols.csv which can be read with the read.csv2 function. For a succesful import you need to tell R where exactly

More information

050 0 N 03 BECABCDDDBDBCDBDBCDADDBACACBCCBAACEDEDBACBECCDDCEA

050 0 N 03 BECABCDDDBDBCDBDBCDADDBACACBCCBAACEDEDBACBECCDDCEA 050 0 N 03 BECABCDDDBDBCDBDBCDADDBACACBCCBAACEDEDBACBECCDDCEA 55555555555555555555555555555555555555555555555555 YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY 01 CAEADDBEDEDBABBBBCBDDDBAAAECEEDCDCDBACCACEECACCCEA

More information

POL 345: Quantitative Analysis and Politics

POL 345: Quantitative Analysis and Politics POL 345: Quantitative Analysis and Politics Precept Handout 1 Week 2 (Verzani Chapter 1: Sections 1.2.4 1.4.31) Remember to complete the entire handout and submit the precept questions to the Blackboard

More information

R Bootcamp Part I (B)

R Bootcamp Part I (B) R Bootcamp Part I (B) An R Script is available to make it easy for you to copy/paste all the tutorial commands into RStudio: http://statistics.uchicago.edu/~collins/rbootcamp/rbootcamp1b_rcode.r Preliminaries:

More information

Introduction to R. Andy Grogan-Kaylor October 22, Contents

Introduction to R. Andy Grogan-Kaylor October 22, Contents Introduction to R Andy Grogan-Kaylor October 22, 2018 Contents 1 Background 2 2 Introduction 2 3 Base R and Libraries 3 4 Working Directory 3 5 Writing R Code or Script 4 6 Graphical User Interface 4 7

More information

Introductory SAS example

Introductory SAS example Introductory SAS example STAT:5201 1 Introduction SAS is a command-driven statistical package; you enter statements in SAS s language, submit them to SAS, and get output. A fairly friendly user interface

More information

Stat 427/527: Advanced Data Analysis I
