Introduction to R: Using R for Statistics and Data Analysis. BaRC Hot Topics

1 Introduction to R: Using R for Statistics and Data Analysis BaRC Hot Topics

2 Why use R?
- Perform inferential statistics (e.g., use a statistical test to calculate a p-value)
- Analyze numbers (vectors and matrices)
- Create custom figures
- Automate analysis routines (and make them more reproducible)
  - RStudio: R Markdown (.Rmd) and knit (to HTML)
  - Reduce copying and pasting
- For some tasks, Unix commands may be easier (ask us!)
- Use up-to-date analysis algorithms
- Real statisticians use it, and it's free!

3 Why not use R?
- A spreadsheet application already works fine
- You're already using another statistics package (e.g., Prism, MATLAB)
- It's hard to use at first: you have to know what commands and syntax to use
- You don't know how to get started (irrelevant if you're here today)

4 About R
- Originally written by Ross Ihaka and Robert Gentleman
- Written mostly in C
- R is accessible from other languages (Python, Perl)
- Packages: many statisticians and developers have extended R by creating packages (libraries) that bundle commands/code, data, and documentation in a well-defined format
  - Comprehensive R Archive Network (CRAN): cran.r-project.org
  - Bioconductor: bioconductor.org
- Install a package: install.packages("packagename")

5 About R: object-oriented (OO) systems
[Figure: diagram relating class, object, and method]

6 Getting Started: Running R
1. Using RStudio (tak.wi.mit.edu/rstudio)
   - Enter your tak username and password
2. On the command line (on tak)
   - Log into tak: ssh USERNAME@tak -Y
   - Start R by typing R
3. On your own computer
   - Download base R from CRAN and install it on your computer
   - Open the program
   - Install RStudio (optional)

7 Start of an R session
[Figure: screenshot of the start of an R session]

8 RStudio Interface
[Figure: RStudio window with four panes: Editor (R script); Workspace and History; Console; Output (Plots)]
rstudio.org

9 Good practices
- Save all useful commands and rationale
  - Add comments (starting with #)
- Reproducibility: several approaches
  - Write commands in R and then paste them into a text file
    - By convention, files of R commands end with .R
    - Use a specific name for the file (e.g., compare_WT_KO_weights.R)
    - Start the file with #!/usr/bin/Rscript to make it an R script
  - Or, in RStudio, save the file as R Markdown (.Rmd) and knit it (to .html)
- Use the up-arrow to get to previous commands
- Minimize typing, since every retyped command is a chance for a new error
- To clear your R window, use Ctrl-L
- Variable names should begin with a letter, not a number or an underscore, and are case sensitive
  - Mixing uppercase and lowercase helps (myWTmice)
  - Names can include dots (my.wt.mice)

10 Getting help
- Use the Help menu
- Check out Manuals and contributed documentation
- Use R's help and examples
    ?median           # show info for a command
    ??median          # search the docs
    example(median)   # run the examples from the help page
- Search the web: r-project median
- Our favorite book: Introductory Statistics with R (Peter Dalgaard)
[Figure: HTML help page]

11 Useful Commands
    dir()                           # list the files in the directory
    getwd()                         # get working directory
    setwd("/nfs/barc_public")       # change working directory
    history(n)                      # command history (default n=25)
    savehistory(file="myfile.txt")  # save command history to a file (default file: .Rhistory)
    ls()                            # list objects
    sessionInfo()                   # print info about the R session and loaded packages (useful for knowing versions)
    quit()                          # quit R
- Save workspace (.RData file)
- Assigning a value: = or <-
    X <- 2
    X = 2

12 Data Types: Mode
- Mode: the type shared by all components of an object
- Commonly used*:
  - character (e.g., "dog", "A")
  - integer (e.g., 3L; a plain 3 is stored as numeric)
  - numeric (e.g., 3.15)
  - logical (e.g., TRUE, FALSE)
- Useful commands:
    is.numeric(x)   # returns TRUE if x is numeric, FALSE otherwise
    as.numeric(x)   # converts x to numeric
*incomplete
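A short sketch of how these checks behave at the console. The variable names and values are made up for illustration; only base R functions are used.

```r
# Illustrative values (names are hypothetical, not from the slides)
n_tumors <- 3L      # the L suffix makes an integer
weight   <- 3.15    # a plain number is stored as a double
label    <- "dog"
is_ko    <- TRUE

mode(n_tumors)      # "numeric": mode() reports integers and doubles alike
typeof(n_tumors)    # "integer": typeof() distinguishes the storage type
mode(label)         # "character"
mode(is_ko)         # "logical"

is.numeric(weight)  # TRUE
is.numeric(label)   # FALSE
as.numeric("3.15")  # converts the string to the number 3.15
as.numeric(is_ko)   # TRUE becomes 1
```

Note that `mode()` does not separate integers from other numbers; `typeof()` is the function to use when that distinction matters.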

13 Data Types/Structure
- Vector: one-dimensional, elements of the same type; the most commonly used data structure
    chr_vector <- c("DNA", "RNA", "Protein")
- List: one-dimensional, elements can be of any type
    x_list <- list(1, 2, "a", c(TRUE, FALSE))
- Matrix: two-dimensional, elements of the same type
    x_matrix <- matrix(1:9, ncol=3, nrow=3)
- Data frame: two-dimensional, similar to a matrix but columns can be of different types
    df <- data.frame(x=1:4, y=c("a","b","c","d"))
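The four structures above can be compared side by side. This is a minimal sketch using the slide's own objects; the extra vector `mixed` is an assumption added to show type coercion.

```r
chr_vector <- c("DNA", "RNA", "Protein")
x_list     <- list(1, 2, "a", c(TRUE, FALSE))
x_matrix   <- matrix(1:9, ncol = 3, nrow = 3)
df         <- data.frame(x = 1:4, y = c("a", "b", "c", "d"))

# A vector holds a single type: mixing types coerces everything
mixed <- c(1, "a", TRUE)
class(mixed)            # "character"

# A list keeps each element's own type
sapply(x_list, class)   # "numeric" "numeric" "character" "logical"

# matrix() fills column by column unless byrow = TRUE
x_matrix[1, ]           # 1 4 7

# Each data frame column keeps its own type
sapply(df, class)
```

The class of the `y` column depends on the R version (character in R 4.0 and later, factor by default before that), so no expected output is shown for the last line.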

14 Commands to Explore Data
    head(object)     # see the top of your data
    length(object)   # length of a vector
    dim(object)      # dimensions of a matrix or data frame
    names(object)    # (get or set) names of an object
    mode(object)     # check if it's numeric, character, etc.
    class(object)    # class or type of an object
    str(object)      # structure of an object
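As a rough illustration, here are the same commands applied to the small data frame from the previous slide. The comments describe what base R returns for this particular object.

```r
df <- data.frame(x = 1:4, y = c("a", "b", "c", "d"))

head(df, 2)   # first two rows
length(df)    # 2: for a data frame, length() counts columns, not rows
dim(df)       # 4 2 (rows, then columns)
names(df)     # "x" "y"
mode(df)      # "list": a data frame is stored as a list of columns
class(df)     # "data.frame"
str(df)       # compact summary of every column
```

The `length()` and `mode()` results are common surprises: use `nrow()` for the number of rows and `class()` to confirm that an object is a data frame.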

15 Simple Workflow for a t-test
    # Number of tumors (from litter 2 on 11 July 2010)
    wt = c(5, 6, 7)
    ko = c(8, 9, 11)
    # Try default t-test settings (Welch's 2-sample t-test)
    t.test(wt, ko)
    # Do standard 2-sample t-test
    t.test(wt, ko, var.equal=TRUE)
    # Save the results as a variable
    wt.vs.ko = t.test(wt, ko, var.equal=TRUE)
    # What are the different parts of this result? (It is a list, not a data frame)
    names(wt.vs.ko)
    # Just print the p-value
    wt.vs.ko$p.value
    # What commands did we use?
    history(max.show=Inf)
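To make the difference between the two tests concrete, the sketch below stores both results and inspects them. The object names `welch` and `student` are chosen here for illustration.

```r
wt <- c(5, 6, 7)
ko <- c(8, 9, 11)

welch   <- t.test(wt, ko)                    # default: Welch's test
student <- t.test(wt, ko, var.equal = TRUE)  # classic pooled-variance test

class(welch)        # "htest": a list with a print method, not a data frame
names(welch)        # components such as statistic, parameter, p.value

welch$parameter     # Welch's degrees of freedom (usually fractional)
student$parameter   # n1 + n2 - 2 = 4 for the pooled test

# Compare the two p-values side by side
c(welch = welch$p.value, student = student$p.value)
```

With equal group sizes the two tests give the same t statistic, so any difference in the p-values comes from the degrees of freedom.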

16 Reading Files
- Take R to your preferred directory
  - In the R GUI, go to File > Change dir
- Check where you are (i.e., get your working directory) and see what files are there
    > getwd()
    [1] "X:/bell/Hot_Topics/Intro_to_R"
    > dir()
    [1] "compare_WT_KO_weights.R"

17 Reading Data Files
- Usually it's easiest to read data from a file
  - Organize in Excel with one-word column names
  - Save as tab-delimited text
- Check that the file is there
    dir()
- Read the file
    tumors = read.delim("tumors_wt_ko.txt", header=TRUE)
  (Other options include row.names and check.names=FALSE)
- Check that it's OK
    > tumors
      wt ko
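The read step can be tried without the workshop file. This sketch writes a small tab-delimited file to a temporary directory and reads it back; the file name is hypothetical, and the counts reuse the numbers from the t-test slide because the contents of the real file are not shown here.

```r
# Create a tiny tab-delimited file (assumed layout: one column per genotype)
tumor_file <- file.path(tempdir(), "tumors_example.txt")
writeLines(c("wt\tko", "5\t8", "6\t9", "7\t11"), tumor_file)

# Read it the same way as on the slide
tumors <- read.delim(tumor_file, header = TRUE)

tumors        # print the data frame to check it
str(tumors)   # confirm that both columns were read as numbers
```

Checking with `str()` right after reading is a cheap way to catch columns that were read as text because of a stray character in the file.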

18 Accessing Data
    > tumors$wt                      # use the column name
    > tumors[1:3,1]                  # [rows, columns]
    > tumors[,1]                     # missing row or column => all
    > tumors[1:2,1:2]                # select a submatrix
    > tumors                         # print the whole data frame
    > t.test(tumors$wt, tumors$ko)   # t-test as before
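Since the printed output did not survive in this transcript, here is a sketch of the same indexing on a stand-in data frame. The values are taken from the t-test slide and are only assumed to match the workshop file.

```r
# Stand-in for the data read from the file
tumors <- data.frame(wt = c(5, 6, 7), ko = c(8, 9, 11))

tumors$wt          # column by name: 5 6 7
tumors[1:3, 1]     # rows 1 to 3 of column 1: 5 6 7
tumors[, 1]        # all rows of column 1
tumors[1:2, 1:2]   # first two rows of both columns (still a data frame)

# Two further patterns that are often useful
tumors[tumors$ko > 8, ]     # rows selected by a logical condition
tumors[, 1, drop = FALSE]   # keep a single column as a data frame
```

Selecting one column with `[ , 1]` returns a plain vector; `drop = FALSE` is the way to keep the data frame structure.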

19 Creating an output table
- Most analyses involve several outputs
- You may want to create a matrix to hold it all
- Create an empty matrix, then name its rows and columns
    pvals.out = matrix(data=NA, ncol=2, nrow=2)
    colnames(pvals.out) = c("two.tail", "one.tail")
    rownames(pvals.out) = c("Welch", "Wilcoxon")
    pvals.out
             two.tail one.tail
    Welch          NA       NA
    Wilcoxon       NA       NA

20 Filling the output table (matrix)
- Do the stats
    # Welch's test (t-test that does not assume equal variances)
    pvals.out[1,1] = t.test(tumors$wt, tumors$ko)$p.value
    pvals.out[1,2] = t.test(tumors$wt, tumors$ko, alt="less")$p.value
    # Wilcoxon rank sum test (non-parametric alternative to t-test)
    pvals.out[2,1] = wilcox.test(tumors$wt, tumors$ko)$p.value
    pvals.out[2,2] = wilcox.test(tumors$wt, tumors$ko, alt="less")$p.value
    pvals.out   # prints the filled table (rows Welch, Wilcoxon; columns two.tail, one.tail)
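The two slides on the output table can be combined into one runnable sketch. The data frame is a stand-in built from the t-test slide's numbers, `dimnames` replaces the separate naming step, and `alternative` is spelled out in full instead of the abbreviated `alt`.

```r
# Stand-in data (assumed to resemble the workshop file)
tumors <- data.frame(wt = c(5, 6, 7), ko = c(8, 9, 11))

# Empty 2 x 2 matrix with named rows and columns
pvals.out <- matrix(NA, nrow = 2, ncol = 2,
                    dimnames = list(c("Welch", "Wilcoxon"),
                                    c("two.tail", "one.tail")))

# Fill by name, which is easier to read back than numeric indices
pvals.out["Welch", "two.tail"] <-
  t.test(tumors$wt, tumors$ko)$p.value
pvals.out["Welch", "one.tail"] <-
  t.test(tumors$wt, tumors$ko, alternative = "less")$p.value
pvals.out["Wilcoxon", "two.tail"] <-
  wilcox.test(tumors$wt, tumors$ko)$p.value
pvals.out["Wilcoxon", "one.tail"] <-
  wilcox.test(tumors$wt, tumors$ko, alternative = "less")$p.value

pvals.out
```

With three values per group and no overlap between groups, the smallest two-sided p-value the exact Wilcoxon test can give is 0.1, which is a reminder that non-parametric tests have little power on very small samples.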

21 Printing the output table
- We may want to round the p-values
    pvals.out.rounded = signif(pvals.out, 3)
- Print the matrix (table)
    write.table(pvals.out.rounded, file="tumor_pvals.txt", quote=FALSE, sep="\t")
- Warning: output column names are shifted by 1 when read in Excel, because the header row has no entry above the row-name column
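The column shift noted above can be avoided with the `col.names = NA` option of `write.table()`, which writes an empty header field above the row names. In this sketch the p-values are placeholders, not real results, and the output paths are temporary files.

```r
# Placeholder p-values, for illustration only
pvals <- matrix(c(0.04, 0.02, 0.10, 0.05), nrow = 2, byrow = TRUE,
                dimnames = list(c("Welch", "Wilcoxon"),
                                c("two.tail", "one.tail")))

out_default <- file.path(tempdir(), "pvals_default.txt")
out_aligned <- file.path(tempdir(), "pvals_aligned.txt")

# Default: the header has two fields but each data row has three
write.table(pvals, file = out_default, quote = FALSE, sep = "\t")

# col.names = NA adds a blank first header field so columns line up
write.table(pvals, file = out_aligned, quote = FALSE, sep = "\t",
            col.names = NA)

readLines(out_default)[1]   # "two.tail\tone.tail"
readLines(out_aligned)[1]   # "\ttwo.tail\tone.tail"
```

The second file opens in a spreadsheet with the column names above the correct columns.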

22 Running a series of commands
1. Copy and paste commands into an R session, or
2. Run a script in R, or
    source("compare_WT_KO_weights.R")
   [but not so useful in this case, since we aren't creating any files]
3. Run a script from a Unix terminal
   i) Method 1
      Print output on screen:
        R --vanilla < compare_WT_KO_weights.R
      Print output to a file:
        R --vanilla < compare_WT_KO_weights.R > R_out.txt
   ii) Method 2
        ./compare_WT_KO_weights.R
      but the script must have this as its first line:
        #!/usr/bin/Rscript
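A script meant for the second method might look like the sketch below. The script name, the optional file argument, and the fallback data are all assumptions for illustration; the first line uses `env` to locate Rscript, a common alternative to hard-coding `/usr/bin/Rscript`.

```r
#!/usr/bin/env Rscript
# Hypothetical script, e.g. saved as compare_example.R and run as:
#   Rscript compare_example.R tumors_wt_ko.txt

# Arguments given after the script name (empty if none were given)
args <- commandArgs(trailingOnly = TRUE)

if (length(args) >= 1) {
  # First argument: path to a tab-delimited file with wt and ko columns
  tumors <- read.delim(args[1], header = TRUE)
} else {
  # No file given: fall back to the numbers from the t-test slide
  tumors <- data.frame(wt = c(5, 6, 7), ko = c(8, 9, 11))
}

result <- t.test(tumors$wt, tumors$ko)

# Scripts do not auto-print results, so write them out explicitly
cat("Welch t-test p-value:", signif(result$p.value, 3), "\n")
```

Unlike an interactive session, a script shows a value only when it is printed or written with `print()`, `cat()`, or a file-writing function.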

23 User-defined functions
- Easily perform repetitive tasks; useful within a script
- Syntax
    myfunction <- function(arg1, arg2, ...) {
      body
      return(object)
    }
- Example
    # Given a vector, calculate percent for each element
    calcpercent <- function(x) {
      percent <- 100*x/sum(x)
      return(percent)
    }
    x <- c(7,5,8,10)
    calcpercent(x)
    [1] 23.33333 16.66667 26.66667 33.33333
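As one possible extension of the example, a function can take an argument with a default value and check its input. The name `calc_percent` and the `digits` argument are introduced here for illustration and are not part of the original example.

```r
# Percent of the total for each element, rounded to `digits` decimals
calc_percent <- function(x, digits = 1) {
  if (!is.numeric(x)) {
    stop("x must be numeric")
  }
  round(100 * x / sum(x), digits)   # the last value is returned
}

calc_percent(c(7, 5, 8, 10))               # 23.3 16.7 26.7 33.3
calc_percent(c(7, 5, 8, 10), digits = 0)   # 23 17 27 33
```

The default keeps simple calls short, and the check turns a confusing arithmetic error into a clear message when the input is not numeric.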

24 Introduction to figures
- R is very powerful and very flexible with its figure generation
- Any aspect of a figure should be modifiable
- Some figures aren't available in spreadsheets
- Boxplot example
    boxplot(tumors)   # simplest case
    # Add some more details
    boxplot(tumors, col=c("gray", "red"), main="mfg appears to be a tumor suppressor", ylab="number of tumors")

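25 Boxplot description
[Figure: annotated boxplot in RStudio. The box spans the 25th to the 75th percentile (the IQR), with a line at the median; each whisker extends at most 1.5 x IQR beyond the box.]
- Any points beyond the whiskers are defined as outliers
- Right-click to save the figure (for R GUI)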
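The numbers behind a boxplot can be inspected with `boxplot.stats()`, which applies the same 1.5 x IQR rule. The counts below are invented so that one value falls outside the whiskers.

```r
# Made-up tumor counts with one unusually large value
counts <- c(5, 6, 6, 7, 7, 8, 8, 9, 20)

stats <- boxplot.stats(counts)   # coef = 1.5 by default

# Lower whisker, lower hinge, median, upper hinge, upper whisker
stats$stats   # 5 6 7 8 9

# Points beyond the whiskers, which boxplot() draws individually
stats$out     # 20
```

Here the box runs from 6 to 8, so the upper whisker may reach at most 8 + 1.5 x 2 = 11; it stops at 9, the largest value within that limit, and 20 is drawn as an outlier. R computes the box from hinges, which are close to, but not always identical to, the 25th and 75th percentiles.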

26 Figure formats and sizes
- By default, figures on tak are saved as Rplots.pdf
- Helpful figure names can be included in code
- To select the name and size (in inches) of a PDF file
    pdf("tumor_boxplot.pdf", w=11, h=8.5)
    boxplot(tumors)   # can have >1 page
    dev.off()         # tell R that we're done with this figure
- To create another format (with size in pixels)
    png("tumor_boxplot.png", w=1800, h=1200)
    boxplot(tumors)
    dev.off()
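A sketch of the multi-page behavior mentioned in the comment above: every new plot made while the PDF device is open starts a new page. The data frame is a stand-in, the file goes to a temporary directory, and `width` and `height` are the full names of the abbreviated `w` and `h`.

```r
# Stand-in data (numbers from the t-test slide)
tumors <- data.frame(wt = c(5, 6, 7), ko = c(8, 9, 11))

pdf_file <- file.path(tempdir(), "tumor_boxplot.pdf")

pdf(pdf_file, width = 11, height = 8.5)   # open the device
boxplot(tumors, main = "Page 1: default settings")
boxplot(tumors, col = c("gray", "red"),
        main = "Page 2: with colors")
dev.off()                                 # close it so the file is complete

file.exists(pdf_file)   # TRUE once the device has been closed
```

Forgetting `dev.off()` is a frequent cause of empty or unreadable figure files.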

27 Useful Packages I: tidyverse
- Originally written by Hadley Wickham
- Packages for data analysis: dplyr, tidyr, readr, ggplot2, and others

28 Tidyverse: readr and tidyr
- readr: read in tabular data with read_tsv or read_csv
    data <- read_tsv("normalizedcounts_subset.txt")
- tidyr: "tidy" your data
    messy_data <- data.frame(
      gene=c("GAPDH", "ACTA1", "ZEB1"),
      heart=c(100,140,450), liver=c(241,10,20)
    )
       gene heart liver
    1 GAPDH   100   241
    2 ACTA1   140    10
    3  ZEB1   450    20
    tidy_data <- gather(messy_data, tissue, count, heart:liver)
       gene tissue count
    1 GAPDH  heart   100
    2 ACTA1  heart   140
    3  ZEB1  heart   450
    4 GAPDH  liver   241
    5 ACTA1  liver    10
    6  ZEB1  liver    20
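In current versions of tidyr, `gather()` still works but is superseded by `pivot_longer()`. The sketch below repeats the reshaping above with the newer function, assuming the tidyr package is installed.

```r
library(tidyr)

messy_data <- data.frame(
  gene  = c("GAPDH", "ACTA1", "ZEB1"),
  heart = c(100, 140, 450),
  liver = c(241, 10, 20)
)

# One row per gene-tissue pair: column names go to `tissue`,
# cell values go to `count`
tidy_data <- pivot_longer(messy_data,
                          cols = heart:liver,
                          names_to = "tissue",
                          values_to = "count")
tidy_data
```

The result holds the same six gene-tissue pairs, but `pivot_longer()` orders them gene by gene (GAPDH heart, GAPDH liver, and so on) instead of tissue by tissue, and it returns a tibble.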

29 Tidyverse: dplyr
    Verb        Description
    filter      Select rows based on criteria
    select      Select columns by name
    arrange     Reorder (rows)
    mutate      Create new column
    summarise   Summarise values
    group_by    Group operations
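Each verb in the table can be tried on the tidy expression data from the previous slide. This is a minimal sketch that assumes the dplyr package is installed; the result names and the `log2_count` column are invented for illustration.

```r
library(dplyr)

# Tidy data from the previous slide, rebuilt with base R
tidy_data <- data.frame(
  gene   = rep(c("GAPDH", "ACTA1", "ZEB1"), times = 2),
  tissue = rep(c("heart", "liver"), each = 3),
  count  = c(100, 140, 450, 241, 10, 20)
)

high_counts <- filter(tidy_data, count > 50)         # rows meeting a condition
gene_counts <- select(tidy_data, gene, count)        # columns by name
by_count    <- arrange(tidy_data, desc(count))       # reorder rows
with_log    <- mutate(tidy_data,
                      log2_count = log2(count + 1))  # add a column

# Group, then summarise each group
tissue_means <- summarise(group_by(tidy_data, tissue),
                          mean_count = mean(count))
tissue_means
```

Every verb takes a data frame as its first argument and returns a new one, which is what lets them be chained together.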

30 Useful Packages II: Bioconductor
- Packages/tools for analysis of high-throughput genomic data
- To see all installed packages: library()
- For searchable listing of packages:
- All require the package to be installed AND explicitly called (or, in RStudio, checked), for example, library(limma)
- Install what you need on your computer or, for tak, ask the IT group to install packages via

31 Sampling of Popular Bioconductor Packages*
- limma
- biomaRt
- Rsamtools
- edgeR
- DESeq/DESeq2
- affy
- topGO
- ShortRead
*Top-75 listing, based on download counts

32 Other Useful Commands
    library()   mean()         round(x, n)
    dir()       median()       min()
    length()    sd()           max()
    dim()       rbind()        paste()
    nrow()      cbind()        x[x>0]
    ncol()      sort()         x[c(1,3,5)]
    unique()    rev()          seq(from, to, by)
    t()         log(x, base)   commandArgs()

33 R Resources
- R/Bioconductor short course
- R scripts (and commands) for Bioinformatics
- We're glad to share commands and/or scripts to get you started
- BaRC scripts in \\BaRC_Public\BaRC_code\R
- RStudio Cheat Sheets:


More information

The History and Use of R. Joseph Kambourakis

The History and Use of R. Joseph Kambourakis The History and Use of R Joseph Kambourakis Ground Rules Interrupt me These are all my opinions and not of EMC or Big Data Analytics, Discovery & Visualization Meetup Slides will be available Joseph

More information

R basics workshop Sohee Kang

R basics workshop Sohee Kang R basics workshop Sohee Kang Math and Stats Learning Centre Department of Computer and Mathematical Sciences Objective To teach the basic knowledge necessary to use R independently, thus helping participants

More information

R Website R Installation and Folder R Packages R Documentation R Search R Workspace Interface R Common and Important Basic Commands

R Website R Installation and Folder R Packages R Documentation R Search R Workspace Interface R Common and Important Basic Commands Table of Content R Website R Installation and Folder R Packages R Documentation R Search R Workspace Interface R Common and Important Basic Commands R Website http://www.r project.org/ Download, Package

More information
