BIO5312: R Session 1 An Introduction to R and Descriptive Statistics


1 BIO5312: R Session 1
An Introduction to R and Descriptive Statistics
Yujin Chung
August 30th, 2016
Fall, 2016

2 Introduction to R
R software
  R is both open source and open development.
  You can look at the source code and you can propose changes.
  You can write R functions and publish them.
  R is available for many platforms:
    Unix of many flavors, including Linux, Solaris, FreeBSD, AIX
    Windows 95 and later
    Mac OS X
  Binaries and source code are available from
The R Console
  Basic interaction with R is by typing in the console, a.k.a. terminal or command line.
  You type in commands, R gives back answers (or errors).

3 RStudio
RStudio allows the user to run R in a more user-friendly environment. It is open-source (i.e. free) and available at
  R console
  R script/editor
  Environment/Workspace: all the active objects
  History: a list of commands used so far
  Files: shows all the files and folders in your default working directory
    changing working directory: More > Set As Working Directory
  Plots, Packages, Help

4 Quick start
Mathematical operators/functions
> log(64) # natural logarithm
[1] 4.158883
> sqrt(2) # square root
[1] 1.414214
Q) What are the R outputs of the following?
7+5, 7/5, 7*5, 7-2, 7^2, 7%%5, 7%/%5
Comparisons are also binary operators: they take two objects, like numbers, and give a Boolean
> 7 == 5 # 7 is equal to 5
[1] FALSE
> 7 > 5 # 7 is larger than 5
[1] TRUE
> 7 != 5 # 7 is not equal to 5
[1] TRUE
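The quiz expressions on this slide can be checked directly in the console. A minimal sketch using only base R operators; the vector name `results` and its element names are illustrative and not part of the slides:

```r
# Evaluate each arithmetic expression from the quiz and keep the
# answers in one named vector so they can be compared side by side.
results <- c(
  sum        = 7 + 5,    # addition
  ratio      = 7 / 5,    # division
  product    = 7 * 5,    # multiplication
  difference = 7 - 2,    # subtraction
  power      = 7^2,      # exponentiation
  remainder  = 7 %% 5,   # remainder after dividing 7 by 5
  quotient   = 7 %/% 5   # integer quotient of 7 divided by 5
)
print(results)
```

Note that `%/%` and `%%` are complementary: `5 * (7 %/% 5) + (7 %% 5)` recovers 7.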

5 Quick start II
Boolean operators: & (and), | (or)
Q) What are the R outputs of the following?
(5>7) & (6*7 == 42)
(5>7) | (6*7 == 42)
!(5>7) | (6*7 == 42)
R help
> help(log) # or
> ?log

6 Operators
Arithmetic operators
  Operator     Description
  +            addition
  -            subtraction
  *            multiplication
  /            division
  ^ or **      exponentiation
  x %% y       remainder
  x %/% y      quotient
Logical operators
  Operator     Description
  <            less than
  <=           less than or equal to
  >            greater than
  >=           greater than or equal to
  ==           exactly equal to
  !=           not equal to
  !x           not x
  x | y        x OR y
  x & y        x AND y
  isTRUE(x)    test if x is TRUE
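The logical operators in the table can be tried on the two comparisons used in the Boolean quiz. This is a small sketch; the names `a`, `b`, `and_result`, `or_result` and `not_or` are made up for illustration:

```r
# Two comparisons, each of which yields a single logical value.
a <- (5 > 7)          # FALSE, because 5 is not larger than 7
b <- (6 * 7 == 42)    # TRUE, because 6 * 7 is exactly 42

and_result <- a & b   # TRUE only if both sides are TRUE
or_result  <- a | b   # TRUE if at least one side is TRUE
not_or     <- !a | b  # ! is applied to a first, then | combines with b

# isTRUE() checks that its argument is a single TRUE value.
print(c(and = and_result, or = or_result, not_or = not_or))
print(isTRUE(or_result))
```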

7 Data type
We can give names to data objects; these give us variables!
Variables are created with the assignment operator: = or <- (arrow)
Numeric: numbers, either floating point or integer
> x = 5 # or x <- 5
Character: a character string
> x = "I like chocolate ice cream"
Logical: TRUE or FALSE
> x = (1 > 2)
Built-in variables, e.g. TRUE (or T), FALSE (or F)
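To see which type a variable holds, base R provides `class()`. The sketch below uses illustrative variable names (`x_num`, `x_int`, `x_chr`, `x_lgl`) that do not appear in the slides; the `L` suffix is how R marks an integer literal:

```r
x_num <- 5                             # floating-point number
x_int <- 5L                            # integer (note the L suffix)
x_chr <- "I like chocolate ice cream"  # character string
x_lgl <- (1 > 2)                       # logical value, here FALSE

# Collect the class of each variable in one character vector.
types <- c(class(x_num), class(x_int), class(x_chr), class(x_lgl))
print(types)
```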

8 Data structures
Data structures group data values into one object. Types include:
  vector
  data frame
  list
  matrix
  factors
  tables
Some R packages have their own data structures.

9 Data structures: vectors
Vectors: a sequence of values that are all numeric, character or logical.
Function c() returns a vector containing all its arguments in order.
numeric vector
> x = c(1,2,3)
> x
[1] 1 2 3
> length(x) # the length of x
[1] 3
character vector
> x = c("a","b")
> x
[1] "a" "b"
> length(x)
[1] 2

10 Data structures: vectors II
Sequence generators
> x = seq(from=1, to=3, by=1) # or seq(1,3,1)
> x
[1] 1 2 3
> x = 1:3 # same as seq(1,3,1)
Extracting sub-vectors
> x[2] # return the 2nd element
[1] 2
> x[2:3] # extract the 2nd to 3rd elements
[1] 2 3
> x[c(2,3)] # same as x[2:3]
> x[-2] # drop the 2nd element
[1] 1 3

11 Data structures: vectors III
Element-wise arithmetic
> x = 1:5
> x+1
[1] 2 3 4 5 6
> x <= 3
[1] TRUE TRUE TRUE FALSE FALSE
Pairwise arithmetic
> x = 1:5
> y = c(-1, 0, 3:5)
> x+y
[1]  0  2  6  8 10
> x == y
[1] FALSE FALSE TRUE TRUE TRUE
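Element-wise comparisons produce logical vectors, and a logical vector can itself be used inside `[ ]` to pick elements. This goes one step beyond the slides, so treat it as a sketch; the names `keep`, `small`, `n_equal` and `matching` are illustrative:

```r
x <- 1:5
y <- c(-1, 0, 3:5)

keep  <- x <= 3          # logical vector, one entry per element of x
small <- x[keep]         # keeps only the elements where keep is TRUE

# TRUE counts as 1 and FALSE as 0, so sum() counts the matches.
n_equal  <- sum(x == y)
matching <- x[x == y]    # the elements of x that equal their partner in y

print(small)
print(n_equal)
print(matching)
```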

12 Data structure: data frames
Data frames: a data set that can be represented as a set of observations (rows) on several variables (columns).
Example: Fisher's or Anderson's iris data set (built-in)
> iris
Sepal.Length Sepal.Width Petal.Length Petal.Width S
## Basic information of data frame
> names(iris) # variable names
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Wi
> dim(iris) # the numbers of rows and columns
[1] 150 5
> nrow(iris) # the number of rows
[1] 150
> ncol(iris) # the number of columns

13 Data structure: data frames II
Extracting subset
> iris$Sepal.Length # extracting variable (1st column)
[1]
[19]
> iris[,1] # extracting the 1st column
[1]
[19]
> iris[2,] # extracting the 2nd row
Sepal.Length Sepal.Width Petal.Length Petal.Width Specie
setos
> iris[1,1] # the element in the 1st row and the 1st column
[1] 5.1

14 Data structure: data frames III
Each column of a data frame is a vector
> sepal = iris$Sepal.Length # extracting variable (1st column)
> length(sepal)
[1] 150
> temp = iris[1:10,2]
> length(temp)
[1] 10
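Column names in R are case-sensitive, so `iris$Sepal.Length` and `iris$sepal.length` are not the same request. The sketch below uses the built-in `iris` data; the variable names (`by_name`, `by_index`, `by_label`, `wrong_case`, `first_rows`) are illustrative:

```r
# Three equivalent ways to extract the first column as a vector.
by_name  <- iris$Sepal.Length
by_index <- iris[, 1]
by_label <- iris[["Sepal.Length"]]

print(identical(by_name, by_index))
print(identical(by_name, by_label))

# A name with the wrong capitalisation matches no column and gives NULL.
wrong_case <- iris$sepal.length
print(is.null(wrong_case))

# Rows and columns can be selected together; columns may be given by name.
first_rows <- iris[1:3, c("Sepal.Length", "Species")]
print(first_rows)
```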

15 Writing data
Writing a data set in a text or CSV file
write.table(x, file, quote, sep, row.names, col.names, ...)
  x: data object to save
  file: file name
  quote: if TRUE, character elements are surrounded by double quotes. If FALSE, nothing is quoted
  sep: the field separator string. Values within each row are separated by this string
  row.names: a logical value indicating whether the row names of x are to be written along with x
  col.names: a logical value indicating whether the column names of x are to be written along with x
> write.table(iris, file = "iris.txt", quote=F, sep = " ", row.names = F, col.names=T)

16 Reading data
Reading a data set in a text or CSV file
read.table(file, header, sep, ...)
  file: data file to read
  header: a logical value indicating whether the file contains the names of the variables as its first line
  sep: the field separator string. Values within each row are separated by this string
> iris2 = read.table("iris.txt", header=T, sep=" ")
> lead = read.table("lead.dat.txt", header=T)
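Writing a data frame and reading it back is a quick way to confirm that the `sep` and `header` settings agree. The following is a sketch rather than part of the slides: it writes to a temporary file created with `tempfile()` instead of "iris.txt" so that nothing is left in the working directory, and the names `path` and `iris2` are illustrative.

```r
# Write the built-in iris data to a temporary, space-separated text file.
path <- tempfile(fileext = ".txt")
write.table(iris, file = path, quote = FALSE, sep = " ",
            row.names = FALSE, col.names = TRUE)

# Read it back; header = TRUE tells R that the first line holds the names.
iris2 <- read.table(path, header = TRUE, sep = " ")

print(dim(iris2))
print(names(iris2))
print(all.equal(iris2$Sepal.Length, iris$Sepal.Length))

# Remove the temporary file again.
unlink(path)
```

The separator used for reading has to match the one used for writing; with a mismatch, `read.table()` typically either stops with an error or returns a single wide column.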

17 Functions
Built-in functions: write.table, read.table, read.cross, etc
> names(lead) ## variable names of "lead"
[1] "id" "area" "ageyrs" "sex" "iqv_inf"
[7] "iqv_ar" "iqv_ds" "iqv_raw" "iqp_pc" "iqp_bd"
[13] "iqp_cod" "iqp_raw" "hh_index" "iqv" "iqp"
[19] "iq_type" "lead_grp" "Group" "ld72" "ld73"
[25] "totyrs" "pica" "colic" "clumsi" "irrit"
[31] "X_2plat_r" "X_2plar_l" "visrea_r" "visrea_l" "audrea_r
[37] "fwt_r" "fwt_l" "hyperact" "maxfwt"
> dim(lead)
[1]

18 min(), max() and range()
min(x): the minimum of the argument x
max(x): the maximum of the argument x
range(x): the minimum and maximum of the argument x
> fwt = lead$maxfwt # extracting "maxfwt" and creating a new v
> fwt
[1]
[26]
> min(fwt)
[1] 13
> max(fwt)
[1] 99
> range(fwt)
[1] 13 99
> diff(range(fwt)) # the "range" of fwt
[1] 86

19 mean()
mean(): the arithmetic mean of the argument
sum(): the sum of the argument
> mean(fwt) # arithmetic mean of fwt
[1]
> sum(fwt)/length(fwt)
[1]
> colMeans(lead) # the mean of each column
id area ageyrs sex iqv_inf iqv_c
cf) colMeans(), rowMeans(), colSums(), rowSums()
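Since the lead data file is not bundled with R, the same calls can be tried on the built-in `iris` data. This is a sketch under that substitution; `colMeans()` needs numeric columns, so the `Species` column is left out, and the names `measurements`, `col_means`, `manual` and `row_sums` are illustrative:

```r
# Keep only the four numeric measurement columns of iris.
measurements <- iris[, 1:4]

# Mean of every column in one call.
col_means <- colMeans(measurements)
print(col_means)

# The mean is the sum divided by the number of values.
manual <- sum(iris$Sepal.Length) / length(iris$Sepal.Length)
print(all.equal(unname(col_means["Sepal.Length"]), manual))

# rowSums() works across columns instead, giving one value per row.
row_sums <- rowSums(measurements)
print(length(row_sums))
```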

20 median() and quantile()
median(x): the median of the argument x
quantile(x): returns the minimum, Q1, Q2 (median), Q3 and maximum of x
IQR(x): the IQR of x
> median(fwt) # the median of fwt
[1] 56
> quantile(fwt)
  0%  25%  50%  75% 100%
  13   48   56   72   99
> IQR(fwt)
[1] 24
> quantile(fwt,probs=.25) # Q1, the 25th percentile
25%
 48
> quantile(fwt,probs=.75) - quantile(fwt,probs=.25) # IQR
75%
 24
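The relationship between `quantile()`, `median()` and `IQR()` can be verified on any numeric vector. The sketch below uses `iris$Sepal.Length` as a stand-in for `fwt`, which is an assumption made so that the code runs without the lead data file; the names `q`, `q1`, `q3` and `iqr_manual` are illustrative:

```r
x <- iris$Sepal.Length

# quantile() with default settings returns the five-number summary.
q <- quantile(x)
print(q)

# Pick out Q1 and Q3 by name; unname() drops the "25%" / "75%" labels.
q1 <- unname(q["25%"])
q3 <- unname(q["75%"])

# The IQR is the distance between the third and first quartiles.
iqr_manual <- q3 - q1
print(all.equal(iqr_manual, IQR(x)))

# The 50% quantile is the median.
print(all.equal(unname(q["50%"]), median(x)))
```

The result of subtracting two named quantiles keeps the name of the first operand, which is why the slide output for the IQR is labelled `75%`.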

21 var() and sd()
var(x): the variance of the argument x
sd(x): the standard deviation of the argument x
> var(fwt)
[1]
> sd(fwt)
[1]
> n = length(fwt)
> n
[1] 124
> sum( (fwt - mean(fwt))^2 )/(n-1) # variance
[1]
> sqrt( sum( (fwt - mean(fwt))^2 )/(n-1) ) # standard deviation
[1]
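The two hand computations on this slide correspond to the usual sample variance and sample standard deviation. Written out, with x_i denoting the i-th value of fwt, \bar{x} its mean and n its length (this notation is introduced here and is not from the slides):

```latex
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2,
\qquad
s = \sqrt{s^2}
```

The divisor is n - 1 rather than n, which matches the `(n-1)` in the R expressions and is what `var()` and `sd()` use.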

22 summary()
summary(x): the minimum, Q1, median, mean, Q3 and maximum of the argument x
> summary(fwt)
Min. 1st Qu. Median Mean 3rd Qu. Max.

23 Writing and calling functions
The structure of a function
<function name> = function(arg1, arg2, ...){
  statements
  return(object)
}
We write another summary function, called mysummary, that returns the mean and standard deviation of an argument variable after removing missing values.
> mysummary = function(dat){ # define a function
    res = c(mean(dat, na.rm=TRUE), sd(dat, na.rm=TRUE)) # na.rm=TRUE drops missing values
    return(res)
  }
> mysummary(fwt) # calling a function
[1]
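A sketch of how such a function behaves when the input really contains missing values. It assumes that the `na.rm` argument of `mean()` and `sd()` is the intended way to drop them; the element names in the result and the vector `toy` are made up for illustration:

```r
# Return the mean and standard deviation of dat, ignoring NA values.
mysummary <- function(dat) {
  res <- c(mean = mean(dat, na.rm = TRUE),
           sd   = sd(dat, na.rm = TRUE))
  return(res)
}

# A small vector with one missing value.
toy <- c(2, 4, NA, 6)

print(mysummary(toy))   # computed from 2, 4 and 6 only
print(mean(toy))        # without na.rm the result is NA
```

Without `na.rm = TRUE`, a single `NA` in the input makes both `mean()` and `sd()` return `NA`.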

24 More intro
Some R resources
  The official intro, An Introduction to R, available online
  Norman Matloff, The Art of R Programming: A Tour of Statistical Software Design
  Phil Spector, Data Manipulation with R
  Paul Teetor, The R Cookbook

BINF702 SPRING CHAPTER 2 Descriptive Statistics. BINF702 - SPRING2014- SOLKA - CHAPTER 2 Descriptive Statistics

BINF702 SPRING CHAPTER 2 Descriptive Statistics. BINF702 - SPRING2014- SOLKA - CHAPTER 2 Descriptive Statistics BINF702 SPRING 2014 CHAPTER 2 Section 2.1 - Introduction If our set of observations is small we may study them via enumeration. This is usually not possible though. Ex. 2.1 Some investigators have proposed

More information

Business Statistics: R tutorials

Business Statistics: R tutorials Business Statistics: R tutorials Jingyu He September 29, 2017 Install R and RStudio R is a free software environment for statistical computing and graphics. Download free R and RStudio for Windows/Mac:

More information

An Introduction to Statistical Computing in R

An Introduction to Statistical Computing in R An Introduction to Statistical Computing in R K2I Data Science Boot Camp - Day 1 AM Session May 15, 2017 Statistical Computing in R May 15, 2017 1 / 55 AM Session Outline Intro to R Basics Plotting In

More information

Basics of R. > x=2 (or x<-2) > y=x+3 (or y<-x+3)

Basics of R. > x=2 (or x<-2) > y=x+3 (or y<-x+3) Basics of R 1. Arithmetic Operators > 2+2 > sqrt(2) # (2) >2^2 > sin(pi) # sin(π) >(1-2)*3 > exp(1) # e 1 >1-2*3 > log(10) # This is a short form of the full command, log(10, base=e). (Note) For log 10

More information

Statistics 251: Statistical Methods

Statistics 251: Statistical Methods Statistics 251: Statistical Methods Summaries and Graphs in R Module R1 2018 file:///u:/documents/classes/lectures/251301/renae/markdown/master%20versions/summary_graphs.html#1 1/14 Summary Statistics

More information

MULTIVARIATE ANALYSIS USING R

MULTIVARIATE ANALYSIS USING R MULTIVARIATE ANALYSIS USING R B N Mandal I.A.S.R.I., Library Avenue, New Delhi 110 012 bnmandal @iasri.res.in 1. Introduction This article gives an exposition of how to use the R statistical software for

More information

R: BASICS. Andrea Passarella. (plus some additions by Salvatore Ruggieri)

R: BASICS. Andrea Passarella. (plus some additions by Salvatore Ruggieri) R: BASICS Andrea Passarella (plus some additions by Salvatore Ruggieri) BASIC CONCEPTS R is an interpreted scripting language Types of interactions Console based Input commands into the console Examine

More information

A Brief Introduction to R

A Brief Introduction to R A Brief Introduction to R Babak Shahbaba Department of Statistics, University of California, Irvine, USA Chapter 1 Introduction to R 1.1 Installing R To install R, follow these steps: 1. Go to http://www.r-project.org/.

More information

Introduction to R. Daniel Berglund. 9 November 2017

Introduction to R. Daniel Berglund. 9 November 2017 Introduction to R Daniel Berglund 9 November 2017 1 / 15 R R is available at the KTH computers If you want to install it yourself it is available at https://cran.r-project.org/ Rstudio an IDE for R is

More information

IST 3108 Data Analysis and Graphics Using R. Summarizing Data Data Import-Export

IST 3108 Data Analysis and Graphics Using R. Summarizing Data Data Import-Export IST 3108 Data Analysis and Graphics Using R Summarizing Data Data Import-Export Engin YILDIZTEPE, PhD Working with Vectors and Logical Subscripts >xsum(x) how many of the values were less than

More information

MBV4410/9410 Fall Bioinformatics for Molecular Biology. Introduction to R

MBV4410/9410 Fall Bioinformatics for Molecular Biology. Introduction to R MBV4410/9410 Fall 2018 Bioinformatics for Molecular Biology Introduction to R Outline Introduce R Basic operations RStudio Bioconductor? Goal of the lecture Introduce you to R Show how to run R, basic

More information

Weekly Discussion Sections & Readings

Weekly Discussion Sections & Readings Weekly Discussion Sections & Readings Teaching Fellows (TA) Name Office Email Mengting Gu Bass 437 mengting.gu (at) yale.edu Paul Muir Bass437 Paul.muir (at) yale.edu Please E-mail cbb752@gersteinlab.org

More information

R package

R package R package www.r-project.org Download choose the R version for your OS install R for the first time Download R 3 run R MAGDA MIELCZAREK 2 help help( nameofthefunction )? nameofthefunction args(nameofthefunction)

More information

Mails : ; Document version: 14/09/12

Mails : ; Document version: 14/09/12 Mails : leslie.regad@univ-paris-diderot.fr ; gaelle.lelandais@univ-paris-diderot.fr Document version: 14/09/12 A freely available language and environment Statistical computing Graphics Supplementary

More information

Data Mining - Data. Dr. Jean-Michel RICHER Dr. Jean-Michel RICHER Data Mining - Data 1 / 47

Data Mining - Data. Dr. Jean-Michel RICHER Dr. Jean-Michel RICHER Data Mining - Data 1 / 47 Data Mining - Data Dr. Jean-Michel RICHER 2018 jean-michel.richer@univ-angers.fr Dr. Jean-Michel RICHER Data Mining - Data 1 / 47 Outline 1. Introduction 2. Data preprocessing 3. CPA with R 4. Exercise

More information

This document is designed to get you started with using R

This document is designed to get you started with using R An Introduction to R This document is designed to get you started with using R We will learn about what R is and its advantages over other statistics packages the basics of R plotting data and graphs What

More information

Reading and wri+ng data

Reading and wri+ng data An introduc+on to Reading and wri+ng data Noémie Becker & Benedikt Holtmann Winter Semester 16/17 Course outline Day 4 Course outline Review Data types and structures Reading data How should data look

More information

LECTURE NOTES FOR ECO231 COMPUTER APPLICATIONS I. Part Two. Introduction to R Programming. RStudio. November Written by. N.

LECTURE NOTES FOR ECO231 COMPUTER APPLICATIONS I. Part Two. Introduction to R Programming. RStudio. November Written by. N. LECTURE NOTES FOR ECO231 COMPUTER APPLICATIONS I Part Two Introduction to R Programming RStudio November 2016 Written by N.Nilgün Çokça Introduction to R Programming 5 Installing R & RStudio 5 The R Studio

More information

Statistical Programming with R

Statistical Programming with R Statistical Programming with R Lecture 5: Simple Programming Bisher M. Iqelan biqelan@iugaza.edu.ps Department of Mathematics, Faculty of Science, The Islamic University of Gaza 2017-2018, Semester 1 Functions

More information

Lecture 1: Getting Started and Data Basics

Lecture 1: Getting Started and Data Basics Lecture 1: Getting Started and Data Basics The first lecture is intended to provide you the basics for running R. Outline: 1. An Introductory R Session 2. R as a Calculator 3. Import, export and manipulate

More information

a suite of operators for calculations on arrays, in particular

a suite of operators for calculations on arrays, in particular The R Environment (Adapted from the Venables and Smith R Manual on www.r-project.org and from Andreas Buja s web site for Applied Statistics at http://www-stat.wharton.upenn.edu/ buja/stat-541/notes-stat-541.r)

More information

R basics workshop Sohee Kang

R basics workshop Sohee Kang R basics workshop Sohee Kang Math and Stats Learning Centre Department of Computer and Mathematical Sciences Objective To teach the basic knowledge necessary to use R independently, thus helping participants

More information

What is Matlab? The command line Variables Operators Functions

What is Matlab? The command line Variables Operators Functions What is Matlab? The command line Variables Operators Functions Vectors Matrices Control Structures Programming in Matlab Graphics and Plotting A numerical computing environment Simple and effective programming

More information

Part I { Getting Started & Manipulating Data with R

Part I { Getting Started & Manipulating Data with R Part I { Getting Started & Manipulating Data with R Gilles Lamothe February 21, 2017 Contents 1 URL for these notes and data 2 2 Origins of R 2 3 Downloading and Installing R 2 4 R Console and Editor 3

More information

Fundamentals: Expressions and Assignment

Fundamentals: Expressions and Assignment Fundamentals: Expressions and Assignment A typical Python program is made up of one or more statements, which are executed, or run, by a Python console (also known as a shell) for their side effects e.g,

More information

Introduction to R. Dr. Emile R. Chimusa Department of Integrative Biomedical Sciences University of Cape Town. May 9, 2016

Introduction to R. Dr. Emile R. Chimusa Department of Integrative Biomedical Sciences University of Cape Town. May 9, 2016 Introduction to R Dr. Emile R. Chimusa Department of Integrative Biomedical Sciences University of Cape Town May 9, 2016 1 CONTENTS CONTENTS Contents 1 Getting started in R-RStudio 3 1.1 Getting R and

More information

Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics

Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics Introduction to S-Plus 1 Input: Data files For rectangular data files (n rows,

More information

Introduction to R and Statistical Data Analysis

Introduction to R and Statistical Data Analysis Microarray Center Introduction to R and Statistical Data Analysis PART II Petr Nazarov petr.nazarov@crp-sante.lu 22-11-2010 OUTLINE PART II Descriptive statistics in R (8) sum, mean, median, sd, var, cor,

More information

GRAD6/8104; INES 8090 Spatial Statistic Spring 2017

GRAD6/8104; INES 8090 Spatial Statistic Spring 2017 Lab #1 Basics in Spatial Statistics (Due Date: 01/30/2017) PURPOSES 1. Get familiar with statistics and GIS 2. Learn to use open-source software R for statistical analysis Before starting your lab, create

More information

POL 345: Quantitative Analysis and Politics

POL 345: Quantitative Analysis and Politics POL 345: Quantitative Analysis and Politics Precept Handout 1 Week 2 (Verzani Chapter 1: Sections 1.2.4 1.4.31) Remember to complete the entire handout and submit the precept questions to the Blackboard

More information

Introduction to R and R-Studio Toy Program #1 R Essentials. This illustration Assumes that You Have Installed R and R-Studio

Introduction to R and R-Studio Toy Program #1 R Essentials. This illustration Assumes that You Have Installed R and R-Studio Introduction to R and R-Studio 2018-19 Toy Program #1 R Essentials This illustration Assumes that You Have Installed R and R-Studio If you have not already installed R and RStudio, please see: Windows

More information

Intro to R. Fall Fall 2017 CS130 - Intro to R 1

Intro to R. Fall Fall 2017 CS130 - Intro to R 1 Intro to R Fall 2017 Fall 2017 CS130 - Intro to R 1 Intro to R R is a language and environment that allows: Data management Graphs and tables Statistical analyses You will need: some basic statistics We

More information

Input/Output Data Frames

Input/Output Data Frames Input/Output Data Frames Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Input/Output Importing text files Rectangular (n rows, c columns) Usually you want to use read.table read.table(file,

More information

STATGRAPHICS Operators

STATGRAPHICS Operators STATGRAPHICS Operators An important feature of STATGRAPHICS is the ability to construct expressions that create or transform data on-the-fly. For example, assume that a datasheet contains columns named

More information

Introduction to R. Course in Practical Analysis of Microarray Data Computational Exercises

Introduction to R. Course in Practical Analysis of Microarray Data Computational Exercises Introduction to R Course in Practical Analysis of Microarray Data Computational Exercises 2010 March 22-26, Technischen Universität München Amin Moghaddasi, Kurt Fellenberg 1. Installing R. Check whether

More information

EPIB Four Lecture Overview of R

EPIB Four Lecture Overview of R EPIB-613 - Four Lecture Overview of R R is a package with enormous capacity for complex statistical analysis. We will see only a small proportion of what it can do. The R component of EPIB-613 is divided

More information

CSC Advanced Scientific Computing, Fall Numpy

CSC Advanced Scientific Computing, Fall Numpy CSC 223 - Advanced Scientific Computing, Fall 2017 Numpy Numpy Numpy (Numerical Python) provides an interface, called an array, to operate on dense data buffers. Numpy arrays are at the core of most Python

More information

Statistical Computing (36-350)

Statistical Computing (36-350) Statistical Computing (36-350) Lecture 1: Introduction to the course; Data Cosma Shalizi and Vincent Vu 29 August 2011 Why good statisticians learn how to program Independence: otherwise, you rely on someone

More information

Short Introduction to R

Short Introduction to R Short Introduction to R Paulino Pérez 1 José Crossa 2 1 ColPos-México 2 CIMMyT-México June, 2015. CIMMYT, México-SAGPDB Short Introduction to R 1/51 Contents 1 Introduction 2 Simple objects 3 User defined

More information

No Name What it does? 1 attach Attach your data frame to your working environment. 2 boxplot Creates a boxplot.

No Name What it does? 1 attach Attach your data frame to your working environment. 2 boxplot Creates a boxplot. No Name What it does? 1 attach Attach your data frame to your working environment. 2 boxplot Creates a boxplot. 3 confint A metafor package function that gives you the confidence intervals of effect sizes.

More information

Introduction to R Statistical Package. Eng. Mohammad Khalaf Dep. of Statistics

Introduction to R Statistical Package. Eng. Mohammad Khalaf Dep. of Statistics Introduction to R Statistical Package Eng. Mohammad Khalaf Dep. of Statistics Introduction R is an integrated suite of software facilities for data manipulation, calculation and graphical display. Among

More information

The goal of this handout is to allow you to install R on a Windows-based PC and to deal with some of the issues that can (will) come up.

The goal of this handout is to allow you to install R on a Windows-based PC and to deal with some of the issues that can (will) come up. Fall 2010 Handout on Using R Page: 1 The goal of this handout is to allow you to install R on a Windows-based PC and to deal with some of the issues that can (will) come up. 1. Installing R First off,

More information

Basic R Part 1. Boyce Thompson Institute for Plant Research Tower Road Ithaca, New York U.S.A. by Aureliano Bombarely Gomez

Basic R Part 1. Boyce Thompson Institute for Plant Research Tower Road Ithaca, New York U.S.A. by Aureliano Bombarely Gomez Basic R Part 1 Boyce Thompson Institute for Plant Research Tower Road Ithaca, New York 14853-1801 U.S.A. by Aureliano Bombarely Gomez A Brief Introduction to R: 1. What is R? 2. Software and documentation.

More information

Getting Started in R

Getting Started in R Getting Started in R Giles Hooker May 28, 2007 1 Overview R is a free alternative to Splus: a nice environment for data analysis and graphical exploration. It uses the objectoriented paradigm to implement

More information

Introduction to R. UCLA Statistical Consulting Center R Bootcamp. Irina Kukuyeva September 20, 2010

Introduction to R. UCLA Statistical Consulting Center R Bootcamp. Irina Kukuyeva September 20, 2010 UCLA Statistical Consulting Center R Bootcamp Irina Kukuyeva ikukuyeva@stat.ucla.edu September 20, 2010 Outline 1 Introduction 2 Preliminaries 3 Working with Vectors and Matrices 4 Data Sets in R 5 Overview

More information

Description/History Objects/Language Description Commonly Used Basic Functions. More Specific Functionality Further Resources

Description/History Objects/Language Description Commonly Used Basic Functions. More Specific Functionality Further Resources R Outline Description/History Objects/Language Description Commonly Used Basic Functions Basic Stats and distributions I/O Plotting Programming More Specific Functionality Further Resources www.r-project.org

More information

6 Subscripting. 6.1 Basics of Subscripting. 6.2 Numeric Subscripts. 6.3 Character Subscripts

6 Subscripting. 6.1 Basics of Subscripting. 6.2 Numeric Subscripts. 6.3 Character Subscripts 6 Subscripting 6.1 Basics of Subscripting For objects that contain more than one element (vectors, matrices, arrays, data frames, and lists), subscripting is used to access some or all of those elements.

More information

k Nearest Neighbors Super simple idea! Instance-based learning as opposed to model-based (no pre-processing)

k Nearest Neighbors Super simple idea! Instance-based learning as opposed to model-based (no pre-processing) k Nearest Neighbors k Nearest Neighbors To classify an observation: Look at the labels of some number, say k, of neighboring observations. The observation is then classified based on its nearest neighbors

More information

8.1 R Computational Toolbox Tutorial 3

8.1 R Computational Toolbox Tutorial 3 8.1 R Computational Toolbox Tutorial 3 Introduction to Computational Science: Modeling and Simulation for the Sciences, 2 nd Edition Angela B. Shiflet and George W. Shiflet Wofford College 2014 by Princeton

More information

15 Wyner Statistics Fall 2013

15 Wyner Statistics Fall 2013 15 Wyner Statistics Fall 2013 CHAPTER THREE: CENTRAL TENDENCY AND VARIATION Summary, Terms, and Objectives The two most important aspects of a numerical data set are its central tendencies and its variation.

More information

Why use R? Getting started. Why not use R? Introduction to R: Log into tak. Start R R or. It s hard to use at first

Why use R? Getting started. Why not use R? Introduction to R: Log into tak. Start R R or. It s hard to use at first Why use R? Introduction to R: Using R for statistics ti ti and data analysis BaRC Hot Topics October 2011 George Bell, Ph.D. http://iona.wi.mit.edu/bio/education/r2011/ To perform inferential statistics

More information

University of Wollongong School of Mathematics and Applied Statistics. STAT231 Probability and Random Variables Introductory Laboratory

University of Wollongong School of Mathematics and Applied Statistics. STAT231 Probability and Random Variables Introductory Laboratory 1 R and RStudio University of Wollongong School of Mathematics and Applied Statistics STAT231 Probability and Random Variables 2014 Introductory Laboratory RStudio is a powerful statistical analysis package.

More information

10.4 Measures of Central Tendency and Variation

10.4 Measures of Central Tendency and Variation 10.4 Measures of Central Tendency and Variation Mode-->The number that occurs most frequently; there can be more than one mode ; if each number appears equally often, then there is no mode at all. (mode

More information

10.4 Measures of Central Tendency and Variation

10.4 Measures of Central Tendency and Variation 10.4 Measures of Central Tendency and Variation Mode-->The number that occurs most frequently; there can be more than one mode ; if each number appears equally often, then there is no mode at all. (mode

More information

Using R for statistics and data analysis

Using R for statistics and data analysis Introduction ti to R: Using R for statistics and data analysis BaRC Hot Topics October 2011 George Bell, Ph.D. http://iona.wi.mit.edu/bio/education/r2011/ Why use R? To perform inferential statistics (e.g.,

More information

Advanced Statistics 1. Lab 11 - Charts for three or more variables. Systems modelling and data analysis 2016/2017

Advanced Statistics 1. Lab 11 - Charts for three or more variables. Systems modelling and data analysis 2016/2017 Advanced Statistics 1 Lab 11 - Charts for three or more variables 1 Preparing the data 1. Run RStudio Systems modelling and data analysis 2016/2017 2. Set your Working Directory using the setwd() command.

More information

Getting Started in R

Getting Started in R Getting Started in R Phil Beineke, Balasubramanian Narasimhan, Victoria Stodden modified for Rby Giles Hooker January 25, 2004 1 Overview R is a free alternative to Splus: a nice environment for data analysis

More information

x= suppose we want to calculate these large values 1) x= ) x= ) x=3 100 * ) x= ) 7) x=100!

x= suppose we want to calculate these large values 1) x= ) x= ) x=3 100 * ) x= ) 7) x=100! HighPower large integer calculator intended to investigate the properties of large numbers such as large exponentials and factorials. This application is written in Delphi 7 and can be easily ported to

More information

Introduction to R: Using R for statistics and data analysis

Introduction to R: Using R for statistics and data analysis Why use R? Introduction to R: Using R for statistics and data analysis George W Bell, Ph.D. BaRC Hot Topics November 2014 Bioinformatics and Research Computing Whitehead Institute http://barc.wi.mit.edu/hot_topics/

More information

Introduction to R: Using R for statistics and data analysis
