R programming. 19 February, University of Trento - FBK 1 / 50
- Adele Marshall
2 Hints on programming
1. Save all your commands in a SCRIPT FILE; they will be useful in the future.
2. Save your script file whenever you can! You sweat a lot writing those instructions; you don't want to lose them!
3. Give meaningful names to variables and functions (avoid pippo, pluto, a, b, etc.).
4. Use comments to define sections in your script and describe what each section does. If you read the code after 2 months you won't remember what it does unless you reread every instruction. It's not worth spending time rereading code; use COMMENTS instead.
5. If a value is used in more than one instruction, avoid code repetition and hard-coded values.
    BAD:  sum(a[a>0])
    GOOD: thr <- 0
          sum(a[a>thr])
3 Programming with R
The if-then-else statement
Checks whether a condition is TRUE or FALSE.
Syntax:
    if (expr is TRUE){ do something } else { do something else }
expr can be any logical expression, as seen before.
A simple if statement: if the instruction is on one line and there is no else, there is no need for curly brackets.
    x <- 5
    y <- 2
    ## if (y!=0) xy <- x/y
    ## xy
A more complex if statement:
    x <- 5
    y <- 3
    if (x > 5){
      xy <- x - y ## expr = TRUE
    } else {
      xy <- x + y ## expr = FALSE
    }
    xy
    ## [1] 8
4 Testing conditions using combinations of expressions (& and |)
    a <- 2
    b <- 3
    d <- 4
    # Using & to test two conditions, both true
    if (a<b & b<d) x <- a+b+d
    x
    ## [1] 9
    # Using & to test two conditions, one is false
    if (a>b & b<d) y <- a-b-d
    y
    ## Error in eval(expr, envir, enclos): object 'y' not found
    # Using | to test two conditions, both false
    if (a==b | a>d) z <- a*b*d
    z
    ## Error in eval(expr, envir, enclos): object 'z' not found
    # Using | to test two conditions, one true
    if (a<b | a>d) z <- a*b*d
    z
    ## [1] 24
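As a small sketch of how the two operators behave: & and | work element by element on vectors, whereas if() needs a single TRUE or FALSE. The variable names below (both, either, neither, v, in_range, msg) are illustrative and do not come from the slides.

```r
a <- 2; b <- 3; d <- 4

# & is TRUE only when both conditions hold
both <- (a < b) & (b < d)      # TRUE & TRUE
# | is TRUE when at least one condition holds
either <- (a == b) | (a < d)   # FALSE | TRUE
neither <- (a == b) | (a > d)  # FALSE | FALSE

# On vectors the operators are applied element by element
v <- c(1, 5, 10)
in_range <- (v > 2) & (v < 8)

# if() expects one logical value, so collapse a vector with any() or all()
if (any(in_range)) {
  msg <- "at least one element is in range"
} else {
  msg <- "no element is in range"
}
```

Collapsing with any() or all() makes explicit which question the if() is answering about the whole vector.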
5 Looping
The while() statement
Syntax:
    while ( expr ){ do something }
An example:
    x <- 0 ## set the counter to 0
    while ( x<5 ){ ## repeat the operation as long as x < 5
      x <- x + 1 ## update x
    }
    x
    ## [1] 5
Pay attention to the condition: the loop below never ends, because x is never updated.
    x <- 0
    y <- 0
    ## while (x < 5){
    ##   y <- y + 1
    ## }
6 Looping II
The for() statement
Syntax:
    for (i in start:stop){ do something }
An example:
    y <- vector(mode="numeric") ## Allocating an empty vector of mode "numeric"
    for (i in 1:5){
      y[i] <- i + 2
    }
Nested loops:
    mat <- matrix(nrow=2, ncol=4)
    for (i in 1:2){
      for (j in 1:4){
        mat[i,j] <- i + j
      }
    }
    mat
    ##      [,1] [,2] [,3] [,4]
    ## [1,]    2    3    4    5
    ## [2,]    3    4    5    6
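The loops above can often be replaced by vectorized operations. The following is a minimal sketch, not part of the original slides, that compares the two approaches; the names n_row, n_col, mat_loop, mat_outer, y_loop and y_vec are illustrative.

```r
n_row <- 2
n_col <- 4

# Nested loops, as on the slide
mat_loop <- matrix(nrow = n_row, ncol = n_col)
for (i in 1:n_row) {
  for (j in 1:n_col) {
    mat_loop[i, j] <- i + j
  }
}

# Vectorized alternative: outer() applies "+" to every pair (i, j)
mat_outer <- outer(1:n_row, 1:n_col, "+")

# Single loop with a preallocated result vector
y_loop <- numeric(5)
for (i in seq_along(y_loop)) {
  y_loop[i] <- i + 2
}

# Vectorized alternative: arithmetic works on the whole vector at once
y_vec <- 1:5 + 2
```

Preallocating y_loop with numeric(5) avoids growing the vector at every iteration, which matters for long loops.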
7 Vectors II
Indexing
Use square brackets [] to access a slot in a vector.
    a[2] ## Extract the second element
    ## [1] 89
R starts counting from 1.
    a[0] ## Does not exist!
    ## integer(0)
We can pass multiple indexes using the c() function.
    a[2:3]
    ## [1]
    ## a[2,3] ## What happens here?
What happens when I use a negative number as index?
    b[-1] ## All but the first element
    ## [1]
    e[-c(1,4)] ## All but the first and the fourth elements
    ## Error in eval(expr, envir, enclos): object e not found
NB: Do not use c as a variable name.
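The vector a is defined on a slide that is not part of this transcript, so here is a self-contained sketch of the same indexing rules. The values in a are hypothetical; only a[2] == 89 is taken from the slide.

```r
a <- c(12, 89, 7, 45, 3)   # hypothetical values; only a[2] == 89 matches the slide

second   <- a[2]           # single element
none     <- a[0]           # index 0 selects nothing (a zero-length vector)
slice    <- a[2:3]         # consecutive positions
picked   <- a[c(1, 4)]     # arbitrary positions, combined with c()
dropped  <- a[-1]          # everything except the first element
dropped2 <- a[-c(1, 4)]    # everything except the first and fourth elements

# a[2, 3] is an error for a vector: the comma asks for two dimensions
err <- tryCatch(a[2, 3], error = function(e) conditionMessage(e))
```

The tryCatch() call only captures the error message so that the script keeps running; typing a[2, 3] directly at the console stops with an error.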
8 Subsetting using logical operators
Using logical operators inside indexes
Logical operators can be used to subset a vector: only the elements where the condition is TRUE are selected.
    x <- 5:15
    y <- 10
    x[x > y]
    ## [1] 11 12 13 14 15
    x[x==y]
    ## [1] 10
They can also be used with matrices.
    mymat <- matrix(3:9, ncol=3)
    ## Warning in matrix(3:9, ncol = 3): data length [7] is not a sub-multiple or multiple of the number of rows [3]
    mymat > 7 ## Get TRUE where mymat is bigger than 7
    ##       [,1]  [,2]  [,3]
    ## [1,] FALSE FALSE  TRUE
    ## [2,] FALSE FALSE FALSE
    ## [3,] FALSE  TRUE FALSE
    mymat[mymat>7] ## Get the actual values where mymat is bigger than 7
    ## [1] 8 9
9 Subsetting using logical operators II
Getting indexes: the which() function
Syntax: which(expr)
- works on vectors (also on matrices and data.frames)
- returns the indexes where expr is TRUE
- expr can be any logical expression; combinations of AND and OR are accepted
    mymat > 7
    ##       [,1]  [,2]  [,3]
    ## [1,] FALSE FALSE  TRUE
    ## [2,] FALSE FALSE FALSE
    ## [3,] FALSE  TRUE FALSE
    ## Get the indexes where mymat > 7
    which(mymat>7)
    ## [1] 6 7
    which(mymat>7, arr.ind=TRUE)
    ##      row col
    ## [1,]   3   2
    ## [2,]   1   3
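A compact sketch of the three ways to look at a logical condition on a matrix: the mask itself, the selected values, and the positions. The matrix m is a hypothetical 3 x 3 example chosen so that no recycling warning is raised.

```r
m <- matrix(1:9, ncol = 3)   # hypothetical 3 x 3 matrix, filled column by column

mask    <- m > 7                         # logical matrix, same shape as m
values  <- m[mask]                       # the elements that satisfy the condition
lin_idx <- which(mask)                   # linear positions, counted column by column
rc_idx  <- which(mask, arr.ind = TRUE)   # the same positions as row/column pairs
```

Linear indexes count down the first column, then the second, and so on, which is why which() without arr.ind returns single numbers for a matrix.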
10 Exercises I
1. Given an integer number x, find all its divisors.
2. Given an integer number x, compute the sum of all its divisors.
3. A perfect number is a number whose divisors (apart from the number itself) sum to the number itself. For example, 6 is perfect because 1 + 2 + 3 (the divisors) = 6.
    1. Given an integer number, check whether it is perfect.
    2. Given an integer number x, find all perfect numbers i < x.
11 Functions I
Define your own function
We have seen many functions, such as:
    sum(mymat)
    ## [1] 49
    mean(mymat)
    ## [1] 5.444444
Now you can define your custom function:
    myfunction <- function(arg1, arg2){
      do something with arg1 and arg2
      return(results)
    }
Define a function to convert Fahrenheit to Celsius:
    FtoC <- function(f){
      cels <- (f - 32) * (5/9) ## use the argument f, not F (F means FALSE in R)
      return(cels)
    }
    FtoC(212)
    ## [1] 100
12 Functions II
Define a function to compute the power of a number/vector
Use a default argument:
    mypow <- function(x, exponent=2){
      res <- x^exponent
      return(res)
    }
    mypow(2)
    ## [1] 4
    mypow(3,5)
    ## [1] 243
Variables defined inside a function are valid only inside the function.
    res
    ## Error in eval(expr, envir, enclos): object res not found
Use debug() for debugging a function:
- It runs the function line by line.
- It lets you see the values of the variables inside the function.
- Each time the function is redefined, the debug mode is removed.
- To leave the debugger, type c (continue to the end of the function) or Q (quit).
    debug(mypow)
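The points about default arguments and local variables can be sketched as follows. mypow is the function from the slide; the names sq, cube, swapped, vec and res_is_visible are illustrative.

```r
mypow <- function(x, exponent = 2) {
  res <- x^exponent   # res exists only while the function runs
  return(res)
}

sq      <- mypow(3)                    # exponent falls back to the default, 2
cube    <- mypow(3, exponent = 3)      # override the default by name
swapped <- mypow(exponent = 2, x = 5)  # named arguments can come in any order
vec     <- mypow(1:4)                  # works element by element on vectors

# res was created inside the function, so it is not visible here
# (assuming no object called res was defined earlier in the session)
res_is_visible <- exists("res", inherits = FALSE)
```

Calling arguments by name, as in the third call, makes the code easier to read when a function has many parameters.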
13 Functions II
Function arguments can be passed by position:
    bt <- read.table("../lesson1/example1/bodytemperature.txt", TRUE, " ") ## This will assign the f
    ## Gender Age HeartRate Temperature
    ## 1 M
    ## 2 M
    ## 3 M
    ## 4 F
    ## 5 F
    ## 6 M
Function arguments can be passed by name:
    ## Call arguments by name (position does not count)
    bt <- read.table("../lesson1/example1/bodytemperature.txt", sep=" ", header=TRUE)
    ## Gender Age HeartRate Temperature
    ## 1 M
    ## 2 M
    ## 3 M
    ## 4 F
    ## 5 F
    ## 6 M
14 Data exploration and summary statistics
Develop a high-level understanding of the data
Given a data.frame, let's understand the data inside.
- What variables do we have? Do they have meaningful names?
- What are the variable types (numeric, boolean, categorical)?
- What is the distribution of the data?
- Are there any categorical variables?
The aim is to reduce the amount of information and focus only on the key aspects of the data.
15 Working with data objects
As an example, let's work on the labdf dataset.
    bt <- read.table("bodytemperature.txt", header=TRUE, sep=" ", as.is=TRUE)
    head(bt) ## Let's look only at the first rows of the data.frame
    ## Gender Age HeartRate Temperature
    ## 1 M
    ## 2 M
    ## 3 M
    ## 4 F
    ## 5 F
    ## 6 M
16 Working with data objects
Get the structure and some useful statistics
    str(bt) ## See the structure of the data object
    ## 'data.frame': 100 obs. of 4 variables:
    ## $ Gender     : chr "M" "M" "M" "F"...
    ## $ Age        : int
    ## $ HeartRate  : int
    ## $ Temperature: num
    summary(bt) ## Compute some statistics on each variable in the data.frame
    ##     Gender               Age         HeartRate     Temperature
    ##  Length:100         Min.   :21.0   Min.   :61.0   Min.   : 96.2
    ##  Class :character   1st Qu.:33.8   1st Qu.:69.0   1st Qu.: 97.7
    ##  Mode  :character   Median :37.0   Median :73.0   Median : 98.3
    ##                     Mean   :37.6   Mean   :73.7   Mean   : 98.3
    ##                     3rd Qu.:42.0   3rd Qu.:78.0   3rd Qu.: 98.9
    ##                     Max.   :50.0   Max.   :87.0   Max.   :101.3
    names(bt) ## Get the variable names
    ## [1] "Gender" "Age" "HeartRate" "Temperature"
17 Working with data objects I
Change the variable mode of the columns
Check the variable modes (column names are case sensitive: bt$Age, not bt$age):
    is.data.frame(bt) ## Check if the object is a data.frame
    ## [1] TRUE
    is.numeric(bt$Age) ## Check if the mode of the column is numeric
    ## [1] TRUE
    is.character(bt$Gender) ## Check if the mode of the variable Gender is character
    ## [1] TRUE
Look at the variable Gender: it is categorical, but it is stored as character.
    as.factor(bt$Gender) ## Change the mode of the variable Gender into factor (categorical)
    ## [1] M M M F F M F F F M M F F F F M F M F F F F F M F M M M M F F F M M M
    ## [36] F F M F F M M F M M M F F F F M F M M F F F M F F F M M F M M F M M M
    ## [71] F F M M M M F M F M M F F M F M M M F M F F M M F M F F F M
    ## Levels: F M
18 Working with data objects II
Store the changes in the data.frame and check the data.frame
    bt$Gender <- as.factor(bt$Gender) ## Store the previous change
    str(bt) ## Look at the structure
    ## 'data.frame': 100 obs. of 4 variables:
    ## $ Gender     : Factor w/ 2 levels "F","M":
    ## $ Age        : int
    ## $ HeartRate  : int
    ## $ Temperature: num
    summary(bt) ## Compute some statistics
    ##  Gender      Age         HeartRate     Temperature
    ##  F:51   Min.   :21.0   Min.   :61.0   Min.   : 96.2
    ##  M:49   1st Qu.:33.8   1st Qu.:69.0   1st Qu.: 97.7
    ##         Median :37.0   Median :73.0   Median : 98.3
    ##         Mean   :37.6   Mean   :73.7   Mean   : 98.3
    ##         3rd Qu.:42.0   3rd Qu.:78.0   3rd Qu.: 98.9
    ##         Max.   :50.0   Max.   :87.0   Max.   :101.3
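Since bodytemperature.txt is not available here, the character-to-factor conversion can be tried on a small invented data.frame. Everything in df below is hypothetical and only mimics the column names of the slides.

```r
# Hypothetical miniature of the body temperature data (values invented for illustration)
df <- data.frame(
  Gender = c("M", "F", "F", "M", "F"),
  Age = c(33L, 41L, 29L, 50L, 37L),
  stringsAsFactors = FALSE
)

was_character <- is.character(df$Gender)

# as.factor() returns a new object; assign it back to keep the change
df$Gender <- as.factor(df$Gender)

gender_levels <- levels(df$Gender)   # levels are sorted alphabetically
gender_counts <- table(df$Gender)    # number of rows per level
```

Calling as.factor(df$Gender) without the assignment only prints the converted values and leaves the data.frame unchanged, which is why the slide stores the result back into the column.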
19 Exercise II
1. Define a function that converts km to miles and vice versa.
2. Define a function that checks whether a number is perfect (see Exercises I).
3. Define a function that, given a numeric matrix, returns the log of the matrix where the matrix element is > 0 and NA otherwise.
4. Get the dataset SAheart_sub.data from the website and check the type of each column. Add a column of factor type with Alcoholic where the value of alcohol consumption is > 13 and Non-Alcoholic otherwise.
20 Probability Distributions in R
Probability functions: every probability distribution in R has 4 functions, named by a root (e.g. norm for the normal distribution) and a prefix:
- p for probability, the cumulative distribution function (c.d.f.), $F(x) = P(X \le x)$
- q for quantile, the inverse of the c.d.f., $x = F^{-1}(p)$
- d for density, the density function (p.d.f.), e.g. for the standard normal $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
- r for random, random values drawn from the specified distribution
Example: for the normal distribution we have the functions pnorm, qnorm, dnorm, rnorm.
21 Probability distributions in R
Available functions
    Distribution   Functions
    Binomial       pbinom   qbinom   dbinom   rbinom
    Chi-Square     pchisq   qchisq   dchisq   rchisq
    Exponential    pexp     qexp     dexp     rexp
    Log Normal     plnorm   qlnorm   dlnorm   rlnorm
    Normal         pnorm    qnorm    dnorm    rnorm
    Poisson        ppois    qpois    dpois    rpois
    Student t      pt       qt       dt       rt
    Uniform        punif    qunif    dunif    runif
Check the help (?<function>) for further information on the parameters and the usage of each function.
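A short sketch of how the four prefixes relate to each other, using the normal and binomial roots from the table. The input values and the seed are arbitrary choices for illustration.

```r
p <- pnorm(1.5)          # P(X <= 1.5) for X ~ N(0, 1)
x_back <- qnorm(p)       # the quantile function undoes the c.d.f.

dens0 <- dnorm(0)        # density at 0, equal to 1 / sqrt(2 * pi)

set.seed(1)              # arbitrary seed, only for reproducibility
draws <- rnorm(5)        # five random values from N(0, 1)

# The same prefixes work for other roots, e.g. the binomial distribution
pb <- pbinom(3, size = 10, prob = 0.5)           # P(X <= 3)
db <- sum(dbinom(0:3, size = 10, prob = 0.5))    # same probability, summing P(X = k)
```

For discrete distributions such as the binomial, the d function gives P(X = k) rather than a density, so summing it over 0..3 reproduces the cumulative probability.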
22 The Normal Distribution in R
Cumulative Distribution Function
pnorm: computes the cumulative distribution function $F(x) = P(X \le x)$, where X is normally distributed. Note that the notation N(10,4) gives the variance, while pnorm takes the standard deviation (sd=2).
    ## P(X<=2), X=N(0,1)
    pnorm(2)
    ## [1] 0.9772499
    ## P(X<=12), X=N(10,4)
    pnorm(12, mean=10, sd=2)
    ## [1] 0.8413447
What is P(X > 19) where X = N (17.4, )?
[Figure: normal cumulative distribution function, pnorm against x]
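pnorm always returns the lower tail P(X <= x), so upper-tail and interval probabilities have to be derived from it. The sketch below uses the N(10, 4) example from the slide; the cut-off values 8 and 12 are chosen only for illustration.

```r
mu    <- 10   # mean
sigma <- 2    # standard deviation (the square root of the variance 4)

# Upper tail P(X > 12): complement of the c.d.f.
upper_a <- 1 - pnorm(12, mean = mu, sd = sigma)
# Same value, requested directly with lower.tail = FALSE
upper_b <- pnorm(12, mean = mu, sd = sigma, lower.tail = FALSE)

# Interval P(8 <= X <= 12): difference of two c.d.f. values
between <- pnorm(12, mean = mu, sd = sigma) - pnorm(8, mean = mu, sd = sigma)
```

Here 8 and 12 are one standard deviation below and above the mean, so between is the familiar "about 68%" of a normal distribution.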
23 The Normal Distribution in R
The quantiles
qnorm: computes the inverse of the c.d.f. Given a number $0 \le p \le 1$, it returns the p-th quantile of the distribution: $p = F(x) \Rightarrow x = F^{-1}(p)$
    ## X = F^-1(0.95), N(0,1)
    qnorm(0.95)
    ## [1] 1.644854
    ## X = F^-1(0.95), N(100,625)
    qnorm(0.95, mean=100, sd=25)
    ## [1] 141.1213
What is the 85-th quantile of X = N (72, 68)?
[Figure: normal density, with the area p = pnorm to the left of x = qnorm(p)]
24 The Normal Distribution in R
The Density Function
dnorm: computes the probability density function (p.d.f.) of the normal distribution, $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
    ## f(0.5), X = N(0,1)
    dnorm(0.5)
    ## [1] 0.3520653
    ## f(-2.5), X = N(-1.5,2)
    dnorm(-2.5, mean=-1.5, sd=sqrt(2))
    ## [1] 0.2196956
[Figure: normal density function, dnorm against x]
25 The Normal Distribution in R
The Random Function
rnorm: simulates random variates from a specified normal distribution.
    ## Extract 1000 samples X = N(0,1)
    x <- rnorm(1000)
    ## Extract 1000 samples X = N(100,225)
    x <- rnorm(1000, mean=100, sd=15)
    xx <- seq(min(x), max(x), length=100)
    hist(x, probability=TRUE)
    lines(xx, dnorm(xx, mean=100, sd=15))
[Figure: histogram of x on the density scale, with the normal density curve overlaid]
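Because rnorm draws different values at every call, results are easier to reproduce if the random seed is fixed first. This is a minimal sketch of that idea; the seed 42 is an arbitrary choice and is not taken from the slides.

```r
set.seed(42)                             # arbitrary seed
x1 <- rnorm(1000, mean = 100, sd = 15)

set.seed(42)                             # same seed, so the same draws
x2 <- rnorm(1000, mean = 100, sd = 15)

same_draws  <- identical(x1, x2)
sample_mean <- mean(x1)                  # close to 100, but not exactly 100
sample_sd   <- sd(x1)                    # close to 15, but not exactly 15
```

The sample mean and standard deviation only approximate the parameters passed to rnorm; the approximation improves as the number of samples grows.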
26 Exercise III
1. Compute the values for p = [0.01, 0.05, 0.1, 0.2, 0.25] given X = N ( 2, 8)
2. What is P(X = 1) when X = Bin(25, 0.005)?
3. What is P(13 ≤ X ≤ 22) where X = N (17.46, )?
27 Plotting in R
High-level plot functions
    Function name    Plot produced
    plot(x,y)        Plot vector x against vector y
    boxplot(x)       "Box and whiskers" plot
    hist(x)          Histogram of the frequencies of x
    barplot(x)       Bar plot of the values of x
    pairs(x)         For a matrix or data.frame, plots all bivariate pairs
    image(x,y,z)     3D plot using colors instead of lines
28 Simple visualization on numeric variables
Visualizing two vectors
    x <- 1:10
    y <- 1:10
    plot(x,y)
[Figure: scatter plot of y against x]
29 Simple visualization on numeric variables
Visualizing two vectors, adding axis labels and changing the line type
    plot(x,y, xlab="X values", ylab="Y values", main="X vs Y", type="b")
[Figure: "X vs Y", Y values against X values, points joined by lines]
More graphical parameters can be found in the help of par.
30 Additional parameters to graphical functions
Low-level plotting functions
- Add points/lines to an existing graph using points(x,y) and lines(x,y)
- Add text to an existing plot using text(x,y,label="")
- Add a legend to a plot using legend(x,y,legend="")
    plot(x,y)
    abline(0,1)
    points(2,3, pch=19)
    lines(x,y)
    text(4,6, label="Slope=1")
[Figure: y against x with the line of slope 1, an extra point at (2,3) and the label "Slope=1"]
31 Barplot
The function barplot()
- It plots the frequencies of the values of a variable.
- It is useful for looking at categorical values.
- It takes a vector or a matrix as input and uses the values as frequencies.
    barplot(1:10)
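32 Barplot
The function barplot()
Given a matrix as input (death rates per 1000 population per year in Virginia):
    VADeaths
    ##       Rural Male Rural Female Urban Male Urban Female
    ## 50-54       11.7          8.7       15.4          8.4
    ## 55-59       18.1         11.7       24.3         13.6
    ## 60-64       26.9         20.3       37.0         19.3
    ## 65-69       41.0         30.9       54.6         35.1
    ## 70-74       66.0         54.3       71.1         50.0
    barplot(VADeaths)
[Figure: stacked bars for Rural Male, Rural Female, Urban Male, Urban Female]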
33 Visualization on categorical variables
Summarize the count for factors
    table(bt$Gender) ## Collect the factor levels and count the occurrences of each level
    ##
    ##  F  M
    ## 51 49
Look at the summary in a bar plot
    barplot(table(bt$Gender), xlab="Gender", ylab="Frequency", main="Summarize Gender variable")
[Figure: "Summarize Gender variable", Frequency for F and M]
34 Histograms
The function hist()
- Normally used to visualize numerical variables.
- It is similar to a barplot, but values are grouped into bins.
- For each interval, the bar height corresponds to the frequency (count) of observations in that interval.
- The heights sum to the sample size.
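The claim that the bar heights sum to the sample size can be checked on the object that hist() returns. The data below are simulated and only loosely resemble heart rates; the seed and the parameters are arbitrary assumptions.

```r
set.seed(7)                              # arbitrary seed
x <- rnorm(200, mean = 70, sd = 6)       # hypothetical heart-rate-like values

h <- hist(x, plot = FALSE)               # compute the bins without drawing

n_bins        <- length(h$counts)
total_count   <- sum(h$counts)                     # equals the sample size
total_density <- sum(h$density * diff(h$breaks))   # on the density scale the bars have total area 1
```

The components counts, density and breaks are what hist() uses for drawing, so inspecting them is a quick way to see exactly how the values were grouped.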
35 Look at the distribution of the data
How is the heart rate distributed over our dataset?
Histogram of the HeartRate variable using frequency on the Y axis
    hist(bt$HeartRate, col="gray80")
[Figure: "Histogram of bt$HeartRate", Frequency against bt$HeartRate]
36 Look at the distribution of the data
Density on the Y axis
    hist(bt$HeartRate, col="gray80", freq=FALSE) ## Use parameter freq to change behaviour
[Figure: "Histogram of bt$HeartRate", Density against bt$HeartRate]
37 Look at the distribution of the data
Changing the intervals
    hist(bt$HeartRate, col="gray80", breaks=50) ## Use parameter breaks to change intervals
[Figure: "Histogram of bt$HeartRate", Frequency against bt$HeartRate, narrower bins]
38 Look at the distribution of the data
Adding information to the histogram: mean and median
    hist(bt$HeartRate, col="gray80", main="Histogram of Heart Rate")
    abline(v=mean(bt$HeartRate), lwd=3)
    abline(v=median(bt$HeartRate), lty=3, lwd=3)
    legend("right", legend=c("Mean", "Median"), lty=c(1,3))
[Figure: "Histogram of Heart Rate", Frequency against bt$HeartRate, with vertical lines for Mean and Median]
39 Boxplots
The function boxplot()
Visualizes the 5-number summary, the range and the quartiles.
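The five numbers behind a boxplot can be computed directly. This sketch uses a small vector of invented values; fivenum() and boxplot.stats() are the base R functions, while the data are hypothetical.

```r
x <- c(61, 69, 73, 78, 87, 70, 75)   # hypothetical values

five  <- fivenum(x)                  # minimum, lower hinge, median, upper hinge, maximum
stats <- boxplot.stats(x)$stats      # whisker ends, hinges and median as drawn by boxplot()
```

For this vector the two results coincide because no value lies far enough from the box to count as an outlier; when outliers are present, the whisker ends in stats differ from the minimum and maximum.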
40 Boxplots
Look at the boxplot for the HeartRate variable
    boxplot(bt$HeartRate, horizontal=TRUE, col="grey80")
41 Boxplots
Look at the boxplot for the HeartRate variable
    boxplot(bt$HeartRate, horizontal=TRUE, col="grey80")
    points(bt$HeartRate, rep(1,length(bt$HeartRate)), pch=19) ## See where the data are
    abline(h=1, lty=2)
42 Using factors and formula objects
Using a factor as categorical variable to condition the plot
Condition a plot on the factor using the formula object: bt$HeartRate ~ bt$Gender
The numeric values in bt$HeartRate will be divided according to the categories in bt$Gender.
    boxplot(bt$HeartRate~bt$Gender, horizontal=TRUE, col="grey80")
[Figure: one horizontal boxplot per category, F and M]
43 Pairs
The pairs() function
- It plots all the possible pairwise comparisons in a data.frame.
- It allows a fast visual data exploration.
    pairs(bt) ## Look at all possible comparisons at once
[Figure: scatter plot matrix of Gender, Age, HeartRate, Temperature]
44 Normal plot
Let's look at the variable HeartRate vs Temperature
See the use of ~ in the plot command: the formula is y ~ x, so HeartRate goes on the Y axis.
    ## plot(bt$HeartRate, bt$Temperature)
    plot(bt$HeartRate~bt$Temperature, main="Heart Rate vs Temperature")
[Figure: "Heart Rate vs Temperature", bt$HeartRate against bt$Temperature]
45 Multiple plots on the same window
Put more information together on the same plot
    par(mfrow=c(2,1)) ## Note mfrow defining 2 rows and 1 column for allowing 2 plots
    hist(bt$HeartRate, col="grey80", main="HeartRate histogram")
    abline(v=mean(bt$HeartRate), lwd=3)
    abline(v=median(bt$HeartRate), lty=3, lwd=3)
    legend("right", legend=c("Mean", "Median"), lty=c(1,3))
    boxplot(bt$HeartRate~bt$Gender, horizontal=TRUE, col=c("pink", "blue"))
    title("Boxplot for different gender")
    points(bt$HeartRate[bt$Gender=="F"], rep(1,length(bt$HeartRate[bt$Gender=="F"])), pch=19)
    points(bt$HeartRate[bt$Gender=="M"], rep(2,length(bt$HeartRate[bt$Gender=="M"])), pch=19)
[Figure: top panel "HeartRate histogram" with Mean and Median lines; bottom panel "Boxplot for different gender", F and M]
46 Exporting graphs
It is possible to export graphs in different formats: Png, Jpg, Pdf, Eps, Tiff.
Look at the help for the functions pdf, png.
    pdf("myfirstgraph.pdf") ## Start the pdf device
    par(mfrow=c(2,1))
    hist(bt$HeartRate, col="grey80", main="HeartRate histogram")
    boxplot(bt$HeartRate, horizontal=TRUE, col="grey80", main="Boxplot")
    dev.off() ## Switch off the device
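The export steps above can be tried without the bt data and without leaving files behind. In this sketch the data vector is hypothetical and the output path comes from tempfile(), which is an assumption made here so that nothing in the working directory is overwritten.

```r
out_file <- tempfile(fileext = ".pdf")   # temporary path instead of "myfirstgraph.pdf"
x <- c(61, 69, 73, 78, 87, 70, 75)       # hypothetical values

pdf(out_file)                            # open the pdf device
par(mfrow = c(2, 1))                     # two rows, one column
hist(x, col = "grey80", main = "Histogram")
boxplot(x, horizontal = TRUE, col = "grey80", main = "Boxplot")
dev.off()                                # close the device; the file is complete only after this

file_written <- file.exists(out_file) && file.size(out_file) > 0
```

Forgetting dev.off() is a common reason for empty or unreadable output files, because the plots are still being sent to the open device.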
47 Looking at a probability distribution in a plot
What does a sample from a normal distribution look like?
- Extract enough samples from a N (0, 1)
- Use a histogram to look at the data
    x <- seq(-3,3,by=0.1) ## Create a vector of x values
    y <- dnorm(x) ## Compute the normal density function over the vector x
    plot(x,y,type="l") ## Plot it
[Figure: normal density curve, y against x]
48 Data in R
R comes with a lot of datasets included.
Look at all the available datasets with:
    data() ## See all the available datasets
    data(package = .packages(all.available = TRUE)) ## See all the available dataset in all the pav
    ## Warning in data(package = .packages(all.available = TRUE)): datasets have been moved from package base to package datasets
    ## Warning in data(package = .packages(all.available = TRUE)): datasets have been moved from package stats to package datasets
Get the VADeaths dataset from the datasets package:
    data(VADeaths, package="datasets") ## Load the dataset
    ## ls() ## Check whether the dataset has been loaded
    ## ?VADeaths ## Look at the documentation
49 Exercise I
1. Define a function that transforms Celsius to Fahrenheit. Given the function defined before, think about using an argument to compute the inverse (Fahrenheit to Celsius).
2. Define a function that, given a number, computes the Fibonacci series. What can happen if a float number or a negative number is given?
3. Define a function that, given a number, checks whether it is a prime number.
4. Two integer numbers are friends if the ratio between the sum of the divisors and the number itself is the same for both. For example, the sum of the divisors of 6 is 1 + 2 + 3 + 6 = 12. The sum of the divisors of 28 is 1 + 2 + 4 + 7 + 14 + 28 = 56. Then 12 / 6 = 56 / 28 = 2, thus 6 and 28 are friends. Define a function that, given 2 numbers as input, checks whether the numbers are friends.
5. Fix the number of samples to 1000 and extract at least 8 N (m, 1) where m ∈ [-3, 3]. With the same number of samples extract at least 8 N (0, s) where s ∈ [0.1, 2]. Plot the results in the same window with 3 different plots: one for N (m, 1), one for N (0, s) and one for N (m, 1) and N (0, s) together.
    - Decide the color code for each line (suggestion: search for R color charts in Google and look at the function colors() in R).
    - Plot the different distributions on the same plot.
50 Exercise II
6. Extract from a normal distribution an increasing number of samples ( ) and look at the differences in the distribution between sample sizes.
7. The dataset Pima.tr collects samples from the US National Institute of Diabetes and Digestive and Kidney Diseases. It includes 200 women of Pima Indian heritage living near Phoenix, Arizona. Get the dataset from the MASS package or download it from the website.
    - Describe the dataset: how many variables, which types of variable, how many samples... What do the variables mean?
    - Get the frequencies of the women affected by diabetes.
    - Explore the dataset using histograms, barplots and plots. For each plot, describe what you see and why you made that plot.
    - Use the categorical variable type to see if there is any difference in the distribution of the age, bmi and glu variables.
University of California, Los Angeles Department of Statistics Statistics 12 Instructor: Nicolas Christou Data analysis with R - Some simple commands When you are in R, the command line begins with > To
More informationAn introduction to R WS 2013/2014
An introduction to R WS 2013/2014 Dr. Noémie Becker (AG Metzler) Dr. Sonja Grath (AG Parsch) Special thanks to: Dr. Martin Hutzenthaler (previously AG Metzler, now University of Frankfurt) course development,
More information8. MINITAB COMMANDS WEEK-BY-WEEK
8. MINITAB COMMANDS WEEK-BY-WEEK In this section of the Study Guide, we give brief information about the Minitab commands that are needed to apply the statistical methods in each week s study. They are
More informationMATH11400 Statistics Homepage
MATH11400 Statistics 1 2010 11 Homepage http://www.stats.bris.ac.uk/%7emapjg/teach/stats1/ 1.1 A Framework for Statistical Problems Many statistical problems can be described by a simple framework in which
More informationSTATISTICAL LABORATORY, April 30th, 2010 BIVARIATE PROBABILITY DISTRIBUTIONS
STATISTICAL LABORATORY, April 3th, 21 BIVARIATE PROBABILITY DISTRIBUTIONS Mario Romanazzi 1 MULTINOMIAL DISTRIBUTION Ex1 Three players play 1 independent rounds of a game, and each player has probability
More informationIQR = number. summary: largest. = 2. Upper half: Q3 =
Step by step box plot Height in centimeters of players on the 003 Women s Worldd Cup soccer team. 157 1611 163 163 164 165 165 165 168 168 168 170 170 170 171 173 173 175 180 180 Determine the 5 number
More informationIntroduction to scientific programming in R
Introduction to scientific programming in R John M. Drake & Pejman Rohani 1 Introduction This course will use the R language programming environment for computer modeling. The purpose of this exercise
More informationIntroduction to R 21/11/2016
Introduction to R 21/11/2016 C3BI Vincent Guillemot & Anne Biton R: presentation and installation Where? https://cran.r-project.org/ How to install and use it? Follow the steps: you don t need advanced
More information1 R Basics Introduction to R What is R? Advantages of R Downloading R Getting Started in R...
Contents 1 R Basics 5 1.1 Introduction to R..................................... 5 1.1.1 What is R?.................................... 5 1.1.2 Advantages of R................................. 5 1.1.3
More informationAN INTRODUCTION TO R
AN INTRODUCTION TO R DEEPAYAN SARKAR Language Overview II In this tutorial session, we will learn more details about the R language. Objects. Objects in R are anything that can be assigned to a variable.
More informationS CHAPTER return.data S CHAPTER.Data S CHAPTER
1 S CHAPTER return.data S CHAPTER.Data MySwork S CHAPTER.Data 2 S e > return ; return + # 3 setenv S_CLEDITOR emacs 4 > 4 + 5 / 3 ## addition & divison [1] 5.666667 > (4 + 5) / 3 ## using parentheses [1]
More informationSolution to Tumor growth in mice
Solution to Tumor growth in mice Exercise 1 1. Import the data to R Data is in the file tumorvols.csv which can be read with the read.csv2 function. For a succesful import you need to tell R where exactly
More informationPSS718 - Data Mining
Lecture 5 - Hacettepe University October 23, 2016 Data Issues Improving the performance of a model To improve the performance of a model, we mostly improve the data Source additional data Clean up the
More informationControl Flow Structures
Control Flow Structures STAT 133 Gaston Sanchez Department of Statistics, UC Berkeley gastonsanchez.com github.com/gastonstat/stat133 Course web: gastonsanchez.com/stat133 Expressions 2 Expressions R code
More informationStatistical Graphics
Idea: Instant impression Statistical Graphics Bad graphics abound: From newspapers, magazines, Excel defaults, other software. 1 Color helpful: if used effectively. Avoid "chartjunk." Keep level/interests
More informationA brief introduction to R
A brief introduction to R Cavan Reilly September 29, 2017 Table of contents Background R objects Operations on objects Factors Input and Output Figures Missing Data Random Numbers Control structures Background
More informationStatistical Programming Camp: An Introduction to R
Statistical Programming Camp: An Introduction to R Handout 3: Data Manipulation and Summarizing Univariate Data Fox Chapters 1-3, 7-8 In this handout, we cover the following new materials: ˆ Using logical
More informationSTAT 135 Lab 1 Solutions
STAT 135 Lab 1 Solutions January 26, 2015 Introduction To complete this lab, you will need to have access to R and RStudio. If you have not already done so, you can download R from http://cran.cnr.berkeley.edu/,
More informationLECTURE NOTES FOR ECO231 COMPUTER APPLICATIONS I. Part Two. Introduction to R Programming. RStudio. November Written by. N.
LECTURE NOTES FOR ECO231 COMPUTER APPLICATIONS I Part Two Introduction to R Programming RStudio November 2016 Written by N.Nilgün Çokça Introduction to R Programming 5 Installing R & RStudio 5 The R Studio
More informationA Brief Introduction to R
A Brief Introduction to R Babak Shahbaba Department of Statistics, University of California, Irvine, USA Chapter 1 Introduction to R 1.1 Installing R To install R, follow these steps: 1. Go to http://www.r-project.org/.
More informationTopics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics
Topics for today Input / Output Using data frames Mathematics with vectors and matrices Summary statistics Basic graphics Introduction to S-Plus 1 Input: Data files For rectangular data files (n rows,
More informationThe R statistical computing environment
The R statistical computing environment Luke Tierney Department of Statistics & Actuarial Science University of Iowa June 17, 2011 Luke Tierney (U. of Iowa) R June 17, 2011 1 / 27 Introduction R is a language
More informationIntroduction to R Commander
Introduction to R Commander 1. Get R and Rcmdr to run 2. Familiarize yourself with Rcmdr 3. Look over Rcmdr metadata (Fox, 2005) 4. Start doing stats / plots with Rcmdr Tasks 1. Clear Workspace and History.
More informationStatistical Software Camp: Introduction to R
Statistical Software Camp: Introduction to R Day 1 August 24, 2009 1 Introduction 1.1 Why Use R? ˆ Widely-used (ever-increasingly so in political science) ˆ Free ˆ Power and flexibility ˆ Graphical capabilities
More informationAdvanced Graphics with R
Advanced Graphics with R Paul Murrell Universitat de Barcelona April 30 2009 Session overview: (i) Introduction Graphic formats: Overview and creating graphics in R Graphical parameters in R: par() Selected
More informationMath 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency
Math 1 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency lowest value + highest value midrange The word average: is very ambiguous and can actually refer to the mean,
More informationStatistical Tests for Variable Discrimination
Statistical Tests for Variable Discrimination University of Trento - FBK 26 February, 2015 (UNITN-FBK) Statistical Tests for Variable Discrimination 26 February, 2015 1 / 31 General statistics Descriptional:
More informationAcquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data.
Summary Statistics Acquisition Description Exploration Examination what data is collected Characterizing properties of data. Exploring the data distribution(s). Identifying data quality problems. Selecting
More informationIntroductory Applied Statistics: A Variable Approach TI Manual
Introductory Applied Statistics: A Variable Approach TI Manual John Gabrosek and Paul Stephenson Department of Statistics Grand Valley State University Allendale, MI USA Version 1.1 August 2014 2 Copyright
More informationlimma: A brief introduction to R
limma: A brief introduction to R Natalie P. Thorne September 5, 2006 R basics i R is a command line driven environment. This means you have to type in commands (line-by-line) for it to compute or calculate
More informationLecture Programming in C++ PART 1. By Assistant Professor Dr. Ali Kattan
Lecture 08-1 Programming in C++ PART 1 By Assistant Professor Dr. Ali Kattan 1 The Conditional Operator The conditional operator is similar to the if..else statement but has a shorter format. This is useful
More informationComputing With R Handout 1
Computing With R Handout 1 Getting Into R To access the R language (free software), go to a computing lab that has R installed, or a computer on which you have downloaded R from one of the distribution
More informationR Command Summary. Steve Ambler Département des sciences économiques École des sciences de la gestion. April 2018
R Command Summary Steve Ambler Département des sciences économiques École des sciences de la gestion Université du Québec à Montréal c 2018 : Steve Ambler April 2018 This document describes some of the
More informationUP School of Statistics Student Council Education and Research
w UP School of Statistics Student Council Education and Research erho.weebly.com 0 erhomyhero@gmail.com f /erhoismyhero t @erhomyhero S133_HOA_001 Statistics 133 Bayesian Statistical Inference Use of R
More informationStatistical Programming with R
Statistical Programming with R Dan Mazur, McGill HPC daniel.mazur@mcgill.ca guillimin@calculquebec.ca May 14, 2015 2015-05-14 1 Outline R syntax and basic data types Documentation and help pages Reading
More informationMAT 142 College Mathematics. Module ST. Statistics. Terri Miller revised July 14, 2015
MAT 142 College Mathematics Statistics Module ST Terri Miller revised July 14, 2015 2 Statistics Data Organization and Visualization Basic Terms. A population is the set of all objects under study, a sample
More informationData Management Project Using Software to Carry Out Data Analysis Tasks
Data Management Project Using Software to Carry Out Data Analysis Tasks This activity involves two parts: Part A deals with finding values for: Mean, Median, Mode, Range, Standard Deviation, Max and Min
More informationIntroduction to R: Day 2 September 20, 2017
Introduction to R: Day 2 September 20, 2017 Outline RStudio projects Base R graphics plotting one or two continuous variables customizable elements of plots saving plots to a file Create a new project
More informationIntro to R h)p://jacobfenton.s3.amazonaws.com/r- handson.pdf. Jacob Fenton CAR Director InvesBgaBve ReporBng Workshop, American University
Intro to R h)p://jacobfenton.s3.amazonaws.com/r- handson.pdf Jacob Fenton CAR Director InvesBgaBve ReporBng Workshop, American University Overview Import data Move around the file system, save an image
More informationExploratory Data Analysis September 8, 2010
Exploratory Data Analysis p. 1/2 Exploratory Data Analysis September 8, 2010 Exploratory Data Analysis p. 2/2 Scatter Plots plot(x,y) plot(y x) Note use of model formula Today: how to add lines/smoothed
More informationStochastic Models. Introduction to R. Walt Pohl. February 28, Department of Business Administration
Stochastic Models Introduction to R Walt Pohl Universität Zürich Department of Business Administration February 28, 2013 What is R? R is a freely-available general-purpose statistical package, developed
More informationA (very) short introduction to R
A (very) short introduction to R Paul Torfs & Claudia Brauer Hydrology and Quantitative Water Management Group Wageningen University, The Netherlands 1 Introduction 16 April 2012 R is a powerful language
More informationSTA 570 Spring Lecture 5 Tuesday, Feb 1
STA 570 Spring 2011 Lecture 5 Tuesday, Feb 1 Descriptive Statistics Summarizing Univariate Data o Standard Deviation, Empirical Rule, IQR o Boxplots Summarizing Bivariate Data o Contingency Tables o Row
More informationPrepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.
Chapter 2 2.1 Descriptive Statistics A stem-and-leaf graph, also called a stemplot, allows for a nice overview of quantitative data without losing information on individual observations. It can be a good
More informationExcel 2010 with XLSTAT
Excel 2010 with XLSTAT J E N N I F E R LE W I S PR I E S T L E Y, PH.D. Introduction to Excel 2010 with XLSTAT The layout for Excel 2010 is slightly different from the layout for Excel 2007. However, with
More informationMaximum Likelihood Estimation of Neighborhood Models using Simulated Annealing
Lora Murphy October, 2005 Maximum Likelihood Estimation of Neighborhood Models using Simulated Annealing A User s Guide to the neighparam Package for R 1. Syntax...2 2. Getting Ready for R...2 3. The First
More informationplots Chris Parrish August 20, 2015
plots Chris Parrish August 20, 2015 plots We construct some of the most commonly used types of plots for numerical data. dotplot A stripchart is most suitable for displaying small data sets. data
More informationModule 10. Data Visualization. Andrew Jaffe Instructor
Module 10 Data Visualization Andrew Jaffe Instructor Basic Plots We covered some basic plots on Wednesday, but we are going to expand the ability to customize these basic graphics first. 2/37 But first...
More informationDr. V. Alhanaqtah. Econometrics. Graded assignment
LABORATORY ASSIGNMENT 4 (R). SURVEY: DATA PROCESSING The first step in econometric process is to summarize and describe the raw information - the data. In this lab, you will gain insight into public health
More informationWork through the sheet in any order you like. Skip the starred (*) bits in the first instance, unless you re fairly confident.
CDT R Review Sheet Work through the sheet in any order you like. Skip the starred (*) bits in the first instance, unless you re fairly confident. 1. Vectors (a) Generate 100 standard normal random variables,
More informationR practice. Eric Gilleland. 20th May 2015
R practice Eric Gilleland 20th May 2015 1 Preliminaries 1. The data set RedRiverPortRoyalTN.dat can be obtained from http://www.ral.ucar.edu/staff/ericg. Read these data into R using the read.table function
More informationChapter2 Description of samples and populations. 2.1 Introduction.
Chapter2 Description of samples and populations. 2.1 Introduction. Statistics=science of analyzing data. Information collected (data) is gathered in terms of variables (characteristics of a subject that
More information