R. Muralikrishnan Max Planck Institute for Empirical Aesthetics Frankfurt. 08 June 2017
2 Introduction
3 What is R?
R is a programming language for statistical computing and graphics.
R is free and open-source software available on Linux, Windows and OS X.
R implements a wide variety of statistical and graphical techniques.
User-created packages vastly extend and enhance the capabilities of R.
4 Is it useful for me?
If you are going to run an empirical study and would like to run the statistical analysis independently and make some beautiful and meaningful graphical depictions of your data, or if you would like to do some text corpus analysis, then yes, very much!
5 Is it difficult?
Well, let's say the learning curve in the beginning is a bit steep ;-) But rest assured that it is every bit worth it.
R-related questions or problems? Most probably someone else has already looked for answers online, which means the answers are in most cases just an online search away!
6 Installing R
Download and install R for free.
Core packages, functions and the R console are installed by default.
R commands can then be issued via the text-based R console.
Additional packages can be installed on the fly as and when necessary.
If you like, you can learn the very basics of R even before installing it: Try-R is a browser-based basic R tutorial.
7 Installing RStudio (optional, but recommended)
RStudio is one of the development environments available for R.
Download and install RStudio.
Using RStudio, you can use the full capability of R, design web apps, or even create presentations of this sort.
8 Alright, I've installed things, what next?
You're all set to explore, visualise and analyse your data and more. Just a couple of things to know before starting the R console:
1. If you would like to install a package that is not already installed:
   In RStudio: Tools -> Install Packages
   In R console: Enter install.packages("packagename")
2. Set the working directory to the one in which you have your data files:
   In RStudio: Session -> Set Working Directory -> Choose Directory
   In R console: Use the command setwd("path/to/directory")
3. And any time you have questions about a certain R function:
   Type help(functionname) to read the documentation
   Type example(functionname) to see a usage example
9 R Basics
10 First steps
The > R prompt indicates that R is ready to receive and interpret commands.
We can now type commands into the R console:
>
## [1] 3
> bla <- 100   # Assign the value 100 to the variable / object bla
> # <- is the assignment operator in R
>
> bla          # Now, just typing bla prints its value
## [1] 100
> Bla          # Bla is not the same as bla! Everything is case-sensitive!
> # So this simply returns an error message.
11 More examples
> var_a <- 1
> var_a
## [1] 1
> var_b <- 2
> var_b
## [1] 2
> var_c <- var_a + var_b
> var_c
## [1] 3
12 Objects and data types
Objects / variables are simply handles or names for different kinds of data.
Object names may be alphanumeric, but must begin with a letter.
No spaces allowed in object names!
Every object is of a certain data type.
> var_a <- 1
> typeof(var_a)             # typeof(xyz) : returns the type of the object xyz
## [1] "double"
> var_vector <- c(1,2,3)    # c() : combines multiple objects
> # of the same type into a vector
> var_name <- "Mr. Bean"    # Notice the " "?
> # ==> character string object
> typeof(var_name)
## [1] "character"
13 Let's type
> var_name <- "Mr. Bean"
> var_age <- 45
And now type
> var_x <- var_name + var_age
What is the output you get?
14 Let's try
> var_y <- c(var_name, var_age)
> var_y
## [1] "Mr. Bean" "45"
> typeof(var_y)
## [1] "character"
Why?
15 Why is the type of an object important?
Identifying objects with a certain data type ensures data integrity, because only functions appropriate for that data type can be applied to them.
Refer: Quick-R Data Types
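The "Why?" on this slide comes down to type coercion: c() builds an atomic vector, which can hold only one type, so mixed inputs are converted to the most general type present. A minimal sketch using the same two values (the extra checks are illustrative additions, not part of the original slides):

```r
var_name <- "Mr. Bean"
var_age  <- 45

# c() coerces everything to the most general type involved:
# logical < integer < double < character
var_y <- c(var_name, var_age)
typeof(var_y)               # "character"

typeof(c(TRUE, 2L))         # "integer"
typeof(c(2L, 3.5))          # "double"

# Convert the age back to a number when it is needed for arithmetic
age_next_year <- as.numeric(var_y[2]) + 1
age_next_year               # 46
```

This is also why var_name + var_age fails: + is defined for numbers, not for character strings.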
16 Scalars
Scalars are nothing but singleton values.
> # Singleton values                  # typeof(.) returns the following
> var_numeric_int <- 1               # 'double'!
>
> var_numeric_double <- 1.0          # 'double'
>
> var_char_string <- "A1"            # 'character'
>
> var_logical_tf <- TRUE             # 'logical'
> # Not a character string! No " ", see?
> var_logical_notavailable <- NA     # 'logical'!
17 Vectors
Vectors are 1D arrays.
The elements of a vector must all be of the identical type!
> var_vector_numeric <- c(1,2,3)
> var_vector_char <- c("a","b","1")
> var_vector_logical <- c(TRUE, FALSE)   # Notice the absence of ""?
18 Matrices and Data Frames
Matrices are 2D arrays.
All columns of a matrix must be of the identical type and length!
Data frames are more generic than matrices; they are comparable to Excel tables.
Each column can be of any type.
Each column is accessible as a vector.
Data frames are the most common type of data in R.
Refer: Quick-R Data Types
Refer: Quick-R Matrices
19 Arithmetic operators
Arithmetic operators work on scalars, vectors and matrices.
They are also called binary operators in R.
Operator   Description
+          addition
-          subtraction
*          multiplication
/          division
**         exponentiation (the circumflex ^ also works)
x %% y     modulus (x mod y); 5 %% 2 is 1
x %/% y    integer division; 5 %/% 2 is 2
20 Logical Operators
Logical operators are for comparing things; they return TRUE / FALSE.
Operator   Description
<          less than
<=         less than or equal to
>          greater than
>=         greater than or equal to
==         exactly equal to
!=         not equal to
!x         not x
x && y     short-circuit AND; for single values; used in if checks
x || y     short-circuit OR; for single values; used in if checks
x & y      vectorised AND (applies to all elements in a vector)
x | y      vectorised OR (applies to all elements in a vector)
21 Loops, condition checks, user-defined functions etc.
Refer: Base R Cheatsheet (base-r.pdf)
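As a small illustration of the difference between the two structures, the sketch below builds one of each. The participant IDs and reaction times are made-up toy values, not data from the slides.

```r
# A matrix holds a single type; here 2 rows x 3 columns of integers
m <- matrix(1:6, nrow = 2, ncol = 3)
dim(m)            # 2 3

# A data frame can mix types, one type per column (hypothetical toy data)
df <- data.frame(
  Participant = c("S001", "S002", "S003"),
  RT          = c(0.52, 0.61, 0.47),
  Correct     = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

str(df)           # shows the type of each column
df$RT             # each column is accessible as a vector via $
mean_rt <- mean(df$RT)
mean_rt
```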
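The operators in the table can be tried directly on the console. A short sketch, including the element-by-element behaviour on vectors (the variable names are illustrative only):

```r
5 + 2        # 7
5 %% 2       # 1  (remainder)
5 %/% 2      # 2  (integer division)
2 ^ 3        # 8
2 ** 3       # 8  (** is accepted as a synonym for ^)

# On vectors, the operators work element by element
v <- c(1, 2, 3)
doubled <- v * 2              # 2 4 6
summed  <- v + c(10, 20, 30)  # 11 22 33
doubled
summed
```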
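The difference between the vectorised and the short-circuit operators is easiest to see side by side. A minimal sketch with made-up values:

```r
x <- c(1, 5, 10)

# Vectorised operators compare element by element
in_range <- x > 3 & x < 8     # FALSE TRUE FALSE
extreme  <- x < 3 | x > 8     # TRUE FALSE TRUE
!(x > 3)                      # TRUE FALSE FALSE

# && and || expect single values, typically inside an if check
a <- 5
if (a > 3 && a < 8) {
  message("a lies between 3 and 8")
}
```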
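Since this slide only points to the cheatsheet, here is a minimal sketch of the three constructs it names. The function classify_rt, its threshold and the reaction times are hypothetical and chosen purely for illustration.

```r
# A user-defined function containing a condition check
classify_rt <- function(rt, threshold = 1) {
  if (is.na(rt)) {
    "missing"
  } else if (rt > threshold) {
    "slow"
  } else {
    "fast"
  }
}

rts <- c(0.4, 1.7, NA)   # hypothetical reaction times in seconds

# A for loop over the values
for (rt in rts) {
  print(classify_rt(rt))
}

# The same without an explicit loop: apply the function to each element
labels <- sapply(rts, classify_rt)
labels
```

In everyday R code the sapply form (or a vectorised function) is usually preferred over the explicit loop.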
22 Data Analysis Workflow
23 Workflow
General workflow (these steps often happen in a repeating cycle):
1. Read data into R. Input files can be, among other things: an Excel sheet; a comma-, space- or tab-separated text file; an XML file; something directly from the web; or running text such as a corpus.
2. Understand the data structure and what you want to do with it.
3. Transform the data to do what you want.
4. Do what you want: calculate descriptive statistics, generate plots, run various statistical tests, you name it!
5. Save your R code for later use, say as SomethingMeaningful.R.
24 R Scripts
The code we write on the console can be saved as an R script.
In RStudio: File -> New -> R Script opens an editor panel to write and save code.
Elsewhere: Simply use any text editor to write and save code.
Save the file as, say, SomethingMeaningful.R.
To save the output generated by an R script (and not see it on the console), include sink("meaningfuloutput.txt") at the beginning of the R script and sink() at the end.
Now you can source the R script, meaning execute all the commands in it at once.
In RStudio, just click the Source button above the editor panel.
In R console, type source("SomethingMeaningful.R")
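The sink() / source() combination can be tried without creating files by hand. The sketch below writes a tiny script to a temporary file, sources it, and reads back what was captured; the temporary file names and the script contents are assumptions made for the example.

```r
script_file <- tempfile(fileext = ".R")
output_file <- tempfile(fileext = ".txt")

# Write a small script that redirects its printed output to output_file
writeLines(c(
  "sink(output_file)      # redirect printed output to the file",
  "x <- 1:10",
  "print(mean(x))",
  "sink()                 # restore output to the console"
), script_file)

# Execute all commands in the script at once
source(script_file, local = TRUE)

# Inspect what the script wrote
saved <- readLines(output_file)
saved
```

Note that inside a sourced script, results are not printed automatically as they are on the console, hence the explicit print().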
25 Data formats: Wide-format Data
Each row contains multiple variables of interest for each observation.
> WideData
## # A tibble: 4 x 7
##   Participant ExpV RT1 RT2 RT3 RT4 YorN
##   <chr> <int> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 S Y
## 2 S N
## 3 S Y
## 4 S Y
26 Data formats: Long-format Data
Each row contains a single variable of interest for a single observation.
> LongData
## # A tibble: 16 x 5
##    Participant ExpV YorN Trial Measurement
##    <chr> <int> <chr> <chr> <dbl>
## 1 S001 1 Y RT
## 2 S001 1 Y RT
## 3 S001 1 Y RT
## 4 S001 1 Y RT
## 5 S002 1 N RT
## 6 S002 1 N RT
## 7 S002 1 N RT
## 8 S002 1 N RT
## 9 S003 2 Y RT
## 10 S003 2 Y RT
27 Which format is best?
In most cases, long-format data is the easiest to work with, because each observation is in its own row and each variable is in its own column.
Transforming, visualising and analysing long-format data is straightforward.
There are R packages to convert between the two formats.
Refer: R Cookbook, wide to long format and vice versa
We'll learn one of the methods soon.
28 Read data from text files: a basic example
Import tabular text data as a data frame
> BehavData <- read.table("allres.txt")
> # BehavData : typing the data frame name displays the whole df
> head(BehavData)   # head() : displays the first few rows of the df
##     V1 V2  V3 V4 V5 V6 V7 V8
## 1 NF01 16 FOS F OS C 2
## 2 NF01 25 MOS M OS C 2
## 3 NF01 13 FSO F SO C 2
## 4 NF01 12 MSO M SO C 2
## 5 NF01 4 FOS F OS C 2
## 6 NF01 8 FSO F SO C 2
> # tail(BehavData)   # tail() : displays the last few rows of the df
# We won't use this method to import data for very long! We will learn a better method in a bit.
29 Data structure and its dimensions
> str(BehavData)   # str() : displays the structure of the object
## 'data.frame': 3478 obs. of 8 variables:
## $ V1: Factor w/ 29 levels "NF01","NF02",..:
## $ V2: int
## $ V3: Factor w/ 4 levels "FOS","FSO","MOS",..:
## $ V4: Factor w/ 2 levels "F","M":
## $ V5: Factor w/ 2 levels "OS","SO":
## $ V6: num
## $ V7: Factor w/ 2 levels "C","X":
## $ V8: int
> dim(BehavData)   # dim() : displays the dimensions of the object
## [1] 3478 8
30 Name columns in a data frame
> names(BehavData) <-
+   c("Subj", "Item", "Condition", "WF1", "WF2", "RT",
+     "Accuracy", "Response")
> head(BehavData)
##   Subj Item Condition WF1 WF2 RT Accuracy Response
## 1 NF01 16 FOS F OS C 2
## 2 NF01 25 MOS M OS C 2
## 3 NF01 13 FSO F SO C 2
## 4 NF01 12 MSO M SO C 2
## 5 NF01 4 FOS F OS C 2
## 6 NF01 8 FSO F SO C 2
# Again, we won't need this when we learn the better method to import data soon.
31 Access different fields of a data frame
> head(BehavData)
##   Subj Item Condition WF1 WF2 RT Accuracy Response
## 1 NF01 16 FOS F OS C 2
## 2 NF01 25 MOS M OS C 2
## 3 NF01 13 FSO F SO C 2
## 4 NF01 12 MSO M SO C 2
## 5 NF01 4 FOS F OS C 2
## 6 NF01 8 FSO F SO C 2
> head(BehavData$RT)
## [1]
32 Plots and Statistics
33 Histogram
> library(ggplot2)
> ggplot(BehavData, aes(x = RT)) + geom_histogram(binwidth = 0.2)
[Figure: histogram; x-axis RT, y-axis count]
34 Density Plot
> ggplot(BehavData, aes(x = RT)) + geom_density()
[Figure: density plot; x-axis RT, y-axis density]
35 Checking for Normality: Q-Q Norm Plot
> ggplot(BehavData) + geom_qq(aes(sample = RT))
[Figure: Q-Q plot; x-axis theoretical, y-axis sample]
36 Statistical Normality Tests
> # Anderson-Darling normality test
> library(nortest)
> ad.test(BehavData$RT)
##
## Anderson-Darling normality test
##
## data: BehavData$RT
## A = , p-value < 2.2e-16
> # Shapiro-Wilk normality test
> shapiro.test(BehavData$RT)
##
## Shapiro-Wilk normality test
##
## data: BehavData$RT
## W = , p-value < 2.2e-16
37 Statistical Normality Tests
> # Kolmogorov-Smirnov normality test
> ks.test(BehavData$RT, "pnorm")
## Warning in ks.test(BehavData$RT, "pnorm"): ties should not be present for
## the Kolmogorov-Smirnov test
##
## One-sample Kolmogorov-Smirnov test
##
## data: BehavData$RT
## D = , p-value < 2.2e-16
## alternative hypothesis: two-sided
Refer: Blog entry on the topic
Refer: Stackexchange page on the topic
38 Mean, Median, Standard Deviation
> # Arithmetic Mean
> mean(BehavData$RT)
## [1]
> # Median
> median(BehavData$RT)
## [1]
> # Standard Deviation
> sd(BehavData$RT)
## [1]
> # Variance = SD^2
> var(BehavData$RT)
## [1]
Refer: Quick-R Descriptive Statistics
39 Aggregating over factors
Calculate mean, sd etc. over specified factor(s): aggregate function
> # Aggregate Variable by a single Factor
> aggregate(Variable ~ Factor, data = XyzData, FUN = mean)
> # FUN = mean => calculate mean;
> # Other possible options: sd, var, length...
>
> # Aggregate Variable by multiple Factors
> aggregate(Variable ~ Factor1 * Factor2, data = XyzData, FUN = mean)
> # The Variable ~ Factors part is referred to as the 'formula'
40 Aggregating over factors
> RT_m_Subj <-
+   aggregate(RT ~ Subj, data = BehavData, FUN = mean, na.rm = TRUE)
> # RT ~ Subj => aggregate RT by the factor Subj
> # na.rm = TRUE => exclude missing values (NA = not available)
> head(RT_m_Subj)
##   Subj RT
## 1 NF
## 2 NF
## 3 NF
## 4 NF
## 5 NF
## 6 NF
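The template above uses placeholder names (Variable, Factor, XyzData). To see it run, here is a sketch on ToothGrowth, a dataset that ships with R; it is used only as a stand-in and has nothing to do with the data analysed in these slides.

```r
# ToothGrowth: len (numeric), supp (factor: OJ / VC), dose (numeric)

# Mean of len by a single factor
by_supp <- aggregate(len ~ supp, data = ToothGrowth, FUN = mean)
by_supp

# Mean of len by two factors
by_supp_dose <- aggregate(len ~ supp * dose, data = ToothGrowth, FUN = mean)
by_supp_dose

# Any summary function can be passed as FUN, e.g. the standard deviation
aggregate(len ~ supp, data = ToothGrowth, FUN = sd)
```

The result is itself a data frame with one row per factor combination, so it can be fed into further analyses.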
41 ANOVA
> # Repeated Measures ANOVA : Reaction Time -- Analysis by SUBJECTS
> # To test if the SUBJECTS differ significantly between each other
>
> # First calculate a mean per subject per condition.
> RT_m_Subj_WF1_WF2 <- aggregate(RT ~ Subj * WF1 * WF2,
+                                data = BehavData, FUN = mean, na.rm = TRUE)
> # Run the ANOVA
> RT_aov_Subj <- aov(RT ~ WF1 * WF2 + Error(Subj/(WF1*WF2)),
+                    data = RT_m_Subj_WF1_WF2)
>
> print(summary(RT_aov_Subj))
42 ANOVA
> # Repeated Measures ANOVA : Reaction Time -- Analysis by ITEMS
> # To test if the ITEMS differ significantly between each other
> BehavData$Item <- as.factor(BehavData$Item)
> # First calculate a mean per item per condition.
> RT_m_Item_WF1_WF2 <- aggregate(RT ~ Item * WF1 * WF2,
+                                data = BehavData, FUN = mean,
+                                na.rm = TRUE)
> # Run the ANOVA
> RT_aov_Item <- aov(RT ~ WF1 * WF2 + Error(Item/(WF1*WF2)),
+                    data = RT_m_Item_WF1_WF2)
>
> print(summary(RT_aov_Item))
43 Correlations, t-tests, ...
Refer: An exhaustive list of statistical tests
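As a taste of the tests this slide alludes to, the sketch below runs a correlation test and two t-tests from base R. It uses the built-in datasets mtcars, ToothGrowth and sleep as stand-ins; they are assumptions for the example, not the data from the slides.

```r
# Pearson correlation between two numeric variables
ct <- cor.test(mtcars$mpg, mtcars$wt)
ct$estimate      # correlation coefficient
ct$p.value

# Independent two-sample t-test (Welch test by default)
tt <- t.test(len ~ supp, data = ToothGrowth)
tt

# Paired t-test: the same 10 subjects measured under two conditions
g1 <- sleep$extra[sleep$group == 1]
g2 <- sleep$extra[sleep$group == 2]
paired <- t.test(g1, g2, paired = TRUE)
paired
```

Each of these functions returns a list-like object, so individual values such as the p-value can be extracted with $ for reporting.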
46 Good to know
Many ways to do the same thing: many common tasks can be accomplished in more than one way in R.
This is both appealing and frustrating, depending on the context. Hmmm...
This raises the question: wouldn't it be lovely if there were a way to do most of the common tasks in a consistent manner?
Enter: The Tidyverse
47 The Tidyverse
48 The Tidyverse
A collection of R packages that share common philosophies and are designed to work together (tidyverse.org).
Goal: Solve complex problems by combining simple, uniform pieces!
Package design (see "Data Science in tidyverse" by Hadley Wickham):
One function = one task.
Input and output of every function is a tidy data frame (= tibble).
Consequence: tidyverse functions are pipeable!
> install.packages("tidyverse")   # Installs the tidyverse collection
> # Curious what pipeable means? Wait a bit more to know :-)
49 The Tidyverse
> library(tidyverse)   # Loads the core tidyverse packages
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Conflicts with tidy packages
## filter(): dplyr, stats
## lag(): dplyr, stats
> library(readxl)      # Other tidyverse packages are loaded when needed
50 Tidy Data
Each variable is a column, each observation / case is a row!
See Wickham, H. (2014). Tidy Data. Journal of Statistical Software, 59(10).
> LongData
## # A tibble: 16 x 5
##   Participant ExpV YorN Trial Measurement
##   <chr> <int> <chr> <chr> <dbl>
## 1 S001 1 Y RT
## 2 S001 1 Y RT
## 3 S001 1 Y RT
## 4 S001 1 Y RT
## 5 S002 1 N RT
## 6 S002 1 N RT
## 7 S002 1 N RT
## 8 S002 1 N RT
51 Read data from text files: readr::read_delim
> library(tidyverse)   # This also loads readr, among other packages!
> # For a comma-separated file with a header row present in the input file
> ExpData <- read_delim("filename.csv", delim = ",", col_names = TRUE)
> # delim => delimiter, i.e., the column separator in the input
>
> # For a tab-separated file with no header row present in the input
> ExpData <- read_delim("filename.txt", delim = "\t",
+                       col_names = c("Subject", "Task", "RT"))
> # We provide meaningful column names in the command
>
> # For a space-separated file:
> ExpData <- read_delim("filename.xyz", delim = " ", col_names = TRUE)
> # For a semicolon-separated file:
> ExpData <- read_delim("filename.log", delim = ";",
+                       col_names = c("Name", "Age"))
52 Read data from Excel files: readxl::read_excel
> library(readxl)
> # Read a single worksheet (the first by default, if there are multiple worksheets)
> ExpData <- read_excel("filename.xlsx", col_names = TRUE)
> # Read a specific worksheet from the file, by index
> ExpData <- read_excel("filename.xlsx", 3,
+                       col_names = c("Name", "Age", "RT"))
Attention please! Spaces are bad in filenames, column names and basically any names!
Bad apples: Exp Data.xlsx, Subj ID, bla bla bla etc.
Instead, use: Exp_Data.xlsx, Subj-ID, bla_blabla etc.
53 Read Data : Tidy Example
> BehavDataTidy <-
+   readr::read_delim("allres.txt", delim = " ",
+                     col_names = c("Subj", "Item", "Condition", "WF1",
+                                   "WF2", "RT", "Accuracy", "Resp"))
## Parsed with column specification:
## cols(
##   Subj = col_character(),
##   Item = col_integer(),
##   Condition = col_character(),
##   WF1 = col_character(),
##   WF2 = col_character(),
##   RT = col_double(),
##   Accuracy = col_character(),
##   Resp = col_integer()
## )
54 Read Data : Tidy Example
> BehavDataTidy
## # A tibble: 3,478 x 8
##    Subj Item Condition WF1 WF2 RT Accuracy Resp
##    <chr> <int> <chr> <chr> <chr> <dbl> <chr> <int>
## 1 NF01 16 FOS F OS C 2
## 2 NF01 25 MOS M OS C 2
## 3 NF01 13 FSO F SO C 2
## 4 NF01 12 MSO M SO C 2
## 5 NF01 4 FOS F OS C 2
## 6 NF01 8 FSO F SO C 2
## 7 NF01 6 MOS M OS X 1
## 8 NF01 2 FOS F OS C 2
## 9 NF01 9 MSO M SO C 2
## 10 NF01 28 MOS M OS C 2
## # ... with 3,468 more rows
So what is so tidy about it? Compare with the data frame created earlier!
55 Tidy tibble = enhanced data frame
Most non-tidyverse functions that take a data frame work with tibbles.
For legacy functions that won't work with a tibble: use as.data.frame()
See:
> mean(BehavData$RT)
## [1]
> mean(BehavDataTidy$RT)
## [1]
> aggregate(RT ~ WF2, data = BehavData, FUN = mean)
##   WF2 RT
## 1 OS
## 2 SO
> aggregate(RT ~ WF2, data = BehavDataTidy, FUN = mean)
##   WF2 RT
## 1 OS
## 2 SO
56 Should all data be tidy data? Of course not! Other types of non-tidy data have their uses, too. Not every dataset needs to be wrangled into a tidy dataset! Nevertheless, the tidy format works well for most kinds of rectangular data.
57 Data Wrangling and Transformations
58 Why focus on Data Wrangling?
Some form of data transformation is almost always inevitable prior to analysis.
This is usually the most time-consuming and error-prone part.
The actual statistical analysis is usually only one or two lines of R code.
Most analytical functions work best if the data is in a certain format.
Efficient data wrangling techniques are thus very important.
59 Wide-format to long-format conversion
Use the gather function from the tidyr package of the tidyverse
> library(tidyr)
> gather(WideData, Trial, Measurement, RT1:RT4)
## # A tibble: 16 x 5
##   Participant ExpV YorN Trial Measurement
##   <chr> <int> <chr> <chr> <dbl>
## 1 S001 1 Y RT
## 2 S002 1 N RT
## 3 S003 2 Y RT
## 4 S004 2 Y RT
## 5 S001 1 Y RT
## 6 S002 1 N RT
## 7 S003 2 Y RT
## 8 S004 2 Y RT
## 9 S001 1 Y RT
60 Long-format to wide-format conversion
Use the spread function from the tidyr package of the tidyverse
> spread(LongData, Trial, Measurement)
## # A tibble: 4 x 7
##   Participant ExpV YorN RT1 RT2 RT3 RT4
## * <chr> <int> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 S001 1 Y
## 2 S002 1 N
## 3 S003 2 Y
## 4 S004 2 Y
61 Summarising data: dplyr::summarise, dplyr::count
> WideData <- readxl::read_excel("widedata.xlsx", 3, col_names = TRUE)
> TidyData <- gather(WideData, Trial, Measurement, RT1:RT4)
> OverallMeanRT <- summarise(TidyData, MeanRT = mean(Measurement))
> OverallMeanRT
## # A tibble: 1 x 1
##   MeanRT
##   <dbl>
##
> N_of_Measurements <- count(TidyData, Participant)
> N_of_Measurements
## # A tibble: 4 x 2
##   Participant n
##   <chr> <int>
## 1 S001 4
## 2 S002 4
## 3 S003 4
62 The real power and elegance of tidyverse: pipeable functions
All functions in the tidyverse share a consistent syntax.
Therefore the output of one function can be piped to the next function using magrittr::%>%
Piping avoids having to save temporary intermediate variables.
Piping results in code that is:
simple and more efficient
linear, reflecting each simple step that contributed to the complex analysis
concise and more legible
less error-prone overall
63 Pipe versus no pipe
> # The more common non-pipe method =================================
> SomeData_1 <- f1(SomeData_0, param1, param2)
> SomeData_2 <- f2(SomeData_1, bla1, bla2, bla3)
> SomeData_3 <- f3(SomeData_2, whatever1)
> Result_1 <- f4(SomeData_3, younameit)
> # Another method: nested calls, read from the inside out ==========
> Result_2 <- f4( f3( f2( f1(SomeData_0, param1, param2),
+                 bla1, bla2, bla3), whatever1), younameit)
> # And now with the pipe! ==========================================
> Result_3 <-
+   SomeData_0 %>%
+   f1(param1, param2) %>%
+   f2(bla1, bla2, bla3) %>%
+   f3(whatever1) %>%
+   f4(younameit)
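Since these slides were written, tidyr (from version 1.0.0) has introduced pivot_longer() and pivot_wider() as the recommended successors to gather() and spread(), which remain available. A minimal sketch of the same round trip with the newer functions, on a made-up two-participant table (the values are invented for illustration):

```r
library(tidyr)

# Hypothetical wide-format toy data
wide <- data.frame(
  Participant = c("S001", "S002"),
  RT1 = c(0.51, 0.62),
  RT2 = c(0.48, 0.70),
  stringsAsFactors = FALSE
)

# Wide -> long: counterpart of gather(wide, Trial, Measurement, RT1:RT2)
long <- pivot_longer(wide, cols = RT1:RT2,
                     names_to = "Trial", values_to = "Measurement")
long

# Long -> wide: counterpart of spread(long, Trial, Measurement)
wide_again <- pivot_wider(long, names_from = Trial, values_from = Measurement)
wide_again
```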
64 Pipe : Example
Non-pipe version
> WideData <- readxl::read_excel("widedata.xlsx", 3, col_names = TRUE)
> TidyData <- gather(WideData, Trial, Measurement, RT1:RT4)
> N_of_Measurements <- count(TidyData, Participant)
Pipe version
> readxl::read_excel("widedata.xlsx", 3, col_names = TRUE) %>%
+   gather(Trial, Measurement, RT1:RT4) %>%
+   count(Participant) -> N_of_Measurements
65 Grouping data by factor(s): dplyr::group_by
> readxl::read_excel("widedata.xlsx", 3, col_names = TRUE) %>%
+   gather(Trial, Measurement, RT1:RT4) %>%
+   group_by(Participant) %>%
+   count(MeanRT = mean(Measurement))
## Source: local data frame [4 x 3]
## Groups: Participant [?]
##
##   Participant MeanRT n
##   <chr> <dbl> <int>
## 1 S
## 2 S
## 3 S
## 4 S
66 Renaming a column: dplyr::rename
> readxl::read_excel("widedata.xlsx", 3, col_names = TRUE) %>%
+   gather(Trial, Measurement, RT1:RT4) -> LongData
>
> library(magrittr)
##
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
##
##     set_names
## The following object is masked from 'package:tidyr':
##
##     extract
> LongData %<>% rename(RT = Measurement)
What's that %<>% thing? And where did <- go? Do you see the point?
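A per-group mean together with the number of measurements is nowadays more commonly written with summarise() and n() rather than by passing an expression to count(). A minimal sketch on made-up data (participant IDs and values are invented for illustration):

```r
library(dplyr)

# Hypothetical long-format toy data: two measurements per participant
long <- data.frame(
  Participant = rep(c("S001", "S002"), each = 2),
  Measurement = c(0.5, 0.7, 0.6, 1.0),
  stringsAsFactors = FALSE
)

per_participant <- long %>%
  group_by(Participant) %>%
  summarise(MeanRT = mean(Measurement),   # one mean per participant
            n      = n())                 # number of rows per participant

per_participant
```

group_by() itself changes nothing visible in the data; it only tells the following verbs to operate separately within each group.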
67 Let's take stock a bit
The tidyverse packages share a consistent syntax such that piping is possible.
Piping with %>% feeds the LHS to the RHS.
The RHS generates an output to feed further, assign, print or plot.
Double-piping with %<>% also feeds the LHS to the RHS, but the RHS generates an output and feeds (= assigns) it back to the LHS!
There's more: %T>% and %$%.
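The four magrittr operators mentioned on this slide can be compared on a tiny example; the tee operator is written %T>% in magrittr. This is a sketch with made-up numbers plus the built-in mtcars dataset, not code from the original slides.

```r
library(magrittr)

x <- c(4, 9, 16)

# %>% : feed the LHS into the RHS as its first argument
total <- x %>% sqrt() %>% sum()     # sqrt gives 2 3 4, the sum is 9
total

# %<>% : pipe and assign the result back to the LHS
x %<>% sqrt()
x                                   # 2 3 4

# %T>% : "tee"; run the RHS for its side effect, then pass the LHS on
y <- c(1, 2, 3) %T>% print() %>% sum()
y                                   # 6

# %$% : expose the columns of the LHS by name to the RHS
r <- mtcars %$% cor(mpg, wt)
r
```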
68 ... and import a new dataset to work further
> IntData <- readxl::read_excel("intensity-data.xlsx", col_names = TRUE)
> IntData
## # A tibble: 420 x 6
##    Participant Note NoteType Time Intensity OnsetInterval
##    <chr> <dbl> <chr> <dbl> <dbl> <chr>
## 1 S01 1 NA NA
## 2 S01 2 Note_M
## 3 S01 3 Note_S
## 4 S01 4 Note_S
## 5 S01 5 Note_S
## 6 S01 6 Note_S
## 7 S01 7 Note_M
## 8 S01 8 Note_M
## 9 S01 9 Note_M
## 10 S01 10 Note_M
## # ... with 410 more rows
69 Extract columns by name: dplyr::select
> IntData %>% select(Participant, NoteType, Time, Intensity)
## # A tibble: 420 x 4
##    Participant NoteType Time Intensity
##    <chr> <chr> <dbl> <dbl>
## 1 S01 NA
## 2 S01 Note_M
## 3 S01 Note_S
## 4 S01 Note_S
## 5 S01 Note_S
## 6 S01 Note_S
## 7 S01 Note_M
## 8 S01 Note_M
## 9 S01 Note_M
## 10 S01 Note_M
## # ... with 410 more rows
70 Extract rows that meet certain criteria: dplyr::filter
> IntData %>% filter(OnsetInterval > 0.75 & OnsetInterval < 0.85)
## # A tibble: 5 x 6
##   Participant Note NoteType Time Intensity OnsetInterval
##   <chr> <dbl> <chr> <dbl> <dbl> <chr>
## 1 S01 16 Note_L
## 2 S07 16 Note_L
## 3 S08 16 Note_L
## 4 S12 16 Note_L
## 5 S14 16 Note_L
Notice the use of the single & : this is the vectorised AND operator.
Unlike the scalar AND &&, this applies to all the elements of a column!
There's of course also the vectorised OR |, as opposed to the scalar OR ||.
71 Compute a new column: dplyr::mutate
> IntData %>%
+   select(Participant, Intensity) %>%
+   mutate(SNo = row_number(),
+          GoodBad = if_else(Intensity >= 120, "Good", "Bad"))
## # A tibble: 420 x 4
##   Participant Intensity SNo GoodBad
##   <chr> <dbl> <int> <chr>
## 1 S Good
## 2 S Bad
## 3 S Bad
## 4 S Good
## 5 S Good
## 6 S Good
## 7 S Good
## 8 S Good
## 9 S Good
72 Compute a new column, drop others: dplyr::transmute
> IntData %>%
+   select(Participant) %>%
+   distinct() %>%   # Get rid of duplicate rows
+   transmute(Subject = Participant,
+             NewSubjID = paste("Drummer", row_number() + 100, sep=""))
## # A tibble: 14 x 2
##   Subject NewSubjID
##   <chr> <chr>
## 1 S01 Drummer101
## 2 S02 Drummer102
## 3 S03 Drummer103
## 4 S04 Drummer104
## 5 S05 Drummer105
## 6 S06 Drummer106
## 7 S07 Drummer107
## 8 S08 Drummer108
73 Exercise 1
Add a new column with the mean of the OnsetInterval.
This mean should be on a per Participant and per NoteType basis!
Before attempting to do this, see if mean(IntData$OnsetInterval) works.
Have a very careful look at the output of typing IntData.
Do you see a / the problem?
74 Solution : Know your data well
OnsetInterval contains the string NA in some cases.
So read_excel assumed that this column is made up of strings!
Type readxl::read_excel("intensity-data.xlsx", col_names = TRUE)
Study what you see on the console.
Now type help(read_excel) to see what could be done.
75 Exercise 1 : Solution
> IntData <- readxl::read_excel("intensity-data.xlsx", col_names = TRUE,
+                               na = "NA")   # <NA> is "NA" in the input vector!
> IntData %>% select(-Time, -Intensity) %>%   # - => drop these vectors
+   group_by(Participant, NoteType) %>%
+   mutate(OIMean = mean(OnsetInterval))
## Source: local data frame [420 x 5]
## Groups: Participant, NoteType [56]
##
##   Participant Note NoteType OnsetInterval OIMean
##   <chr> <dbl> <chr> <dbl> <dbl>
## 1 S01 1 <NA> NA NA
## 2 S01 2 Note_M
## 3 S01 3 Note_S
## 4 S01 4 Note_S
## 5 S01 5 Note_S
## 6 S01 6 Note_S
76 Exercise 2
Add a new column with the name AdjustedTime.
This should be the Time for the current Note minus the Time for Note 1.
This value should be on a per Participant basis!
Do you need something specific to solve this?
77 Solution : Extract first value by position: dplyr::first
> IntData %>%
+   group_by(Participant) %>%
+   mutate(TimeBegin = first(Time)) %>%
+   select(Participant, Time, TimeBegin)
## Source: local data frame [420 x 3]
## Groups: Participant [14]
##
##   Participant Time TimeBegin
##   <chr> <dbl> <dbl>
## 1 S
## 2 S
## 3 S
## 4 S
## 5 S
## 6 S
## 7 S
78 Exercise 2 : Solution
> IntData %>%
+   group_by(Participant) %>%
+   mutate(TimeBegin = first(Time)) %>%
+   select(Participant, Time, TimeBegin) %>%
+   mutate(AdjustedTime = Time - TimeBegin)
## Source: local data frame [420 x 4]
## Groups: Participant [14]
##
##   Participant Time TimeBegin AdjustedTime
##   <chr> <dbl> <dbl> <dbl>
## 1 S
## 2 S
## 3 S
## 4 S
## 5 S
## 6 S
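The underlying issue, the text "NA" versus a real missing value, can be reproduced in base R without the Excel file. The sketch below uses three made-up onset intervals; it also shows why a group that contains a missing value gets NA as its mean unless na.rm = TRUE is passed.

```r
# Hypothetical values as they would arrive if the column were read as text
onset_chr <- c("0.50", "NA", "0.70")
typeof(onset_chr)                      # "character"

# mean() cannot work on character data: it returns NA with a warning
suppressWarnings(mean(onset_chr))

# After conversion, the text "NA" becomes a real missing value
onset_num <- as.numeric(onset_chr)
is.na(onset_num)                       # FALSE TRUE FALSE

mean_with_na    <- mean(onset_num)                 # NA: missing values propagate
mean_without_na <- mean(onset_num, na.rm = TRUE)   # 0.6
mean_with_na
mean_without_na
```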
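The same idea can be checked on a toy table where the expected result is easy to verify by eye. The data below are invented for illustration, and the intermediate TimeBegin column is folded into a single mutate() step; this is one possible variant, not the only way to write it.

```r
library(dplyr)

# Hypothetical toy data: two participants with a few notes each
toy <- data.frame(
  Participant = c("S01", "S01", "S01", "S02", "S02"),
  Note        = c(1, 2, 3, 1, 2),
  Time        = c(10.0, 10.5, 11.2, 3.0, 3.4),
  stringsAsFactors = FALSE
)

adjusted <- toy %>%
  group_by(Participant) %>%
  mutate(AdjustedTime = Time - first(Time)) %>%   # first() is evaluated per group
  ungroup()

adjusted
```

Calling ungroup() at the end avoids surprises when the result is processed further, because later verbs would otherwise still operate per participant.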
79 ..
80 Some resources
R Cheatsheets
R for Data Science
Cookbook for R
Graphs with ggplot2
Tidy Text Mining
Quick-R
Advanced R
81 Thanks!
> Thanks <- "Thanks for your attention!"
> Thanks
## [1] "Thanks for your attention!"
> # Command to quit from R Console
> q()
More informationFinancial Econometrics Practical
Financial Econometrics Practical Practical 3: Plotting in R NF Katzke Table of Contents 1 Introduction 1 1.0.1 Install ggplot2................................................. 2 1.1 Get data Tidy.....................................................
More informationTidy Evaluation. Lionel Henry and Hadley Wickham RStudio
Tidy Evaluation Lionel Henry and Hadley Wickham RStudio Tidy evaluation Our vision for dealing with a special class of R functions Usually called NSE but we prefer quoting functions Most interesting language
More informationEXST 7014, Lab 1: Review of R Programming Basics and Simple Linear Regression
EXST 7014, Lab 1: Review of R Programming Basics and Simple Linear Regression OBJECTIVES 1. Prepare a scatter plot of the dependent variable on the independent variable 2. Do a simple linear regression
More informationSTATS PAD USER MANUAL
STATS PAD USER MANUAL For Version 2.0 Manual Version 2.0 1 Table of Contents Basic Navigation! 3 Settings! 7 Entering Data! 7 Sharing Data! 8 Managing Files! 10 Running Tests! 11 Interpreting Output! 11
More informationRecap From Last Time: Today s Learning Goals BIMM 143. Data analysis with R Lecture 4. Barry Grant.
BIMM 143 Data analysis with R Lecture 4 Barry Grant http://thegrantlab.org/bimm143 Recap From Last Time: Substitution matrices: Where our alignment match and mis-match scores typically come from Comparing
More informationR Basics / Course Business
R Basics / Course Business We ll be using a sample dataset in class today: CourseWeb: Course Documents " Sample Data " Week 2 Can download to your computer before class CourseWeb survey on research/stats
More informationIntroduction to R and the tidyverse
Introduction to R and the tidyverse Paolo Crosetto Paolo Crosetto Introduction to R and the tidyverse 1 / 58 Lecture 3: merging & tidying data Paolo Crosetto Introduction to R and the tidyverse 2 / 58
More informationLAB #2: SAMPLING, SAMPLING DISTRIBUTIONS, AND THE CLT
NAVAL POSTGRADUATE SCHOOL LAB #2: SAMPLING, SAMPLING DISTRIBUTIONS, AND THE CLT Statistics (OA3102) Lab #2: Sampling, Sampling Distributions, and the Central Limit Theorem Goal: Use R to demonstrate sampling
More informationIntroduction to R Jason Huff, QB3 CGRL UC Berkeley April 15, 2016
Introduction to R Jason Huff, QB3 CGRL UC Berkeley April 15, 2016 Installing R R is constantly updated and you should download a recent version; the version when this workshop was written was 3.2.4 I also
More information1 Introduction to Matlab
1 Introduction to Matlab 1. What is Matlab? Matlab is a computer program designed to do mathematics. You might think of it as a super-calculator. That is, once Matlab has been started, you can enter computations,
More informationIntroduction to R. Andy Grogan-Kaylor October 22, Contents
Introduction to R Andy Grogan-Kaylor October 22, 2018 Contents 1 Background 2 2 Introduction 2 3 Base R and Libraries 3 4 Working Directory 3 5 Writing R Code or Script 4 6 Graphical User Interface 4 7
More informationCS1114: Matlab Introduction
CS1114: Matlab Introduction 1 Introduction The purpose of this introduction is to provide you a brief introduction to the features of Matlab that will be most relevant to your work in this course. Even
More informationIntroduction to Statistics using R/Rstudio
Introduction to Statistics using R/Rstudio R and Rstudio Getting Started Assume that R for Windows and Macs already installed on your laptop. (Instructions for installations sent) R on Windows R on MACs
More informationSession 26 TS, Predictive Analytics: Moving Out of Square One. Moderator: Jean-Marc Fix, FSA, MAAA
Session 26 TS, Predictive Analytics: Moving Out of Square One Moderator: Jean-Marc Fix, FSA, MAAA Presenters: Jean-Marc Fix, FSA, MAAA Jeffery Robert Huddleston, ASA, CERA, MAAA Predictive Modeling: Getting
More informationCITS2401 Computer Analysis & Visualisation
FACULTY OF ENGINEERING, COMPUTING AND MATHEMATICS CITS2401 Computer Analysis & Visualisation SCHOOL OF COMPUTER SCIENCE AND SOFTWARE ENGINEERING Topic 3 Introduction to Matlab Material from MATLAB for
More informationIncident Response Programming with R. Eric Zielinski Sr. Consultant, Nationwide
Incident Response Programming with R Eric Zielinski Sr. Consultant, Nationwide About Me? Cyber Defender for Nationwide Over 15 years in Information Security Speaker at various conferences FIRST, CEIC,
More informationLab 1: Getting started with R and RStudio Questions? or
Lab 1: Getting started with R and RStudio Questions? david.montwe@ualberta.ca or isaacren@ualberta.ca 1. Installing R and RStudio To install R, go to https://cran.r-project.org/ and click on the Download
More informationsocial data science Introduction to R Sebastian Barfort August 07, 2016 University of Copenhagen Department of Economics 1/40
social data science Introduction to R Sebastian Barfort August 07, 2016 University of Copenhagen Department of Economics 1/40 welcome Course Description The objective of this course is to learn how to
More informationData Input/Output. Andrew Jaffe. January 4, 2016
Data Input/Output Andrew Jaffe January 4, 2016 Before we get Started: Working Directories R looks for files on your computer relative to the working directory It s always safer to set the working directory
More informationExercise 1-Solutions TMA4255 Applied Statistics
Exercise 1-Solutions TMA4255 Applied Statistics January 16, 2017 Intro 0.1 Start MINITAB Start MINITAB on your laptop, or remote desktop to cauchy.math.ntnu.no and log in with win-ntnu-no\yourusername
More informationPython for Data Analysis. Prof.Sushila Aghav-Palwe Assistant Professor MIT
Python for Data Analysis Prof.Sushila Aghav-Palwe Assistant Professor MIT Four steps to apply data analytics: 1. Define your Objective What are you trying to achieve? What could the result look like? 2.
More informationLecture 12: Data carpentry with tidyverse
http://127.0.0.1:8000/.html Lecture 12: Data carpentry with tidyverse STAT598z: Intro. to computing for statistics Vinayak Rao Department of Statistics, Purdue University options(repr.plot.width=5, repr.plot.height=3)
More informationIntroduction to R: Using R for Statistics and Data Analysis. BaRC Hot Topics
Introduction to R: Using R for Statistics and Data Analysis BaRC Hot Topics http://barc.wi.mit.edu/hot_topics/ Why use R? Perform inferential statistics (e.g., use a statistical test to calculate a p-value)
More informationLecture 5. Essential skills for bioinformatics: Unix/Linux
Lecture 5 Essential skills for bioinformatics: Unix/Linux UNIX DATA TOOLS Text processing with awk We have illustrated two ways awk can come in handy: Filtering data using rules that can combine regular
More informationLecture 1: Getting Started and Data Basics
Lecture 1: Getting Started and Data Basics The first lecture is intended to provide you the basics for running R. Outline: 1. An Introductory R Session 2. R as a Calculator 3. Import, export and manipulate
More informationЛекция 4 Трансформация данных в R
Анализ данных Лекция 4 Трансформация данных в R Гедранович Ольга Брониславовна, старший преподаватель кафедры ИТ, МИУ volha.b.k@gmail.com 2 Вопросы лекции Фильтрация (filter) Сортировка (arrange) Выборка
More informationStatistics for Biologists: Practicals
Statistics for Biologists: Practicals Peter Stoll University of Basel HS 2012 Peter Stoll (University of Basel) Statistics for Biologists: Practicals HS 2012 1 / 22 Outline Getting started Essentials of
More informationThe Average and SD in R
The Average and SD in R The Basics: mean() and sd() Calculating an average and standard deviation in R is straightforward. The mean() function calculates the average and the sd() function calculates the
More informationUAccess ANALYTICS Next Steps: Working with Bins, Groups, and Calculated Items: Combining Data Your Way
UAccess ANALYTICS Next Steps: Working with Bins, Groups, and Calculated Items: Arizona Board of Regents, 2014 THE UNIVERSITY OF ARIZONA created 02.07.2014 v.1.00 For information and permission to use our
More informationSESSION 9: Data Entry
Data Entry 74 SESSION 9: Data Entry 9.1 Introduction and general principles for entering data using Excel Excel is a powerful tool to extract meaningful information and insights from the data you have
More informationSECTION 1: INTRODUCTION. ENGR 112 Introduction to Engineering Computing
SECTION 1: INTRODUCTION ENGR 112 Introduction to Engineering Computing 2 Course Overview What is Programming? 3 Programming The implementation of algorithms in a particular computer programming language
More informationTUTORIAL. HCS- Tools + Scripting Integrations
TUTORIAL HCS- Tools + Scripting Integrations HCS- Tools... 3 Setup... 3 Task and Data... 4 1) Data Input Opera Reader... 7 2) Meta data integration Expand barcode... 8 3) Meta data integration Join Layout...
More informationGetting Started. Slides R-Intro: R-Analytics: R-HPC:
Getting Started Download and install R + Rstudio http://www.r-project.org/ https://www.rstudio.com/products/rstudio/download2/ TACC ssh username@wrangler.tacc.utexas.edu % module load Rstats %R Slides
More informationPackage infer. July 11, Type Package Title Tidy Statistical Inference Version 0.3.0
Type Package Title Tidy Statistical Inference Version 0.3.0 Package infer July 11, 2018 The objective of this package is to perform inference using an epressive statistical grammar that coheres with the
More informationTabular data management. Jennifer Bryan RStudio, University of British Columbia
Tabular data management Jennifer Bryan RStudio, University of British Columbia @JennyBryan @jennybc data cleaning data wrangling descriptive stats inferential stats reporting data cleaning data wrangling
More informationGetting started with ggplot2
Getting started with ggplot2 STAT 133 Gaston Sanchez Department of Statistics, UC Berkeley gastonsanchez.com github.com/gastonstat/stat133 Course web: gastonsanchez.com/stat133 ggplot2 2 Resources for
More informationSTAT 113: R/RStudio Intro
STAT 113: R/RStudio Intro Colin Reimer Dawson Last Revised September 1, 2017 1 Starting R/RStudio There are two ways you can run the software we will be using for labs, R and RStudio. Option 1 is to log
More informationLastly, in case you don t already know this, and don t have Excel on your computers, you can get it for free through IT s website under software.
Welcome to Basic Excel, presented by STEM Gateway as part of the Essential Academic Skills Enhancement, or EASE, workshop series. Before we begin, I want to make sure we are clear that this is by no means
More informationAssignment 0. Nothing here to hand in
Assignment 0 Nothing here to hand in The questions here have solutions attached. Follow the solutions to see what to do, if you cannot otherwise guess. Though there is nothing here to hand in, it is very
More informationIntroduction to R. Introduction to Econometrics W
Introduction to R Introduction to Econometrics W3412 Begin Download R from the Comprehensive R Archive Network (CRAN) by choosing a location close to you. Students are also recommended to download RStudio,
More informationECO375 Tutorial 1 Introduction to Stata
ECO375 Tutorial 1 Introduction to Stata Matt Tudball University of Toronto Mississauga September 14, 2017 Matt Tudball (University of Toronto) ECO375H5 September 14, 2017 1 / 25 What Is Stata? Stata is
More informationUNIT 4. Research Methods in Business
UNIT 4 Preparing Data for Analysis:- After data are obtained through questionnaires, interviews, observation or through secondary sources, they need to be edited. The blank responses, if any have to be
More informationSpectroscopic Analysis: Peak Detector
Electronics and Instrumentation Laboratory Sacramento State Physics Department Spectroscopic Analysis: Peak Detector Purpose: The purpose of this experiment is a common sort of experiment in spectroscopy.
More information1 Introduction to Using Excel Spreadsheets
Survey of Math: Excel Spreadsheet Guide (for Excel 2007) Page 1 of 6 1 Introduction to Using Excel Spreadsheets This section of the guide is based on the file (a faux grade sheet created for messing with)
More informationAn Introduction to MATLAB
An Introduction to MATLAB Day 1 Simon Mitchell Simon.Mitchell@ucla.edu High level language Programing language and development environment Built-in development tools Numerical manipulation Plotting of
More informationSTATA 13 INTRODUCTION
STATA 13 INTRODUCTION Catherine McGowan & Elaine Williamson LONDON SCHOOL OF HYGIENE & TROPICAL MEDICINE DECEMBER 2013 0 CONTENTS INTRODUCTION... 1 Versions of STATA... 1 OPENING STATA... 1 THE STATA
More informationPython Programming Exercises 1
Python Programming Exercises 1 Notes: throughout these exercises >>> preceeds code that should be typed directly into the Python interpreter. To get the most out of these exercises, don t just follow them
More informationAN INTRODUCTION TO R FOR MANAGEMENT SCHOLARS
AN INTRODUCTION TO R FOR MANAGEMENT SCHOLARS 24 January 2017 Stefan Breet breet@rsm.nl www.stefanbreet.com TODAY What is R? How to use R? The Basics How to use R? The Data Analysis Process WHAT IS R? AN
More informationData Science and Machine Learning Essentials
Data Science and Machine Learning Essentials Lab 3A Visualizing Data By Stephen Elston and Graeme Malcolm Overview In this lab, you will learn how to use R or Python to visualize data. If you intend to
More informationJME Language Reference Manual
JME Language Reference Manual 1 Introduction JME (pronounced jay+me) is a lightweight language that allows programmers to easily perform statistic computations on tabular data as part of data analysis.
More informationModeling in the Tidyverse. Max Kuhn (RStudio)
Modeling in the Tidyverse Max Kuhn (RStudio) Modeling in R R has always had a rich set of modeling tools that it inherited from S. For example, the formula interface has made it simple to specify potentially
More informationIntroduction to R Programming
Course Overview Over the past few years, R has been steadily gaining popularity with business analysts, statisticians and data scientists as a tool of choice for conducting statistical analysis of data
More informationLecture 4 CSE July 1992
Lecture 4 CSE 110 6 July 1992 1 More Operators C has many operators. Some of them, like +, are binary, which means that they require two operands, as in 4 + 5. Others are unary, which means they require
More informationExtremely short introduction to R Jean-Yves Sgro Feb 20, 2018
Extremely short introduction to R Jean-Yves Sgro Feb 20, 2018 Contents 1 Suggested ahead activities 1 2 Introduction to R 2 2.1 Learning Objectives......................................... 2 3 Starting
More informationCSSS 512: Lab 1. Logistics & R Refresher
CSSS 512: Lab 1 Logistics & R Refresher 2018-3-30 Agenda 1. Logistics Labs, Office Hours, Homeworks Goals and Expectations R, R Studio, R Markdown, L ATEX 2. Time Series Data in R Unemployment in Maine
More informationInstall RStudio from - use the standard installation.
Session 1: Reading in Data Before you begin: Install RStudio from http://www.rstudio.com/ide/download/ - use the standard installation. Go to the course website; http://faculty.washington.edu/kenrice/rintro/
More informationComputer lab 2 Course: Introduction to R for Biologists
Computer lab 2 Course: Introduction to R for Biologists April 23, 2012 1 Scripting As you have seen, you often want to run a sequence of commands several times, perhaps with small changes. An efficient
More informationSource df SS MS F A a-1 [A] [T] SS A. / MS S/A S/A (a)(n-1) [AS] [A] SS S/A. / MS BxS/A A x B (a-1)(b-1) [AB] [A] [B] + [T] SS AxB
Keppel, G. Design and Analysis: Chapter 17: The Mixed Two-Factor Within-Subjects Design: The Overall Analysis and the Analysis of Main Effects and Simple Effects Keppel describes an Ax(BxS) design, which
More informationData types and structures
An introduc+on to Data types and structures Noémie Becker & Benedikt Holtmann Winter Semester 16/17 Course outline Day 3 Review GeFng started with R Crea+ng Objects Data types in R Data structures in R
More informationTOPIC 2 INTRODUCTION TO JAVA AND DR JAVA
1 TOPIC 2 INTRODUCTION TO JAVA AND DR JAVA Notes adapted from Introduction to Computing and Programming with Java: A Multimedia Approach by M. Guzdial and B. Ericson, and instructor materials prepared
More informationAn Introduction to Stata
An Introduction to Stata Instructions Statistics 111 - Probability and Statistical Inference Jul 3, 2013 Lab Objective To become familiar with the software package Stata. Lab Procedures Stata gives us
More informationStat 302 Statistical Software and Its Applications SAS: Data I/O
Stat 302 Statistical Software and Its Applications SAS: Data I/O Yen-Chi Chen Department of Statistics, University of Washington Autumn 2016 1 / 33 Getting Data Files Get the following data sets from the
More informationAn Introduction to MATLAB See Chapter 1 of Gilat
1 An Introduction to MATLAB See Chapter 1 of Gilat Kipp Martin University of Chicago Booth School of Business January 25, 2012 Outline The MATLAB IDE MATLAB is an acronym for Matrix Laboratory. It was
More informationUsing Excel for Graphical Analysis of Data
Using Excel for Graphical Analysis of Data Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable physical parameters. Graphs are
More informationMinitab Study Card J ENNIFER L EWIS P RIESTLEY, PH.D.
Minitab Study Card J ENNIFER L EWIS P RIESTLEY, PH.D. Introduction to Minitab The interface for Minitab is very user-friendly, with a spreadsheet orientation. When you first launch Minitab, you will see
More informationLecture 09. Graphics::ggplot I R Teaching Team. October 1, 2018
Lecture 09 Graphics::ggplot I 2018 R Teaching Team October 1, 2018 Acknowledgements 1. Mike Fliss & Sara Levintow! 2. stackoverflow (particularly user David for lecture styling - link) 3. R Markdown: The
More informationInstructions on Adding Zeros to the Comtrade Data
Instructions on Adding Zeros to the Comtrade Data Required: An excel spreadshheet with the commodity codes for all products you want included. In this exercise we will want all 4-digit SITC Revision 2
More informationSTENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, Steno Diabetes Center June 11, 2015
STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, tsvv@steno.dk, Steno Diabetes Center June 11, 2015 Contents 1 Introduction 1 2 Recap: Variables 2 3 Data Containers 2 3.1 Vectors................................................
More informationIf you re using a Mac, follow these commands to prepare your computer to run these demos (and any other analysis you conduct with the Audio BNC
If you re using a Mac, follow these commands to prepare your computer to run these demos (and any other analysis you conduct with the Audio BNC sample). All examples use your Workshop directory (e.g. /Users/peggy/workshop)
More informationPackage furniture. November 10, 2017
Package furniture November 10, 2017 Type Package Title Furniture for Quantitative Scientists Version 1.7.2 Date 2017-10-16 Maintainer Tyson S. Barrett Contains three main
More informationA QUICK INTRODUCTION TO MATLAB
A QUICK INTRODUCTION TO MATLAB Very brief intro to matlab Basic operations and a few illustrations This set is independent from rest of the class notes. Matlab will be covered in recitations and occasionally
More informationSurvey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9
Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Contents 1 Introduction to Using Excel Spreadsheets 2 1.1 A Serious Note About Data Security.................................... 2 1.2
More informationWelcome to Workshop: Introduction to R, Rstudio, and Data
Welcome to Workshop: Introduction to R, Rstudio, and Data I. Please sign in on the sign in sheet (this is so I can follow up to get feedback). II. If you haven t already, download R and Rstudio, install
More information