Session 3 Nick Hathaway;
nicholas.hathaway@umassmed.edu

Contents

Manipulating Data frames and matrices
    Converting to long vs wide formats
    Manipulating data in table
    Piping
    Summarizing data
    Filtering data
Part 1. Exercises
Plotting
    ggplot2 Basics
    Modifying colors
    Changing point shape and line types
    Changing plot aspect not dependent on input data
    Switching to another layer type
    Controlling plotting order using factors
    Saving plots
Part 2. Exercises

Manipulating Data frames and matrices

The readr package reads data in as what is called a tibble, which is different from the default R data.frame. The tibble class was designed to be more efficient and more user friendly than data.frame, but one major difference that trips up most people used to data.frame is that the tibble class doesn't allow rownames. This doesn't make a big difference for most uses of the data.frame class, but there are instances, such as working with the matrix class, when you need rownames. Below is how you would read in data that has rownames, convert it to a matrix, and add the rownames.

    library(tidyverse)
    ts = read_tsv(".series.data.txt")
    # every column after the first holds numeric values
    ts_mat = as.matrix(ts[, 2:ncol(ts)])
    # the first column (named X1 by read_tsv) holds the row names
    rownames(ts_mat) = ts$X1
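The same conversion can be tried without the data file. The sketch below is only an illustration on a made-up table (the gene names and values are hypothetical); it uses tibble's column_to_rownames() as an alternative to assigning rownames() by hand.

```r
library(tibble)

# toy stand-in for the expression table; names and values are made up
toy = tibble(gene = c("geneA", "geneB", "geneC"),
             Ctrl_0h = c(1.2, 3.4, 5.6),
             Lps_1h = c(2.1, 4.3, 6.5))

# move the gene column into the rownames, then convert to a numeric matrix
toy_mat = as.matrix(column_to_rownames(toy, var = "gene"))

# rows can now be looked up by gene name
toy_mat["geneB", "Lps_1h"]
```

Either route ends with a numeric matrix whose rows are labeled by gene, which is what matrix-based functions generally expect.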
Converting to long vs wide formats

Tidyr

The tidyr package is about making your data.frames tidy. Now what is meant by tidy? There are two common ways to organize data tables. One is referred to as wide format, where each cell is a different observation and the row and column names explain what those observations are. The other is called long format, where every column is a different variable and every row is a different observation; this long format is the one R handles best. tidyr is all about switching between the two formats.

gather

gather() will take a table in wide format and change it into long format. It takes four important arguments: 1) the data.frame to work on, 2) the name of a new column that will contain the old column names, 3) the name of a new column that will contain the observations that were spread out across the columns, and 4) the indexes of the columns to gather together.

    ts = read_tsv(".series.data.txt")
    # rename first column
    colnames(ts)[1] = "gene"
    # or the rename function can also be used
    ts = read_tsv(".series.data.txt")
    ts = rename(ts, gene = X1)
    ts

Printing ts shows a tibble of 25,807 rows and 20 columns: gene (<chr>) followed by 19 <dbl> condition columns, Ctrl_0h, Lps_1h, Lps_2h, Lps_4h, Lps_6h, Lps_12h, Lps_24h, R848_1h, R848_2h, R848_4h, R848_6h, R848_12h, R848_24h, Ifnb_1h, Ifnb_2h, Ifnb_4h, Ifnb_6h, Ifnb_12h and Ifnb_24h.

    # the name of the value column did not survive in this transcript;
    # `expression` is assumed here
    ts_gat = gather(ts, Condition, expression, 2:ncol(ts))
    ts_gat

ts_gat is a tibble of 490,333 rows and 3 columns: gene (<chr>), Condition (<chr>) and expression (<dbl>).
Figure 1:
The first ten rows printed are the Ctrl_0h values for A1BG, A1BG-AS1, A1CF, A2M, A2M-AS1, A2ML1, A2MP1, A3GALT2, A4GALT and A4GNT, followed by 490,323 more rows.

Figure 2:

spread

The opposite of the gather() function is the spread() function, which can be used to undo the gather().

    ts_gat_sp = spread(ts_gat, Condition, expression)
    ts_gat_sp

The result is again a tibble of 25,807 rows and 20 columns, starting with gene, Ctrl_0h, Ifnb_12h, Ifnb_1h, Ifnb_24h, Ifnb_2h and Ifnb_4h.
Figure 3:
The remaining 13 columns are Ifnb_6h, Lps_12h, Lps_1h, Lps_24h, Lps_2h, Lps_4h, Lps_6h, R848_12h, R848_1h, R848_24h, R848_2h, R848_4h and R848_6h. Note that the condition columns now come back in alphabetical order.

separate

tidyr also has functions for splitting one column into multiple columns, such as the separate function.

    # the name of the second new column did not survive in this transcript;
    # `time` is assumed here
    ts_gat = separate(ts_gat, Condition, c("exposure", "time"))
    ts_gat

ts_gat now has 490,333 rows and 4 columns: gene (<chr>), exposure (<chr>), time (<chr>) and expression (<dbl>). A Condition value such as Ctrl_0h has been split into exposure Ctrl and time 0h.

unite

The opposite of the separate function is the unite function.

    ts_gat = unite(ts_gat, Condition, exposure, time)
    ts_gat

ts_gat is back to 490,333 rows and 3 columns (gene, Condition and expression), with Condition values such as Ctrl_0h.
Manipulating data in table

The dplyr library can be used to manipulate the data within the tables themselves, while tidyr is more for reorganization.

Mutating column types

The mutate function can be used to either create new columns or change existing columns. Here I also take advantage of the gsub function, which takes three arguments: 1) a pattern to replace, 2) what to replace the pattern with, and 3) what to do the replacement on.

    library(dplyr)
    # ts_gat needs the separate exposure and time columns here
    # (re-run separate() first if you ran unite() above)
    # the column name `time` is assumed; it did not survive in this transcript
    ts_gat = mutate(ts_gat, time = as.numeric(gsub("h", "", time)))

Piping

It is common practice with tidyverse functions to use something called piping: the result of one function call is used as the input to the next function without being saved in an intermediate variable. This keeps the code compact and avoids filling the workspace with temporary objects. Piping is done with the %>% operator, whose RStudio keyboard shortcut is command+shift+m (control+shift+m on Windows or Ubuntu). The pipe operator takes what is given to it and places that as the first argument of the next function, e.g. mean(x) is equivalent to x %>% mean(). Below is a diagram.

    x = 1:10
    mean(x)
    [1] 5.5
    x %>% mean()
    [1] 5.5

    library(tidyverse) # this will load readr, dplyr, and tidyr
    ts_longformat = read_tsv(".series.data.txt")
    ts_longformat = rename(ts_longformat, gene = X1)
    ts_longformat = gather(ts_longformat, Condition, expression, 2:ncol(ts_longformat))
    ts_longformat = separate(ts_longformat, Condition, c("exposure", "time"))
    ts_longformat = mutate(ts_longformat, time = as.numeric(gsub("h", "", time)))
    ts_longformat

This gives a tibble of 490,333 rows and 4 columns: gene (<chr>), exposure (<chr>), time (<dbl>) and expression (<dbl>).
The step-by-step version above is equivalent to the piped version below.

    library(tidyverse) # this will load readr, dplyr, and tidyr
    ts_longformat = read_tsv(".series.data.txt") %>%
      rename(gene = X1) %>%
      gather(Condition, expression, 2:ncol(.)) %>%
      separate(Condition, c("exposure", "time")) %>%
      mutate(time = as.numeric(gsub("h", "", time)))
    ts_longformat

The result is the same tibble of 490,333 rows and 4 columns (gene, exposure, time and expression).

Figure 4: The relationship between the above pipe commands and the commands executed one by one.

Summarizing data

dplyr also offers ways to quickly group and then summarize your data using the group_by and summarize functions.
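Before applying this to the full table, the grouping idea can be seen on a toy table. This is only a sketch: the table, its values and the column name expression are made up for illustration.

```r
library(dplyr)

# toy long-format table; values are made up
toy = data.frame(
  exposure = c("Ctrl", "Ctrl", "Lps", "Lps", "Lps"),
  expression = c(2, 4, 10, 20, 30)
)

# one output row per exposure group, one column per summary
toy_summary = toy %>%
  group_by(exposure) %>%
  summarise(n = n(),
            mean_expression = mean(expression))
toy_summary
```

Each group collapses to a single row, so the result has as many rows as there are distinct exposure values.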
    ts_longformat_exposure_summary = ts_longformat %>%
      group_by(exposure) %>%
      summarise(mean_expression = mean(expression),
                median_expression = median(expression),
                max_expression = max(expression),
                min_expression = min(expression),
                sd_expression = sd(expression))
    ts_longformat_exposure_summary

This returns a 4 x 6 tibble with one row per exposure (Ctrl, Ifnb, Lps and R848) and one column per summary statistic.

    ts_longformat_exposure_time_summary = ts_longformat %>%
      group_by(exposure, time) %>%
      summarise(mean_expression = mean(expression),
                median_expression = median(expression),
                max_expression = max(expression),
                min_expression = min(expression),
                sd_expression = sd(expression))
    ts_longformat_exposure_time_summary

This returns a 19 x 7 tibble, still grouped by exposure, with one row per combination of exposure and time.
Filtering data

    ts_longformat_crt = ts_longformat %>%
      filter(exposure == "Ctrl")
    ts_longformat_crt

This keeps only the Ctrl rows, a tibble of 25,807 rows and 4 columns.

NAs in values

Taking means, mins, maxes, etc. can be affected if NA values are present.

    vals = c(10, 3, 5, 9, 19, 23)
    mean(vals)
    [1] 11.5
    min(vals)
    [1] 3
    max(vals)
    [1] 23

    vals = c(10, 3, 5, 9, 19, 23, NA)
    mean(vals)
    [1] NA
    min(vals)
    [1] NA
    max(vals)
    [1] NA

You can handle this by setting na.rm = TRUE.

    vals = c(10, 3, 5, 9, 19, 23, NA)
    mean(vals, na.rm = TRUE)
    [1] 11.5
    min(vals, na.rm = TRUE)
    [1] 3
    max(vals, na.rm = TRUE)
    [1] 23

You can also just get rid of the NA values; within a data.frame you can use filter() to do so.

    # use na.rm = TRUE
    ts_longformat_exposure_time_summary = ts_longformat %>%
      group_by(exposure, time) %>%
      summarise(mean_expression = mean(expression, na.rm = TRUE),
                median_expression = median(expression, na.rm = TRUE),
                max_expression = max(expression, na.rm = TRUE),
                min_expression = min(expression, na.rm = TRUE),
                sd_expression = sd(expression, na.rm = TRUE))

    # use filter to keep only values that aren't (!) NA (is.na)
    ts_longformat_exposure_time_summary = ts_longformat %>%
      filter(!is.na(expression)) %>%
      group_by(exposure, time) %>%
      summarise(mean_expression = mean(expression),
                median_expression = median(expression),
                max_expression = max(expression),
                min_expression = min(expression),
                sd_expression = sd(expression))

dplyr offers a large array of functions for manipulating data frames. Here is a list of resources for more options:
cheatsheet, a few basics, tutorial, a webinar

Part 1. Exercises

Download Average Temperatures USA

1. convert to long format by using gather() on the temperature columns
2. separate columns so you have a column for year and month
3. create a table of mean temperatures for month, city, and months for each city (various group_by calls)
4. filter table to just one city or just one month

Plotting

Base R offers several basic plotting functions, but in this course we will be focusing on using ggplot2 for plotting. A basic introduction can be found here.

ggplot2 Basics

A basic ggplot2 call

    # filter data to gene of interest
    ts_longformat_sod2 = ts_longformat %>%
      filter("SOD2" == gene)
    # the column names `time` and `expression` are assumed;
    # they did not survive in this transcript
    ggplot(ts_longformat_sod2) +
      geom_point(aes(x = time, y = expression))
ggplot2 is based on what is called the Grammar of Graphics (hence gg plot), which is a book by Leland Wilkinson. The philosophy of the book is that you should have a plotting system that allows you to simply describe what the plot should be based on, and the computer will take care of it. Of note, ggplot2 is the name of the library, but the function call itself is ggplot() and not ggplot2(). ggplot2 works best on a long-format data frame. You then describe all the layers of the plot, adding each layer with another geom_[type] function. There are many layers available; see http://ggplot2.tidyverse.org/reference/index.html#section-layer-geoms for a list and examples of each.

    ggplot(ts_longformat_sod2) +
      geom_point(aes(x = time, y = expression)) +
      geom_line(aes(x = time, y = expression))
Below is a diagram of how a generic ggplot2 call is structured. The aspects of the plot that you want to map to specific columns in the data frame are given in the aes() call within the layer calls (aes is short for aesthetic). If the mapping aesthetics are shared between layers, you can give them in the top ggplot() call and they will be applied to each layer.

    ggplot(ts_longformat_sod2, aes(x = time, y = expression)) +
      geom_point() +
      geom_line()
Figure 5:

Figure 6:
Now clearly this plot is not what we actually want to display: we are ignoring the exposure variable, and this is causing the plot to look funny. So let's tell ggplot that we have a grouping variable, exposure.

    ggplot(ts_longformat_sod2, aes(x = time, y = expression, group = exposure)) +
      geom_point() +
      geom_line()
Now let's add some coloring to make this plot a little more exciting.

    ggplot(ts_longformat_sod2, aes(x = time, y = expression, group = exposure, color = exposure)) +
      geom_point() +
      geom_line()
Also, when you add plotting aspects like coloring, ggplot2 assumes that this is also a grouping variable, so you no longer have to supply group if you are giving a coloring variable.

    plotsod2 = ggplot(ts_longformat_sod2, aes(x = time, y = expression, color = exposure)) +
      geom_point() +
      geom_line()

Modifying colors

Now the default colors leave a lot to be desired, and we can set them to something else by using ggplot2's scale_color_[func_name] functions to change the color plotting aspects. Here we are using the colors supplied by RColorBrewer, which you access with scale_color_brewer().
    ggplot(ts_longformat_sod2, aes(x = time, y = expression, color = exposure)) +
      geom_point() +
      geom_line() +
      scale_color_brewer(palette = "Dark2")

The ColorBrewer palettes were developed by Cynthia Brewer, and certain palettes were designed to be color-blind safe. Martin Krzywinski has also published work on choosing colors for figures, including color-blind safe palettes; more information about his work can be found here, and a website for helping choose colors can be found here. For more information on how you can change the colors other than with scale_color_brewer(), see http://ggplot2.tidyverse.org/reference/index.html#section-scales.
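To see which colors a palette such as Dark2 actually contains, and which palettes the RColorBrewer package itself marks as color-blind friendly, the package can be queried directly. A minimal sketch, assuming RColorBrewer is installed (it normally comes along with ggplot2); the variable names are made up:

```r
library(RColorBrewer)

# hex codes of the first four Dark2 colors, the palette used above
dark2_colors = brewer.pal(4, "Dark2")
dark2_colors

# palettes flagged as color-blind friendly in the package's own lookup table
colorblind_safe = rownames(brewer.pal.info)[brewer.pal.info$colorblind]
colorblind_safe
```

Calling display.brewer.all() draws every palette in the plot window, which can help when picking one by eye.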
Setting colors manually

As most PIs are extremely picky about colors, there are also easy ways of setting specific colors for specific groups using scale_color_manual().

    exposurecolors = c("#005AC8", "#AA0A3C", "#0AB45A", "#14D2DC")
    ggplot(ts_longformat_sod2, aes(x = time, y = expression, color = exposure)) +
      geom_point() +
      geom_line() +
      scale_color_manual(values = exposurecolors)

The above assigns the colors in the order of the levels of the variable (alphabetical for a character column). To make sure the order doesn't matter, and that missing levels don't mess up the ordering, you can name the color vector so that the coloring is consistent.

    exposurecolors = c("#005AC8", "#AA0A3C", "#0AB45A", "#14D2DC", "#8214A0")
    names(exposurecolors) = c("Ctrl", "R848", "Ifnb", "Lps", "other")
    ggplot(ts_longformat_sod2, aes(x = time, y = expression, color = exposure)) +
      geom_point() +
      geom_line() +
      scale_color_manual(values = exposurecolors)

If a level's name is missing from the vector, that level will not get a color, so be careful of case etc.
    exposurecolors = c("#005AC8", "#AA0A3C", "#0AB45A", "#14D2DC", "#8214A0")
    # "LPS" does not match the level "Lps" in the data, so Lps gets no color
    names(exposurecolors) = c("Ctrl", "R848", "Ifnb", "LPS", "other")
    ggplot(ts_longformat_sod2, aes(x = time, y = expression, color = exposure)) +
      geom_point() +
      geom_line() +
      scale_color_manual(values = exposurecolors)
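A mismatch like the one above is easy to catch before plotting by comparing the levels in the data with the names of the color vector. A small sketch in base R; the vector of levels is typed in by hand here, whereas in practice it would come from unique() on the exposure column:

```r
# levels present in the data (hand-typed stand-in for unique(df$exposure))
exposure_levels = c("Ctrl", "Ifnb", "Lps", "R848")

# named color vector with the wrong case for Lps
exposurecolors = c(Ctrl = "#005AC8", R848 = "#AA0A3C",
                   Ifnb = "#0AB45A", LPS = "#14D2DC")

# levels that would be left without a color
uncolored = setdiff(exposure_levels, names(exposurecolors))
uncolored
```

An empty result means every level has a matching color name.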
Changing point shape and line types

    # taking advantage of the or operator (|) to do a check for either
    ts_longformat_sod2_cd74 = ts_longformat %>%
      filter("SOD2" == gene | "CD74" == gene)
    # you can also use the %in% operator that R offers
    ts_longformat_sod2_cd74 = ts_longformat %>%
      filter(gene %in% c("SOD2", "CD74"))
    # create a grouping variable to make plotting easier
    ts_longformat_sod2_cd74 = ts_longformat_sod2_cd74 %>%
      mutate(grouping = paste(gene, "-", exposure))
    # using group = grouping to separate out the different genes and the exposure but still color by exposure
    ggplot(ts_longformat_sod2_cd74, aes(x = time, y = expression, color = exposure, group = grouping)) +
      geom_point() +
      geom_line() +
      scale_color_brewer(palette = "Dark2")
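The two filters above select the same rows. This can be checked on a toy vector of gene names (made up for illustration) without loading any data:

```r
# toy vector standing in for the gene column
gene = c("SOD2", "CD74", "A1BG", "SOD2")

# the or operator compares against each name in turn
either = gene == "SOD2" | gene == "CD74"

# %in% tests membership in a set of names in one step
member = gene %in% c("SOD2", "CD74")

identical(either, member)
```

With more than two or three names, %in% is usually the easier form to read and extend.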
But when coloring only by exposure we can't tell which lines and points come from which gene, so let's change the shapes and line types so we can distinguish them.

    ggplot(ts_longformat_sod2_cd74, aes(x = time, y = expression, color = exposure, group = grouping)) +
      geom_point(aes(shape = gene)) +
      geom_line(aes(linetype = gene)) +
      scale_color_brewer(palette = "Dark2")
Changing plot aspect not dependent on input data

If you want to change certain aspects of the plot that don't depend on mapping data from the input data frame, you put these settings outside of the aes() call.

Figure 7:
    # make the points larger, the value given to size is a relative number
    ggplot(ts_longformat_sod2_cd74, aes(x = time, y = expression, color = exposure, group = grouping)) +
      geom_point(aes(shape = gene), size = 3) +
      geom_line(aes(linetype = gene)) +
      scale_color_brewer(palette = "Dark2")

You can also then change the linetypes and shapes with scale_ functions.

    genelinetypes = c("dotted", "solid")
    names(genelinetypes) = c("CD74", "SOD2")
    # make the points larger, the value given to size is a relative number
    ggplot(ts_longformat_sod2_cd74, aes(x = time, y = expression, color = exposure, group = grouping)) +
      geom_point(aes(shape = gene), size = 3) +
      geom_line(aes(linetype = gene)) +
      scale_color_brewer(palette = "Dark2") +
      scale_shape_manual(values = c(1, 3)) +
      scale_linetype_manual(values = genelinetypes)

Switching to another layer type

Say you decide to take your plot and switch to a different layer type; you can reuse a lot of what you have already done. For example, let's switch from a dot/line plot to a bar plot by using geom_bar(). By default geom_bar() plots by counting up all the values that fall into a group, but if you want to plot specific values instead you have to give geom_bar() stat = "identity".
    ggplot(ts_longformat_sod2, aes(x = time, y = expression, color = exposure)) +
      geom_bar(stat = "identity") +
      scale_color_brewer(palette = "Dark2")

Notice that for geom_bar() color controls the border of the bars; if we want the bars themselves to be the given color, we have to use fill instead.

    ggplot(ts_longformat_sod2, aes(x = time, y = expression, fill = exposure)) +
      geom_bar(stat = "identity") +
      scale_color_brewer(palette = "Dark2")
But we lost the colors we were trying to set. That's because we are still using scale_color_brewer while the bars are now colored through fill, so we need the scale_fill_brewer function instead.

    ggplot(ts_longformat_sod2, aes(x = time, y = expression, group = exposure, fill = exposure)) +
      geom_bar(stat = "identity") +
      scale_fill_brewer(palette = "Dark2")
By default geom_bar() stacks all the bars belonging to the same x-axis grouping on top of each other, but if we want them next to each other instead we give geom_bar() position = "dodge".

    ggplot(ts_longformat_sod2, aes(x = time, y = expression, group = exposure, fill = exposure)) +
      geom_bar(stat = "identity", position = "dodge") +
      scale_fill_brewer(palette = "Dark2")
We can make the bars stand out more by changing the border color of all bars to black.

    ggplot(ts_longformat_sod2, aes(x = time, y = expression, group = exposure, fill = exposure)) +
      geom_bar(stat = "identity", position = "dodge", color = "black") +
      scale_fill_brewer(palette = "Dark2")
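ggplot2 also provides geom_col(), which plots the y values as given and so saves typing stat = "identity". The sketch below uses a made-up toy table (the values and the column names time and expression are assumptions) rather than the course data:

```r
library(ggplot2)

# toy data; the values are made up for illustration
toy = data.frame(time = factor(c(0, 0, 1, 1)),
                 exposure = c("Ctrl", "Lps", "Ctrl", "Lps"),
                 expression = c(5, 6, 5, 12))

# geom_col() behaves like geom_bar(stat = "identity")
p_col = ggplot(toy, aes(x = time, y = expression, fill = exposure)) +
  geom_col(position = "dodge", color = "black") +
  scale_fill_brewer(palette = "Dark2")
p_col
```

The position, color and fill arguments work the same way as with geom_bar().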
Now let's dress up the plot a little bit. We can do this with ggplot2's theme() function, which allows tweaking many different aspects of how the plot looks in general. Let's change the legend position so it's on the bottom instead.

    ggplot(ts_longformat_sod2, aes(x = time, y = expression, group = exposure, fill = exposure)) +
      geom_bar(stat = "identity", position = "dodge", color = "black") +
      scale_fill_brewer(palette = "Dark2") +
      theme(legend.position = "bottom")
To see all the things that theme() can do, use the help function.

    help(theme)

We can also take advantage of preset themes supplied by ggplot2.

    ggplot(ts_longformat_sod2, aes(x = time, y = expression, group = exposure, fill = exposure)) +
      geom_bar(stat = "identity", position = "dodge", color = "black") +
      scale_fill_brewer(palette = "Dark2") +
      theme_bw() +
      theme(legend.position = "bottom")
We can also change the title and the axis labels with the labs() function, and let's get rid of the border around the plot panel (panel.border = element_blank()). Also center the title (plot.title = element_text(hjust = 0.5)).

    ggplot(ts_longformat_sod2, aes(x = time, y = expression, group = exposure, fill = exposure)) +
      geom_bar(stat = "identity", position = "dodge", color = "black") +
      scale_fill_brewer(palette = "Dark2") +
      labs(title = "Expression of SOD2 gene", y = "Gene Expression", x = "Time (hrs)") +
      theme_bw() +
      theme(legend.position = "bottom",
            panel.border = element_blank(),
            plot.title = element_text(hjust = 0.5))
Controlling plotting order using factors

Let's say we didn't change the time column into a numeric column. Notice how the order isn't what we would want; this is because R determines the order automatically by sorting the input values.

    ts_longformat_sod2 = read_tsv(".series.data.txt") %>%
      rename(gene = X1) %>%
      gather(Condition, expression, 2:ncol(.)) %>%
      separate(Condition, c("exposure", "time")) %>%
      filter("SOD2" == gene)
    ggplot(ts_longformat_sod2, aes(x = time, y = expression, group = exposure, fill = exposure)) +
      geom_bar(stat = "identity", position = "dodge", color = "black") +
      scale_fill_brewer(palette = "Dark2") +
      labs(title = "Expression of SOD2 gene", y = "Gene Expression", x = "Time (hrs)") +
      theme_bw() +
      theme(legend.position = "bottom",
            panel.border = element_blank(),
            plot.title = element_text(hjust = 0.5))

In the resulting plot the x axis is ordered 0h, 12h, 1h, 24h, 2h, 4h, 6h.

This can be fixed by changing the time column from a character into a factor and setting the order of the levels.

    ts_longformat_sod2 = ts_longformat_sod2 %>%
      mutate(time = factor(time, levels = c("0h", "1h", "2h", "4h", "6h", "12h", "24h")))
    ggplot(ts_longformat_sod2, aes(x = time, y = expression, group = exposure, fill = exposure)) +
      geom_bar(stat = "identity", position = "dodge", color = "black") +
      scale_fill_brewer(palette = "Dark2") +
      labs(title = "Expression of SOD2 gene", y = "Gene Expression", x = "Time (hrs)") +
      theme_bw() +
      theme(legend.position = "bottom",
            panel.border = element_blank(),
            plot.title = element_text(hjust = 0.5))

The x axis is now ordered 0h, 1h, 2h, 4h, 6h, 12h, 24h.

Saving plots

Plots can be saved to a variety of image types, but the most useful is likely pdf. This allows you to manipulate the plot in programs like Illustrator or Inkscape, or to save it as any other image type afterwards. To save as pdf we will use the function pdf(). This function opens a pdf graphics device, which will catch all plot calls (rather than them going to the plot window in RStudio) until the function dev.off() is called.

    pdf("example_plot.pdf", width = 11, height = 8.5, useDingbats = FALSE)
    ggplot(ts_longformat_sod2, aes(x = time, y = expression, group = exposure, fill = exposure)) +
      geom_bar(stat = "identity", position = "dodge", color = "black") +
      scale_fill_brewer(palette = "Dark2") +
      labs(title = "Expression of SOD2 gene", y = "Gene Expression", x = "Time (hrs)") +
      theme_bw() +
      theme(legend.position = "bottom",
            panel.border = element_blank(),
            plot.title = element_text(hjust = 0.5))
    dev.off()

example_plot.pdf - The first argument is the name of the file you want to save the plots to. This will overwrite any file with this name if it already exists, so be careful.
width - This is the width of the plot in inches.
height - This is the height of the plot in inches.
useDingbats = FALSE - This stops R from using the Dingbats font when drawing the plot. You always want to set this to FALSE: it causes R to create a larger file, but if you don't turn off Dingbats it causes problems when editing the pdf later in something like Illustrator.

Multiple pages

R will keep adding pages to the opened pdf until dev.off() is called.

    pdf("example_plot_2_pages.pdf", width = 11, height = 8.5, useDingbats = FALSE)
    ggplot(ts_longformat_sod2, aes(x = time, y = expression, group = exposure, fill = exposure)) +
      geom_bar(stat = "identity", position = "dodge", color = "black") +
      scale_fill_brewer(palette = "Dark2") +
      labs(title = "Expression of SOD2 gene", y = "Gene Expression", x = "Time (hrs)") +
      theme_bw() +
      theme(legend.position = "bottom",
            panel.border = element_blank(),
            plot.title = element_text(hjust = 0.5))
    ggplot(ts_longformat_sod2_cd74, aes(x = time, y = expression, color = exposure, group = grouping)) +
      geom_point(aes(shape = gene), size = 3) +
      geom_line(aes(linetype = gene)) +
      scale_color_brewer(palette = "Dark2") +
      scale_shape_manual(values = c(1, 3)) +
      scale_linetype_manual(values = genelinetypes)
    dev.off()
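For a single plot, ggplot2's own ggsave() function is an alternative to opening and closing the pdf device by hand. The following is a sketch on made-up toy data; the file is written to a temporary directory so nothing is overwritten, and the file name is hypothetical.

```r
library(ggplot2)

# toy data; values are made up for illustration
toy = data.frame(time = c(0, 1, 2, 4), expression = c(5, 7, 12, 9))
p = ggplot(toy, aes(x = time, y = expression)) +
  geom_point() +
  geom_line()

# ggsave() picks the pdf device from the file extension;
# width and height are in inches by default,
# and useDingbats is passed through to pdf()
out_file = file.path(tempdir(), "example_ggsave.pdf")
ggsave(out_file, plot = p, width = 11, height = 8.5, useDingbats = FALSE)
file.exists(out_file)
```

ggsave() writes one plot per file, so pdf() with dev.off() remains the way to collect several plots as pages of one pdf.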
Part 2. Exercises

Using the Temperature data frame read in earlier (Average Temperatures USA)

1. Filter the long format data frame created in Part 1 to just one Station_name
2. Modify the month column into a factor so that the months are organized in chronological order. (hint use this vector c("january","february","march","april","may","june","july","august","september","october",
3. Create a line and dot plot of temperature for the Station_name you picked in 1 with months on x-axis and temperatures on y-axis, color the lines by years (see what happens to the colors when you change years into a factor rather than a numeric data type)
4. Now create a barplot
5. Now filter the long format data frame again to be from 3 different stations and to just the year
6. Take the new data frame from 5 and create a barplot x = months and y = temperature and color the bars by station name (try setting the station names to new custom colors of your choosing, you can use to pick colors)
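As a pointer for exercise 2, the effect of setting factor levels can be seen on a few month names. This sketch assumes the months are spelled as in R's built-in month.name constant ("January", "February", ...); the spelling in the temperature data may differ, in which case the levels vector has to match it.

```r
# a few month names in arbitrary order (toy input)
months_seen = c("March", "January", "February", "January")

# as plain characters the months sort alphabetically
alphabetical = sort(unique(months_seen))

# as a factor with explicit levels they sort chronologically
month_factor = factor(months_seen, levels = month.name)
chronological = as.character(sort(unique(month_factor)))

alphabetical
chronological
```

ggplot2 lays out a factor along the axis in the order of its levels, which is why the same trick fixes the month order in a plot.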
More informationMaking Tables and Graphs with Excel. The Basics
Making Tables and Graphs with Excel The Basics Where do my IV and DV go? Just like you would create a data table on paper, your IV goes in the leftmost column and your DV goes to the right of the IV Enter
More informationMaking sense of census microdata
Making sense of census microdata Tutorial 3: Creating aggregated variables and visualisations First, open a new script in R studio and save it in your working directory, so you will be able to access this
More informationExcel Basics Rice Digital Media Commons Guide Written for Microsoft Excel 2010 Windows Edition by Eric Miller
Excel Basics Rice Digital Media Commons Guide Written for Microsoft Excel 2010 Windows Edition by Eric Miller Table of Contents Introduction!... 1 Part 1: Entering Data!... 2 1.a: Typing!... 2 1.b: Editing
More informationЛекция 4 Трансформация данных в R
Анализ данных Лекция 4 Трансформация данных в R Гедранович Ольга Брониславовна, старший преподаватель кафедры ИТ, МИУ volha.b.k@gmail.com 2 Вопросы лекции Фильтрация (filter) Сортировка (arrange) Выборка
More informationAssignment 0. Nothing here to hand in
Assignment 0 Nothing here to hand in The questions here have solutions attached. Follow the solutions to see what to do, if you cannot otherwise guess. Though there is nothing here to hand in, it is very
More informationsocial data science Data Visualization Sebastian Barfort August 08, 2016 University of Copenhagen Department of Economics 1/86
social data science Data Visualization Sebastian Barfort August 08, 2016 University of Copenhagen Department of Economics 1/86 Who s ahead in the polls? 2/86 What values are displayed in this chart? 3/86
More informationEXCEL 2003 DISCLAIMER:
EXCEL 2003 DISCLAIMER: This reference guide is meant for experienced Microsoft Excel users. It provides a list of quick tips and shortcuts for familiar features. This guide does NOT replace training or
More informationSession 1 Nick Hathaway;
Session 1 Nick Hathaway; nicholas.hathaway@umassmed.edu Contents R Basics 1 Variables/objects.............................................. 1 Functions..................................................
More informationData wrangling. Reduction/Aggregation: reduces a variable to a scalar
Data Wrangling Some definitions A data table is a collection of variables and observations A variable (when data are tidy) is a single column in a data table An observation is a single row in a data table,
More informationIntroduction to Functions. Biostatistics
Introduction to Functions Biostatistics 140.776 Functions The development of a functions in R represents the next level of R programming, beyond writing code at the console or in a script. 1. Code 2. Functions
More informationPackage ggseas. June 12, 2018
Package ggseas June 12, 2018 Title 'stats' for Seasonal Adjustment on the Fly with 'ggplot2' Version 0.5.4 Maintainer Peter Ellis Provides 'ggplot2' 'stats' that estimate
More informationLecture 12: Data carpentry with tidyverse
http://127.0.0.1:8000/.html Lecture 12: Data carpentry with tidyverse STAT598z: Intro. to computing for statistics Vinayak Rao Department of Statistics, Purdue University options(repr.plot.width=5, repr.plot.height=3)
More informationPackage gggenes. R topics documented: November 7, Title Draw Gene Arrow Maps in 'ggplot2' Version 0.3.2
Title Draw Gene Arrow Maps in 'ggplot2' Version 0.3.2 Package gggenes November 7, 2018 Provides a 'ggplot2' geom and helper functions for drawing gene arrow maps. Depends R (>= 3.3.0) Imports grid (>=
More informationGraphical critique & theory. Hadley Wickham
Graphical critique & theory Hadley Wickham Exploratory graphics Are for you (not others). Need to be able to create rapidly because your first attempt will never be the most revealing. Iteration is crucial
More information1 Introduction to Using Excel Spreadsheets
Survey of Math: Excel Spreadsheet Guide (for Excel 2007) Page 1 of 6 1 Introduction to Using Excel Spreadsheets This section of the guide is based on the file (a faux grade sheet created for messing with)
More informationSurvey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9
Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Contents 1 Introduction to Using Excel Spreadsheets 2 1.1 A Serious Note About Data Security.................................... 2 1.2
More informationLogical operators: R provides an extensive list of logical operators. These include
meat.r: Explanation of code Goals of code: Analyzing a subset of data Creating data frames with specified X values Calculating confidence and prediction intervals Lists and matrices Only printing a few
More informationEVALUATION COPY. Unauthorized Reproduction or Distribution Prohibited EXCEL INTERMEDIATE
EXCEL INTERMEDIATE Overview NOTES... 2 OVERVIEW... 3 VIEW THE PROJECT... 5 USING FORMULAS AND FUNCTIONS... 6 BASIC EXCEL REVIEW... 6 FORMULAS... 7 Typing formulas... 7 Clicking to insert cell references...
More informationA Quick and focused overview of R data types and ggplot2 syntax MAHENDRA MARIADASSOU, MARIA BERNARD, GERALDINE PASCAL, LAURENT CAUQUIL
A Quick and focused overview of R data types and ggplot2 syntax MAHENDRA MARIADASSOU, MARIA BERNARD, GERALDINE PASCAL, LAURENT CAUQUIL 1 R and RStudio OVERVIEW 2 R and RStudio R is a free and open environment
More informationSTAT 1291: Data Science
STAT 1291: Data Science Lecture 20 - Summary Sungkyu Jung Semester recap data visualization data wrangling professional ethics statistical foundation Statistical modeling: Regression Cause and effect:
More informationThe following presentation is based on the ggplot2 tutotial written by Prof. Jennifer Bryan.
Graphics Agenda Grammer of Graphics Using ggplot2 The following presentation is based on the ggplot2 tutotial written by Prof. Jennifer Bryan. ggplot2 (wiki) ggplot2 is a data visualization package Created
More informationR. Muralikrishnan Max Planck Institute for Empirical Aesthetics Frankfurt. 08 June 2017
R R. Muralikrishnan Max Planck Institute for Empirical Aesthetics Frankfurt 08 June 2017 Introduction What is R?! R is a programming language for statistical computing and graphics R is free and open-source
More informationLecture 4: Data Visualization I
Lecture 4: Data Visualization I Data Science for Business Analytics Thibault Vatter Department of Statistics, Columbia University and HEC Lausanne, UNIL 11.03.2018 Outline 1 Overview
More information1 The ggplot2 workflow
ggplot2 @ statistics.com Week 2 Dope Sheet Page 1 dope, n. information especially from a reliable source [the inside dope]; v. figure out usually used with out; adj. excellent 1 This week s dope This week
More informationPackage lvplot. August 29, 2016
Version 0.2.0 Title Letter Value 'Boxplots' Package lvplot August 29, 2016 Implements the letter value 'boxplot' which extends the standard 'boxplot' to deal with both larger and smaller number of data
More informationData Visualization. Andrew Jaffe Instructor
Module 9 Data Visualization Andrew Jaffe Instructor Basic Plots We covered some basic plots previously, but we are going to expand the ability to customize these basic graphics first. 2/45 Read in Data
More informationFacets and Continuous graphs
Facets and Continuous graphs One way to add additional variables is with aesthetics. Another way, particularly useful for categorical variables, is to split your plot into facets, subplots that each display
More informationPlotting with Rcell (Version 1.2-5)
Plotting with Rcell (Version 1.2-) Alan Bush October 7, 13 1 Introduction Rcell uses the functions of the ggplots2 package to create the plots. This package created by Wickham implements the ideas of Wilkinson
More informationDplyr Introduction Matthew Flickinger July 12, 2017
Dplyr Introduction Matthew Flickinger July 12, 2017 Introduction to Dplyr This document gives an overview of many of the features of the dplyr library include in the tidyverse of related R pacakges. First
More informationK-fold cross validation in the Tidyverse Stephanie J. Spielman 11/7/2017
K-fold cross validation in the Tidyverse Stephanie J. Spielman 11/7/2017 Requirements This demo requires several packages: tidyverse (dplyr, tidyr, tibble, ggplot2) modelr broom proc Background K-fold
More information= 3 + (5*4) + (1/2)*(4/2)^2.
Physics 100 Lab 1: Use of a Spreadsheet to Analyze Data by Kenneth Hahn and Michael Goggin In this lab you will learn how to enter data into a spreadsheet and to manipulate the data in meaningful ways.
More informationImporting and visualizing data in R. Day 3
Importing and visualizing data in R Day 3 R data.frames Like pandas in python, R uses data frame (data.frame) object to support tabular data. These provide: Data input Row- and column-wise manipulation
More informationLab5A - Intro to GGPLOT2 Z.Sang Sept 24, 2018
LabA - Intro to GGPLOT2 Z.Sang Sept 24, 218 In this lab you will learn to visualize raw data by plotting exploratory graphics with ggplot2 package. Unlike final graphs for publication or thesis, exploratory
More informationMATLAB TUTORIAL WORKSHEET
MATLAB TUTORIAL WORKSHEET What is MATLAB? Software package used for computation High-level programming language with easy to use interactive environment Access MATLAB at Tufts here: https://it.tufts.edu/sw-matlabstudent
More informationOutline day 4 May 30th
Graphing in R: basic graphing ggplot2 package Outline day 4 May 30th 05/2017 117 Graphing in R: basic graphing 05/2017 118 basic graphing Producing graphs R-base package graphics offers funcaons for producing
More informationSTA130 - Class #2: Nathan Taback
STA130 - Class #2: Nathan Taback 2018-01-15 Today's Class Histograms and density functions Statistical data Tidy data Data wrangling Transforming data 2/51 Histograms and Density Functions Histograms and
More informationYou are to turn in the following three graphs at the beginning of class on Wednesday, January 21.
Computer Tools for Data Analysis & Presentation Graphs All public machines on campus are now equipped with Word 2010 and Excel 2010. Although fancier graphical and statistical analysis programs exist,
More informationWhy use R? Getting started. Why not use R? Introduction to R: Log into tak. Start R R or. It s hard to use at first
Why use R? Introduction to R: Using R for statistics ti ti and data analysis BaRC Hot Topics October 2011 George Bell, Ph.D. http://iona.wi.mit.edu/bio/education/r2011/ To perform inferential statistics
More informationPackage ezsummary. August 29, 2016
Type Package Title Generate Data Summary in a Tidy Format Version 0.2.1 Package ezsummary August 29, 2016 Functions that simplify the process of generating print-ready data summary using 'dplyr' syntax.
More informationThe Foundation. Review in an instant
The Foundation Review in an instant Table of contents Introduction 1 Basic use of Excel 2 - Important Excel terms - Important toolbars - Inserting and deleting columns and rows - Copy and paste Calculations
More informationIntroduction to R: Using R for Statistics and Data Analysis. BaRC Hot Topics
Introduction to R: Using R for Statistics and Data Analysis BaRC Hot Topics http://barc.wi.mit.edu/hot_topics/ Why use R? Perform inferential statistics (e.g., use a statistical test to calculate a p-value)
More informationUsing R for statistics and data analysis
Introduction ti to R: Using R for statistics and data analysis BaRC Hot Topics October 2011 George Bell, Ph.D. http://iona.wi.mit.edu/bio/education/r2011/ Why use R? To perform inferential statistics (e.g.,
More informationIntroduction to R: Using R for statistics and data analysis
Why use R? Introduction to R: Using R for statistics and data analysis George W Bell, Ph.D. BaRC Hot Topics November 2014 Bioinformatics and Research Computing Whitehead Institute http://barc.wi.mit.edu/hot_topics/
More information# Call plot plot(gg)
Most of the requirements related to look and feel can be achieved using the theme() function. It accepts a large number of arguments. Type?theme in the R console and see for yourself. # Setup options(scipen=999)
More informationName Date Types of Graphs and Creating Graphs Notes
Name Date Types of Graphs and Creating Graphs Notes Graphs are helpful visual representations of data. Different graphs display data in different ways. Some graphs show individual data, but many do not.
More informationPackage ggsubplot. February 15, 2013
Package ggsubplot February 15, 2013 Maintainer Garrett Grolemund License GPL Title Explore complex data by embedding subplots within plots. LazyData true Type Package Author Garrett
More informationLecture 3: Basics of R Programming
Lecture 3: Basics of R Programming This lecture introduces you to how to do more things with R beyond simple commands. Outline: 1. R as a programming language 2. Grouping, loops and conditional execution
More informationLondonR: Introduction to ggplot2. Nick Howlett Data Scientist
LondonR: Introduction to ggplot2 Nick Howlett Data Scientist Email: nhowlett@mango-solutions.com Agenda Catie Gamble, M&S - Using R to Understand Revenue Opportunities for your Online Business Andrie de
More informationThis chapter describes a handful of things you can do to customize Office
Chapter 1: Customizing an Office Program In This Chapter Personalizing the Ribbon Changing around the Quick Access toolbar Choosing what appears on the status bar Choosing a new color scheme Devising keyboard
More informationIntroduction to R: Using R for Statistics and Data Analysis. BaRC Hot Topics
Introduction to R: Using R for Statistics and Data Analysis BaRC Hot Topics http://barc.wi.mit.edu/hot_topics/ Why use R? Perform inferential statistics (e.g., use a statistical test to calculate a p-value)
More informationTricking it Out: Tricks to personalize and customize your graphs.
Tricking it Out: Tricks to personalize and customize your graphs. Graphing templates may be used online without downloading them onto your own computer. However, if you would like to use the templates
More informationVisualizing Data: Customization with ggplot2
Visualizing Data: Customization with ggplot2 Data Science 1 Stanford University, Department of Statistics ggplot2: Customizing graphics in R ggplot2 by RStudio s Hadley Wickham and Winston Chang offers
More informationIntro to R h)p://jacobfenton.s3.amazonaws.com/r- handson.pdf. Jacob Fenton CAR Director InvesBgaBve ReporBng Workshop, American University
Intro to R h)p://jacobfenton.s3.amazonaws.com/r- handson.pdf Jacob Fenton CAR Director InvesBgaBve ReporBng Workshop, American University Overview Import data Move around the file system, save an image
More informationIntroduction to Minitab 1
Introduction to Minitab 1 We begin by first starting Minitab. You may choose to either 1. click on the Minitab icon in the corner of your screen 2. go to the lower left and hit Start, then from All Programs,
More informationPackage infer. July 11, Type Package Title Tidy Statistical Inference Version 0.3.0
Type Package Title Tidy Statistical Inference Version 0.3.0 Package infer July 11, 2018 The objective of this package is to perform inference using an epressive statistical grammar that coheres with the
More informationIntroduction to the workbook and spreadsheet
Excel Tutorial To make the most of this tutorial I suggest you follow through it while sitting in front of a computer with Microsoft Excel running. This will allow you to try things out as you follow along.
More informationWeek 1: Introduction to R, part 1
Week 1: Introduction to R, part 1 Goals Learning how to start with R and RStudio Use the command line Use functions in R Learning the Tools What is R? What is RStudio? Getting started R is a computer program
More informationArtDMX DMX control software V1.4
User manual ArtDMX DMX control software V1.4 1 2 Table of contents : 1. How to start a new Project...6 1.1. Introduction...6 1.2. System Requirements...6 1.3. Installing software and drivers...7 1.4. Software
More informationTransform Data! The Basics Part I!
Transform Data! The Basics Part I! arrange() arrange() Order rows from smallest to largest values arrange(.data, ) Data frame to transform One or more columns to order by (addi3onal columns will be used
More informationSUM - This says to add together cells F28 through F35. Notice that it will show your result is
COUNTA - The COUNTA function will examine a set of cells and tell you how many cells are not empty. In this example, Excel analyzed 19 cells and found that only 18 were not empty. COUNTBLANK - The COUNTBLANK
More informationIndividual Covariates
WILD 502 Lab 2 Ŝ from Known-fate Data with Individual Covariates Today s lab presents material that will allow you to handle additional complexity in analysis of survival data. The lab deals with estimation
More informationBIOSTATS 640 Spring 2018 Introduction to R Data Description. 1. Start of Session. a. Preliminaries... b. Install Packages c. Attach Packages...
BIOSTATS 640 Spring 2018 Introduction to R and R-Studio Data Description Page 1. Start of Session. a. Preliminaries... b. Install Packages c. Attach Packages... 2. Load R Data.. a. Load R data frames...
More informationLab2 Jacob Reiser September 30, 2016
Lab2 Jacob Reiser September 30, 2016 Introduction: An R-Blogger recently found a data set from a project of New York s Public Library called What s on the Menu, which can be found at https://www.r-bloggers.com/a-fun-gastronomical-dataset-whats-on-the-menu/.
More informationStat 849: Plotting responses and covariates
Stat 849: Plotting responses and covariates Douglas Bates 10-09-03 Outline Contents 1 R Graphics Systems Graphics systems in R ˆ R provides three dierent high-level graphics systems base graphics The system
More informationOld Faithful Chris Parrish
Old Faithful Chris Parrish 17-4-27 Contents Old Faithful eruptions 1 data.................................................. 1 duration................................................ 1 waiting time..............................................
More informationSpreadsheet View and Basic Statistics Concepts
Spreadsheet View and Basic Statistics Concepts GeoGebra 3.2 Workshop Handout 9 Judith and Markus Hohenwarter www.geogebra.org Table of Contents 1. Introduction to GeoGebra s Spreadsheet View 2 2. Record
More informationUsing IDLE for
Using IDLE for 15-110 Step 1: Installing Python Download and install Python using the Resources page of the 15-110 website. Be sure to install version 3.3.2 and the correct version depending on whether
More information