Facets and Continuous graphs

Similar documents
Lecture 4: Data Visualization I

The diamonds dataset Visualizing data in R with ggplot2

ggplot2 basics Hadley Wickham Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University September 2011

Statistical transformations

Introduction to R and the tidyverse. Paolo Crosetto

03 - Intro to graphics (with ggplot2)

Getting started with ggplot2

The following presentation is based on the ggplot2 tutotial written by Prof. Jennifer Bryan.

Intro to R for Epidemiologists

Econ 2148, spring 2019 Data visualization

Plotting with ggplot2: Part 2. Biostatistics

Importing and visualizing data in R. Day 3

Graphical critique & theory. Hadley Wickham

Rstudio GGPLOT2. Preparations. The first plot: Hello world! W2018 RENR690 Zihaohan Sang

# Call plot plot(gg)

Data Visualization. Module 7

Package gggenes. R topics documented: November 7, Title Draw Gene Arrow Maps in 'ggplot2' Version 0.3.2

Creating elegant graphics in R with ggplot2

An introduction to ggplot: An implementation of the grammar of graphics in R

Install RStudio from - use the standard installation.

DATA VISUALIZATION WITH GGPLOT2. Coordinates

Advanced Plotting with ggplot2. Algorithm Design & Software Engineering November 13, 2016 Stefan Feuerriegel

Large data. Hadley Wickham. Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University.

data visualization Show the Data Snow Month skimming deep waters

Introduction to Data Visualization

ggplot2 for beginners Maria Novosolov 1 December, 2014

Maps & layers. Hadley Wickham. Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University.

A Quick and focused overview of R data types and ggplot2 syntax MAHENDRA MARIADASSOU, MARIA BERNARD, GERALDINE PASCAL, LAURENT CAUQUIL

You submitted this quiz on Sat 17 May :19 AM CEST. You got a score of out of

ggplot in 3 easy steps (maybe 2 easy steps)

EXST 7014, Lab 1: Review of R Programming Basics and Simple Linear Regression

An Introduction to R Graphics

Plotting with Rcell (Version 1.2-5)

Data Visualization Using R & ggplot2. Karthik Ram October 6, 2013

Introduction to ggvis. Aimee Gott R Consultant

INTRODUCTION TO DATA. Welcome to the course!

Data visualization with ggplot2

Package ggseas. June 12, 2018

Stat405. Displaying distributions. Hadley Wickham. Thursday, August 23, 12

Introduction to Graphics with ggplot2

LondonR: Introduction to ggplot2. Nick Howlett Data Scientist

R Workshop 1: Introduction to R

STAT 1291: Data Science

Visualizing Data: Customization with ggplot2

An introduction to R Graphics 4. ggplot2

User manual forggsubplot

Lab5A - Intro to GGPLOT2 Z.Sang Sept 24, 2018

Package autocogs. September 22, Title Automatic Cognostic Summaries Version 0.1.1

Data Visualization in R

Stat 849: Plotting responses and covariates

Visualizing the World

Hadley Wickham. ggplot2. Elegant Graphics for Data Analysis. July 26, Springer

A set of rules describing how to compose a 'vocabulary' into permissible 'sentences'

Package lvplot. August 29, 2016

S4C03, HW2: model formulas, contrasts, ggplot2, and basic GLMs

Using Built-in Plotting Functions

Introduction to R for Beginners, Level II. Jeon Lee Bio-Informatics Core Facility (BICF), UTSW

Data Visualization in R

Easy interactive ggplots

Stat 849: Plotting responses and covariates

Package ggsubplot. February 15, 2013

social data science Data Visualization Sebastian Barfort August 08, 2016 University of Copenhagen Department of Economics 1/86

Package plotluck. November 13, 2016

Ggplot2 QMMA. Emanuele Taufer. 2/19/2018 Ggplot2 (1)

EXPLORATORY DATA ANALYSIS. Introducing the data

1 The ggplot2 workflow

Introduction to R and R-Studio In-Class Lab Activity The 1970 Draft Lottery

Session 5 Nick Hathaway;

Section 7D Systems of Linear Equations

Ben Baumer Instructor

Package ggextra. April 4, 2018

Making Science Graphs and Interpreting Data

Package ggloop. October 20, 2016

Automation.

Package ggpmisc. May 4, 2018

Applied Regression Modeling: A Business Approach

Bivariate Linear Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Package ggqc. R topics documented: January 30, Type Package Title Quality Control Charts for 'ggplot' Version Author Kenith Grey

Package cowplot. March 6, 2016

Integrated Math I. IM1.1.3 Understand and use the distributive, associative, and commutative properties.

grammar statistical graphics Building a in Clojure Kevin Lynagh 2012 November 9 Øredev Keming Labs Malmö, Sweden

Excel 2010 with XLSTAT

SAS (Statistical Analysis Software/System)

Data Science and Machine Learning Essentials

The Average and SD in R

Homework 1 Excel Basics

Package ggmosaic. February 9, 2017

Arizona Academic Standards

Arkansas Curriculum Framework for Computer Applications II

Graphics in R. There are three plotting systems in R. base Convenient, but hard to adjust after the plot is created

Package ggcyto. October 12, 2016

Resources: Books. Data Visualization in R 4. ggplot2. What is ggplot2? Resources: Cheat sheets

Minitab 17 commands Prepared by Jeffrey S. Simonoff

Error-Bar Charts from Summary Data

Lesson 16: More on Modeling Relationships with a Line

Package ggcyto. April 14, 2017

Introduction to R. Introduction to Econometrics W

How to Plot With Ggiraph

DATA VISUALIZATION WITH GGPLOT2. Grid Graphics

Let s use Technology Use Data from Cycle 14 of the General Social Survey with Fathom for a data analysis project

Transcription:

Facets and Continuous graphs One way to add additional variables is with aesthetics. Another way, particularly useful for categorical variables, is to split your plot into facets, subplots that each display one subset of the data. To facet your plot by a single variable, use facet_wrap(). The first argument of facet_wrap() should be a formula, which you create with ~ followed by a variable name (here formula is the name of a data structure in R, not a synonym for equation ). The variable that you pass to facet_wrap() should be discrete. geom_point(mapping = aes(x = displ, y = hwy)) + facet_wrap(~ class, nrow = 2) To facet your plot on the combination of two variables, add facet_grid() to your plot call. The first argument of facet_grid() is also a formula. This time the formula should contain two variable names separated by a ~. geom_point(mapping = aes(x = displ, y = hwy)) + facet_grid(drv ~ cyl)

If you prefer to not facet in the rows or columns dimension, use a. instead of a variable name, e.g. + facet_grid(. ~ cyl). Geometric objects How are these two plots similar?

Both plots contain the same x variable, the same y variable, and both describe the same data. But the plots are not identical. Each plot uses a different visual object to represent the data. In ggplot2 syntax, we say that they use different geoms. A geom is the geometrical object that a plot uses to represent data. People often describe plots by the type of geom that the plot uses. For example, bar charts use bar geoms, line charts use line geoms, boxplots use boxplot geoms, and so on. Scatterplots break the trend; they use the point geom. As we see above, you can use different geoms to plot the same data. The plot on the left uses the point geom, and the plot on the right uses the smooth geom, a smooth line fitted to the data. To change the geom in your plot, change the geom function that you add to ggplot(). For instance, to make the plots above, you can use this code: # left geom_point(mapping = aes(x = displ, y = hwy)) # right geom_smooth(mapping = aes(x = displ, y = hwy)) Every geom function in ggplot2 takes a mapping argument. However, not every aesthetic works with every geom. For example, shape only works with point, because, we could set the shape of a point, but you couldn t set the shape of a line. On the other hand, you could set the linetype of a line. geom_smooth() will draw a different line, with a different linetype, for each unique value of the variable that you map to linetype. geom_smooth(mapping = aes(x = displ, y = hwy, linetype = drv)) Here geom_smooth() separates the cars into three lines based on their drv value, which describes a car s drivetrain.

One line describes all of the points with a 4 value, one line describes all of the points with an f value, and one line describes all of the points with an r value. Here, 4 stands for four-wheel drive, f for front-wheel drive, and r for rear-wheel drive. We can make it more clear by overlaying the lines on top of the raw data and then coloring everything according to drv. Notice that this plot contains two geoms in the same graph! Place multiple geoms in the same plot. ggplot2 provides over many geoms, and extension packages provide even more (see https://www.ggplot2-exts.org ). The best way to get a comprehensive overview is the ggplot2 cheatsheet, which you can find at http://rstudio.com/cheatsheets. To learn more about any single geom, use help:?geom_smooth. Many geoms, like geom_smooth(), use a single geometric object to display multiple rows of data. For these geoms, you can set the group aesthetic to a categorical variable to draw multiple objects. ggplot2 will draw a separate object for each unique value of the grouping variable. In practice, ggplot2 will automatically group the data for these geoms whenever you map an aesthetic to a discrete variable (as in the linetype example). geom_smooth(mapping = aes(x = displ, y = hwy)) geom_smooth(mapping = aes(x = displ, y = hwy, group = drv))

geom_smooth( mapping = aes(x = displ, y = hwy, color = drv), show.legend = FALSE ) To display multiple geoms in the same plot, add multiple geom functions to ggplot(): geom_point(mapping = aes(x = displ, y = hwy)) + geom_smooth(mapping = aes(x = displ, y = hwy))

This, however, introduces some duplication in our code. Imagine if you wanted to change the y-axis to display cty instead of hwy. You d need to change the variable in two places, and you might forget to update one. You can avoid this type of repetition by passing a set of mappings to ggplot(). ggplot2 will treat these mappings as global mappings that apply to each geom in the graph. In other words, this code will produce the same plot as the previous code: ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + geom_point() + geom_smooth() If you place mappings in a geom function, ggplot2 will treat them as local mappings for the layer. It will use these mappings to extend or overwrite the global mappings for that layer only. This makes it possible to display different aesthetics in different layers. ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + geom_point(mapping = aes(color = class)) + geom_smooth()

You can use the same idea to specify different data for each layer. Here, our smooth line displays just a subset of the mpg dataset, the subcompact cars. The local data argument in geom_smooth() overrides the global data argument in ggplot() for that layer only. ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + geom_point(mapping = aes(color = class)) + geom_smooth(data = filter(mpg, class == "subcompact"), se = FALSE)

se stands for standard error. By default it is set to TRUE and show a shaded area for the error between the predicted value and the real value. ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + + geom_point(mapping = aes(color = class)) + + geom_smooth(data = filter(mpg, class == "subcompact"), se = TRUE)