Transform Data! The Basics Part I!

2 arrange()

3 arrange() Order rows from smallest to largest values. arrange(.data, ...) where .data is the data frame to transform and ... is one or more columns to order by (additional columns will be used as tie breakers).

4 Common syntax Each function takes a data frame as the first argument and returns a data frame. In arrange(.data, ...), arrange is the dplyr function, .data is the data frame to transform, and ... holds the function-specific arguments.

5 arrange() Order rows from smallest to largest values. arrange(babynames, n) [Table: sample rows of babynames (columns year, sex, name, n, prop; first row 1899 M John) listed as John, William, James, Lance, Charles; after arrange(babynames, n) the rows run Lance, Charles, James, William, John. Cell values not recoverable.]

6 Your Turn 3 Arrange babynames by n. Add prop as a second (tie breaking) variable to arrange on. Can you tell what the smallest value of n is? How does adding prop affect the arrangement?

7 arrange(babynames, n, prop)
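The tie-breaking behaviour can be checked on a small made-up data frame. This is a minimal sketch, assuming dplyr is installed; the `toy` data frame and its values are invented for illustration and do not come from babynames.

```r
library(dplyr)

# Toy data (hypothetical values, not taken from babynames)
toy <- data.frame(
  name = c("Ada", "Bo", "Cy", "Di"),
  n    = c(7, 5, 5, 9),
  prop = c(0.07, 0.06, 0.05, 0.09)
)

# Order by n first; rows that tie on n are then ordered by prop
sorted <- arrange(toy, n, prop)
sorted$name
```

Bo and Cy tie on n, so prop decides which of the two comes first; the remaining rows are placed by n alone.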

8 Helper function desc() Change ordering to go from largest to smallest. arrange(babynames, desc(n)) [Table: the same babynames sample (John, William, James, Lance, Charles); after arrange(babynames, desc(n)) the rows run John, William, James, Charles, Lance. Cell values not recoverable.]

9 Your Turn 4 Use desc() to find the names with the highest prop. Then, use desc() to find the names with the highest n.

10 arrange(babynames, desc(prop)) arrange(babynames, desc(n))

11 mutate()

12 mutate() Create new columns. mutate(.data, ...) where .data is the data frame to transform and ... is one or more new columns to create.

13 mutate() Create new columns. mutate(babynames, percent = round(prop * 100, 2)) [Table: the babynames sample (columns year, sex, name, n, prop) before and after; the output gains a percent column. Cell values not recoverable.]

14 mutate() Create new columns. mutate(babynames, percent = round(prop * 100, 2), nper = round(percent)) [Table: the babynames sample before and after; the output gains the columns percent and nper. Cell values not recoverable.]
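A column created inside mutate() can be reused later in the same call, which is what lets nper be built from percent. A minimal sketch, assuming dplyr is installed; the `toy` data frame and its proportions are made up for illustration.

```r
library(dplyr)

# Made-up proportions (hypothetical, not from babynames)
toy <- data.frame(
  name = c("Ada", "Bo"),
  prop = c(0.0523, 0.0118)
)

# percent is created first, then reused to build nper in the same call
result <- mutate(toy,
  percent = round(prop * 100, 2),
  nper    = round(percent)
)
result
```

The original columns are kept, and the new columns are appended on the right in the order they were defined.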

16 Window function min_rank() A go-to ranking function (ties share the lowest rank). min_rank(c(50, 100, 100, 1000)) # [1] 1 2 2 4 min_rank(desc(c(50, 100, 100, 1000))) # [1] 4 2 2 1

17 Your Turn 5 Use min_rank() and mutate() to rank each row in babynames from largest prop to lowest prop

18 mutate(babynames, rank = min_rank(desc(prop)))
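The same ranking step can be tried on a small data frame where two rows tie. This is a sketch under the assumption that dplyr is installed; `toy` and its values are invented for illustration.

```r
library(dplyr)

# Made-up proportions with one tie (hypothetical, not from babynames)
toy <- data.frame(
  name = c("Ada", "Bo", "Cy", "Di"),
  prop = c(0.10, 0.30, 0.30, 0.50)
)

# desc() flips the order, so the largest prop gets rank 1
ranked <- mutate(toy, rank = min_rank(desc(prop)))
ranked$rank
```

The two tied rows share the lower of the ranks they span, and the rank that follows the tie is skipped.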

19 %>%

20 Multiple steps (composed functions) arrange(mutate(filter(babynames, year == 2015, sex == "M"), rank = min_rank(desc(prop))), rank) 1. Filter babynames to just boys born in 2015 2. Rank the names by proportion so that higher proportions have lower rank 3. Arrange the names by rank

21 Multiple steps (intermediate data frames) boys_2015 <- filter(babynames, year == 2015, sex == "M") boys_2015 <- mutate(boys_2015, rank = min_rank(desc(prop))) boys_2015 <- arrange(boys_2015, rank) boys_2015

23 The pipe operator %>% Passes the result on its left into the first argument of the function on its right. So these two lines do the same thing. Try it! filter(babynames, n == 99680) babynames %>% filter(n == 99680)

24 Multiple steps (pipe operator) babynames %>% filter(year == 2015, sex == "M") %>% mutate(rank = min_rank(desc(prop))) %>% arrange(rank) 1. Allows us to eliminate redundant code (assigning to the same data frame over and over) and/or unwanted intermediate data frames 2. Allows us to write code in the same way we think about the problem
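To see that the nested and piped forms are the same computation, both can be run on a small made-up data frame and compared. A minimal sketch, assuming dplyr is installed (it re-exports %>%); `toy` and its values are hypothetical.

```r
library(dplyr)

# Made-up rows (hypothetical, not from babynames)
toy <- data.frame(
  year = c(2015, 2015, 2015, 2014),
  sex  = c("M", "M", "F", "M"),
  name = c("Ada", "Bo", "Cy", "Di"),
  prop = c(0.02, 0.05, 0.04, 0.03)
)

# Composed functions: read from the inside out
nested <- arrange(
  mutate(
    filter(toy, year == 2015, sex == "M"),
    rank = min_rank(desc(prop))
  ),
  rank
)

# Pipe: read from top to bottom
piped <- toy %>%
  filter(year == 2015, sex == "M") %>%
  mutate(rank = min_rank(desc(prop))) %>%
  arrange(rank)

isTRUE(all.equal(nested, piped))
```

The comparison should come back TRUE, since the pipe only changes how the calls are written, not what they compute.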

25 Shortcut to type %>%

26 Your Turn 6 Use %>% to write a sequence of functions that: 1. Filter babynames to just the girls born in 1900 2. Mutate to make a percent column rounded to a whole number 3. Arrange the results in descending order based on the percent column

27 babynames %>% filter(year == 1900, sex == "F") %>% mutate(percent = round(prop * 100)) %>% arrange(desc(percent))

28 Your Turn 7 Write code to do the following: 1. Trim babynames to just the rows that contain your name and your sex 2. Plot the results as a line graph with year on the x-axis and prop on the y-axis

29 babynames %>% filter(name == "Lance", sex == "M") %>% ggplot() + geom_line(aes(year, prop))

30 What are the most popular names?

31 How should we define popularity? A name is popular if: 1. Sums: a large number of children have the name when you sum across years. 2. Ranks: it consistently ranks among the top names from year to year.

32 Question Do we have enough information to: 1. Calculate the total number of children with each name? 2. Rank names within each year?

33 Deriving information mutate(): create new variables. summarise(): summarise variables. group_by(): group cases.

34 summarise()

35 summarise() Compute table of summaries. babynames %>% summarise(total = sum(n), max = max(n)) [Table: the babynames sample (columns year, sex, name, n, prop) collapses to a single row with the columns total and max. Cell values not recoverable.]

36 Your Turn 8 Use summarise() to compute three statistics about the data: 1. The first (minimum) year in the data set 2. The last (maximum) year in the data set 3. The total number of children represented in the data set

37 babynames %>% summarise(first = min(year), last = max(year), total = sum(n))

38 Your Turn 9 Extract the rows where name == "Khaleesi". Then use summarise() and summary functions to find: 1. The first year Khaleesi appeared in the data 2. The total number of children named Khaleesi

39 babynames %>% filter(name == "Khaleesi") %>% summarise(first = min(year), total = sum(n))

41 n() The number of rows in a data set. babynames %>% summarise(n = n()) [Table: a six-row babynames sample (John, William, James, Lance, Charles as M; John as F) gives n = 6.]

42 n_distinct() The number of distinct values in a variable. babynames %>% summarise(n = n(), nname = n_distinct(name)) [Table: the same six-row sample gives n = 6 and nname = 5.]
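The summary helpers can be combined in a single summarise() call. A minimal sketch, assuming dplyr is installed; the `toy` data frame mirrors the six names on the slide, but its counts are made up.

```r
library(dplyr)

# Six rows, five distinct names; the counts in n are hypothetical
toy <- data.frame(
  sex  = c("M", "M", "M", "M", "M", "F"),
  name = c("John", "William", "James", "Lance", "Charles", "John"),
  n    = c(90, 80, 70, 2, 60, 1)
)

# One output row: a sum, a row count, and a distinct-value count
stats <- toy %>%
  summarise(
    total = sum(n),
    rows  = n(),
    nname = n_distinct(name)
  )
stats
```

The row count is stored as rows rather than n here so that sum(n) cannot accidentally pick up a freshly created n column instead of the original one.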

43 group_by()

44 group_by() Groups cases by common values of one or more columns babynames %>% group_by(sex)

45 group_by() babynames %>% group_by(sex) %>% summarise(total = sum(n)) [Table: a six-row babynames sample (F: Anne, John, Mary; M: John, Mary, Lance) collapses to one row per sex with the columns sex and total. Cell values not recoverable.]

46 group_by() babynames %>% group_by(year, sex) %>% summarise(total = sum(n)) [Table: the same sample collapses to one row per combination of year and sex with the columns year, sex, and total. Cell values not recoverable.]
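Grouped summaries can be explored on a small made-up data frame. This is a sketch, assuming dplyr 1.0.0 or later (for the `.groups` argument); `toy` and its counts are invented for illustration.

```r
library(dplyr)

# Made-up counts (hypothetical, not from babynames)
toy <- data.frame(
  year = c(1900, 1900, 1900, 1901, 1901, 1901),
  sex  = c("F", "F", "M", "F", "M", "M"),
  name = c("Anne", "Mary", "John", "Mary", "John", "Lance"),
  n    = c(10, 20, 30, 40, 50, 60)
)

# One row per sex
by_sex <- toy %>%
  group_by(sex) %>%
  summarise(total = sum(n))

# One row per year/sex combination; .groups = "drop" returns an
# ungrouped result instead of one still grouped by year
by_year_sex <- toy %>%
  group_by(year, sex) %>%
  summarise(total = sum(n), .groups = "drop")

by_sex
by_year_sex
```

Each additional grouping column splits the data further, so the second summary has more rows with smaller totals than the first.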

47 Your Turn 10 Use group_by(), summarise(), and arrange() to display the ten most popular names. Compute popularity as the total number of children of a single gender given a name.

48 babynames %>% group_by(name, sex) %>% summarise(total = sum(n)) %>% arrange(desc(total))

50 babynames %>% group_by(name, sex) %>% summarise(total = sum(n)) %>% arrange(desc(total)) %>% ungroup() %>% slice(1:10) %>% ggplot() + geom_col(aes(fct_reorder(name, desc(total)), total / 1000000, fill = sex)) + theme_bw() + scale_fill_brewer() + labs(x = "name", y = "total (in millions)")

51 Your Turn 11 Use grouping to calculate and then plot the number of children born each year over time. Do not worry about changing the theme, color scheme, labels, etc.

52 babynames %>% group_by(year) %>% summarise(n_children = sum(n)) %>% ggplot() + geom_line(aes(year, n_children)) How does this affect our measure of popularity?

53 Sources


. Sheet - Sheet. Unhide Split Freeze. Sheet (book) - Sheet-book - Sheet{book} - Sheet[book] - Arrange- Freeze- Split - Unfreeze - .

. Sheet - Sheet. Unhide Split Freeze. Sheet (book) - Sheet-book - Sheet{book} - Sheet[book] - Arrange- Freeze- Split - Unfreeze - . 101 Excel 2007 (Workbook) : :. Sheet Workbook. Sheet Delete. Sheet. Unhide Split Freeze.1.2.3.4.5.6 Sheet.7 Sheet-book - Sheet (book) - Sheet{book} - Sheet[book] - Split - Unfreeze -.8 Arrange - Unhide

More information

Prob and Stats, Sep 4

Prob and Stats, Sep 4 Prob and Stats, Sep 4 Variations on the Frequency Histogram Book Sections: N/A Essential Questions: What are the methods for displaying data, and how can I build them? What are variations of the frequency

More information

0001 Understand the structure of numeration systems and multiple representations of numbers. Example: Factor 30 into prime factors.

0001 Understand the structure of numeration systems and multiple representations of numbers. Example: Factor 30 into prime factors. NUMBER SENSE AND OPERATIONS 0001 Understand the structure of numeration systems and multiple representations of numbers. Prime numbers are numbers that can only be factored into 1 and the number itself.

More information

Stat Wk 3. Stat 342 Notes. Week 3, Page 1 / 71

Stat Wk 3. Stat 342 Notes. Week 3, Page 1 / 71 Stat 342 - Wk 3 What is SQL Proc SQL 'Select' command and 'from' clause 'group by' clause 'order by' clause 'where' clause 'create table' command 'inner join' (as time permits) Stat 342 Notes. Week 3,

More information

Announcements. Lab Friday, 1-2:30 and 3-4:30 in Boot your laptop and start Forte, if you brought your laptop

Announcements. Lab Friday, 1-2:30 and 3-4:30 in Boot your laptop and start Forte, if you brought your laptop Announcements Lab Friday, 1-2:30 and 3-4:30 in 26-152 Boot your laptop and start Forte, if you brought your laptop Create an empty file called Lecture4 and create an empty main() method in a class: 1.00

More information

A Whistle-Stop Tour of the Tidyverse

A Whistle-Stop Tour of the Tidyverse A Whistle-Stop Tour of the Tidyverse Aimee Gott Senior Consultant agott@mango-solutions.com @aimeegott_r In This Workshop You will learn What the tidyverse is & why bother using it What tools are available

More information

Matura 2012 Written exam in Mathematics

Matura 2012 Written exam in Mathematics Written exam in 4LW bil. e. Duration : Approved Materials: Remarks : 4 hours Calculator and handbook. (TI-89, TI Voyage 200, TI N-spire and a non-cas calculator) Formula Sheet (in English) and Fundamentum

More information

Test Bank for Database Processing Fundamentals Design and Implementation 13th Edition by Kroenke

Test Bank for Database Processing Fundamentals Design and Implementation 13th Edition by Kroenke Test Bank for Database Processing Fundamentals Design and Implementation 13th Edition by Kroenke Link full download: https://testbankservice.com/download/test-bank-fordatabase-processing-fundamentals-design-and-implementation-13th-edition-bykroenke

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 2 Summarizing and Graphing Data 2-1 Overview 2-2 Frequency Distributions 2-3 Histograms

More information

Data Classes. Introduction to R for Public Health Researchers

Data Classes. Introduction to R for Public Health Researchers Data Classes Introduction to R for Public Health Researchers Data Types: One dimensional types ( vectors ): - Character: strings or individual characters, quoted - Numeric: any real number(s) - Integer:

More information

Data visualization with ggplot2

Data visualization with ggplot2 Data visualization with ggplot2 Visualizing data in R with the ggplot2 package Authors: Mateusz Kuzak, Diana Marek, Hedi Peterson, Dmytro Fishman Disclaimer We will be using the functions in the ggplot2

More information

03 - Intro to graphics (with ggplot2)

03 - Intro to graphics (with ggplot2) 3 - Intro to graphics (with ggplot2) ST 597 Spring 217 University of Alabama 3-dataviz.pdf Contents 1 Intro to R Graphics 2 1.1 Graphics Packages................................ 2 1.2 Base Graphics...................................

More information

DSC 201: Data Analysis & Visualization

DSC 201: Data Analysis & Visualization DSC 201: Data Analysis & Visualization Data Aggregation & Time Series Dr. David Koop Tidy Data: Baby Names Example Baby Names, Social Security Administration Popularity in 2016 Rank Male name Female name

More information

Find-A-Code Finding Codes Table of Contents

Find-A-Code Finding Codes Table of Contents Find-A-Code Finding Codes Table of Contents General Introduction...2 Using Find-A-Code Search...3 Using Click-A-Dex...7 Using Build-A-Code...9 Using Browse-A-Code...11 Using Cross-A-Code...14 General Introduction

More information

MyCodingTools Finding Codes Table of Contents

MyCodingTools Finding Codes Table of Contents MyCodingTools Finding Codes Table of Contents General Introduction...2 Using MyCodingTools Search...3 Using Click-A-Dex...7 Using Build-A-Code...9 Using Browse-A-Code...12 Using Cross-A-Code...15 General

More information

AND NUMERICAL SUMMARIES. Chapter 2

AND NUMERICAL SUMMARIES. Chapter 2 EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES Chapter 2 2.1 What Are the Types of Data? 2.1 Objectives www.managementscientist.org 1. Know the definitions of a. Variable b. Categorical versus quantitative

More information

Sales Presentation for Matt s Mega Mart. Objectives. Steps: By the end of this lesson, you will be able to:

Sales Presentation for Matt s Mega Mart. Objectives. Steps: By the end of this lesson, you will be able to: Sales Presentation for Matt s Mega Mart Objectives By the end of this lesson, you will be able to: Apply Theme to presentation Export Word outline to PowerPoint Create pivot charts Modify pivot charts

More information

Assignment 2: Processing New York City Open Data

Assignment 2: Processing New York City Open Data Assignment 2: Processing New York City Open Data Due date: Sept. 28, 11:59PM EST. 1 Summary This assignment is designed to expose you to the use of open data. Wikipedia has a good description of open data:

More information

Franklin Math Bowl 2008 Group Problem Solving Test Grade 6

Franklin Math Bowl 2008 Group Problem Solving Test Grade 6 Group Problem Solving Test Grade 6 1. The fraction 32 17 can be rewritten by division in the form 1 p + q 1 + r Find the values of p, q, and r. 2. Robert has 48 inches of heavy gauge wire. He decided to

More information

Linkage analysis with paramlink Session I: Introduction and pedigree drawing

Linkage analysis with paramlink Session I: Introduction and pedigree drawing Linkage analysis with paramlink Session I: Introduction and pedigree drawing In this session we will introduce R, and in particular the package paramlink. This package provides a complete environment for

More information

Statistical transformations

Statistical transformations Statistical transformations Next, let s take a look at a bar chart. Bar charts seem simple, but they are interesting because they reveal something subtle about plots. Consider a basic bar chart, as drawn

More information

Chapter 2 - Graphical Summaries of Data

Chapter 2 - Graphical Summaries of Data Chapter 2 - Graphical Summaries of Data Data recorded in the sequence in which they are collected and before they are processed or ranked are called raw data. Raw data is often difficult to make sense

More information

Know how to use fractions to describe part of something Write an improper fraction as a mixed number Write a mixed number as an improper fraction

Know how to use fractions to describe part of something Write an improper fraction as a mixed number Write a mixed number as an improper fraction . Fractions Know how to use fractions to describe part of something Write an improper fraction as a mixed number Write a mixed number as an improper fraction Key words fraction denominator numerator proper

More information