Transform Data! The Basics Part I continued!
- Curtis Edwards
- 5 years ago
1 Transform Data! The Basics Part I continued!
2 arrange()
3 arrange() Order rows from smallest to largest values. arrange(.data, ...) where .data is the data frame to transform and ... is one or more columns to order by (additional columns will be used as tie breakers).
4 Common syntax: each function takes a data frame as the first argument and returns a data frame. In arrange(.data, ...), arrange is the dplyr function, .data is the data frame to transform, and ... stands for function-specific arguments.
5 arrange() Order rows from smallest to largest values. arrange(babynames, n) [Table: five babynames rows (columns year, sex, name, n, prop) before and after sorting. Before: John, William, James, Lance, Charles. After sorting by n: Lance, Charles, James, William, John.]
6 Your Turn 3 Arrange babynames by n. Add prop as a second (tie breaking) variable to arrange on. Can you tell what the smallest value of n is? How does adding prop affect the arrangement?
7 arrange(babynames, n)
arrange(babynames, n, prop)
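The tie-breaking behaviour asked about in Your Turn 3 can be sketched on a tiny data frame. This is only an illustration: the `toy` table and its values are made up (they are not real babynames values), and the sketch assumes the dplyr package is installed.

```r
library(dplyr)

# Made-up toy data (NOT real babynames values); Bob and Cy tie on n.
toy <- data.frame(
  name = c("Ann", "Bob", "Cy", "Dee"),
  n    = c(7, 5, 5, 9),
  prop = c(0.07, 0.06, 0.04, 0.09),
  stringsAsFactors = FALSE
)

# Sort by n only: the tie between Bob and Cy is not resolved by n itself.
by_n <- arrange(toy, n)

# Sort by n, then by prop: prop decides the order within the tie.
by_n_prop <- arrange(toy, n, prop)

by_n_prop$name  # "Cy" "Bob" "Ann" "Dee"
```

Adding `prop` changes nothing for rows whose `n` values already differ; it only reorders rows that share the same `n`.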
8 Helper function desc(): change ordering to go from largest to smallest. arrange(babynames, desc(n)) [Table: the same five babynames rows; sorted by desc(n) they run John, William, James, Charles, Lance.]
9 Your Turn 4 Use desc() to find the names with the highest prop. Then, use desc() to find the names with the highest n.
10 arrange(babynames, desc(prop))
arrange(babynames, desc(n))
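As a small illustration of desc(), the sketch below sorts a made-up three-row table from largest to smallest n. The `toy` data frame is hypothetical and dplyr is assumed to be installed.

```r
library(dplyr)

# Made-up toy data (NOT real babynames values).
toy <- data.frame(
  name = c("Ann", "Bob", "Cy"),
  n    = c(7, 5, 9),
  stringsAsFactors = FALSE
)

# desc() reverses the sort order of the column it wraps.
largest_first <- arrange(toy, desc(n))

largest_first$name  # "Cy" "Ann" "Bob"
```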
11 mutate()
12 mutate() Create new columns. mutate(.data, ...) where .data is the data frame to transform and ... is one or more new columns to create.
13 mutate() Create new columns. mutate(babynames, percent = round(prop * 100, 2)) [Table: the result keeps year, sex, name, n, prop and gains a percent column.]
14 mutate() Create new columns. mutate(babynames, percent = round(prop * 100, 2), nper = round(percent)) [Table: the result gains percent and nper columns; nper is computed from the newly created percent column.]
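The point of slide 14, that a column created in mutate() can be used by a later argument of the same call, can be checked on a tiny example. The `toy` values below are made up for illustration, and dplyr is assumed to be installed.

```r
library(dplyr)

# Made-up toy data (NOT real babynames values).
toy <- data.frame(
  name = c("Ann", "Bob"),
  prop = c(0.08123, 0.00456),
  stringsAsFactors = FALSE
)

# percent is created first; nper then reuses it within the same call.
out <- mutate(
  toy,
  percent = round(prop * 100, 2),
  nper    = round(percent)
)

out$percent  # 8.12 0.46
out$nper     # 8 0
```

The original `toy` data frame is left unchanged; mutate() returns a new data frame with the extra columns.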
16 Vectorized function min_rank(): a popular ranking function (ties share the lowest rank).
min_rank(c(50, 100, 100, 1000)) # [1] 1 2 2 4
min_rank(desc(c(50, 100, 100, 1000))) # [1] 4 2 2 1
17 Your Turn 5 Use min_rank() and mutate() to rank each row in babynames from largest prop to lowest prop
18 mutate(babynames, rank = min_rank(desc(prop)))
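To see how ties behave when min_rank() is combined with mutate(), here is a minimal sketch on made-up data in which two rows share the highest prop. The `toy` table is hypothetical and dplyr is assumed to be installed.

```r
library(dplyr)

# Made-up toy data (NOT real babynames values); Bob and Cy tie on prop.
toy <- data.frame(
  name = c("Ann", "Bob", "Cy", "Dee"),
  prop = c(0.02, 0.05, 0.05, 0.01),
  stringsAsFactors = FALSE
)

# desc() makes the largest prop receive rank 1; tied rows share the lowest rank.
ranked <- mutate(toy, rank = min_rank(desc(prop)))

ranked$rank  # 3 1 1 4
```

Because the two tied rows both take rank 1, rank 2 is skipped and the next row gets rank 3.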
19 %>%
20 Multiple steps (composed functions) arrange(mutate(filter(babynames, year == 2015, sex == "M"), rank = min_rank(desc(prop))), rank) 1. Filter babynames to just boys born in 2015 2. Rank the names by proportion so that higher proportions have lower rank 3. Arrange the names by rank
21-22 Multiple steps (intermediate data frames)
boys_2015 <- filter(babynames, year == 2015, sex == "M")
boys_2015 <- mutate(boys_2015, rank = min_rank(desc(prop)))
boys_2015 <- arrange(boys_2015, rank)
boys_2015
23 The pipe operator %>% passes the result on its left into the first argument of the function on its right. So, these two lines do the same thing. Try it!
filter(babynames, n == 99680)
babynames %>% filter(n == 99680)
24 Multiple steps (pipe operator) babynames %>% filter(year == 2015, sex == "M") %>% mutate(rank = min_rank(desc(prop))) %>% arrange(rank) 1. Allows us to eliminate redundant code (assigning to the same data frame over and over) and/or unwanted intermediate data frames 2. Allows us to write code in the same way we think about the problem
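The claim that the composed form and the piped form do the same thing can be checked directly. The sketch below runs both on a made-up four-row table and compares the results; the `toy` data and its names are hypothetical, and dplyr is assumed to be installed.

```r
library(dplyr)

# Made-up toy data (NOT real babynames values).
toy <- data.frame(
  year = c(2015, 2015, 2015, 2014),
  sex  = c("M", "M", "F", "M"),
  name = c("Al", "Bo", "Cat", "Al"),
  prop = c(0.02, 0.05, 0.04, 0.03),
  stringsAsFactors = FALSE
)

# Composed functions: read from the inside out.
nested <- arrange(
  mutate(
    filter(toy, year == 2015, sex == "M"),
    rank = min_rank(desc(prop))
  ),
  rank
)

# Pipe operator: read from top to bottom.
piped <- toy %>%
  filter(year == 2015, sex == "M") %>%
  mutate(rank = min_rank(desc(prop))) %>%
  arrange(rank)

# Both approaches should produce the same data frame.
stopifnot(isTRUE(all.equal(nested, piped)))

piped$name  # "Bo" "Al"
```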
25 Shortcut to type %>% in RStudio: Cmd + Shift + M (Mac) or Ctrl + Shift + M (Windows)
26 Your Turn 6 Use %>% to write a sequence of functions that: 1. Filter babynames to just the girls born in 1977 2. Mutate to make a percent column rounded to a whole number 3. Arrange the results so that the most popular names, based on the percent column, appear first.
27 babynames %>% filter(year == 1977, sex == "F") %>% mutate(percent = round(prop * 100)) %>% arrange(desc(percent))
28 Your Turn 7 Write code to do the following: 1. Trim babynames to just the rows that contain your name and your sex 2. Plot the results as a line graph with year on the x-axis and prop on the y-axis
29 babynames %>% filter(name == "Lance", sex == "M") %>% ggplot() + geom_line(aes(year, prop))
30 What are the most popular names?
31 How should we define popularity? A name is popular if: 1. Sums: a large number of children have the name when you sum across years 2. Ranks: it consistently ranks among the top names from year to year
32 Question Do we have the right tools to: 1. Calculate the total number of children with each name? 2. Rank names within each year?
33 Deriving information: mutate() creates new variables; summarise() summarises variables; group_by() groups cases.
34 summarise()
35 summarise() Compute table of summaries. babynames %>% summarise(total = sum(n), max = max(n)) [Table: the babynames rows collapse to a single row with columns total and max.]
36 Your Turn 8 Use summarise() to compute three statistics about the data: 1. The first (minimum) year in the data set 2. The last (maximum) year in the data set 3. The total number of children represented in the data set
37 babynames %>% summarise(first = min(year), last = max(year), total = sum(n))
38 Your Turn 9 Extract the rows where name == "Khaleesi". Then use summarise() and summary functions to find: 1. The first year Khaleesi appeared in the data 2. The total number of children named Khaleesi
39 babynames %>% filter(name == "Khaleesi") %>% summarise(first = min(year), total = sum(n))
41 n() The number of rows in a data set. babynames %>% summarise(n = n()) [Table: for a six-row example (John, William, James, Lance, Charles as M, plus John as F) the result is n = 6.]
42 n_distinct() The number of distinct values in a variable. babynames %>% summarise(n = n(), nname = n_distinct(name)) [Table: for the same six-row example the result is n = 6, nname = 5.]
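The summary functions from the last few slides can be combined in one summarise() call. The sketch below mirrors the six-row example (the names come from the slide, but the n values are made up), and it assumes dplyr is installed. The row count is stored as `rows` rather than `n` here, an illustrative choice so that the new column does not shadow the existing n column in later arguments.

```r
library(dplyr)

# Names follow the slide's six-row example; the n values are made up.
toy <- data.frame(
  sex  = c("M", "M", "M", "M", "M", "F"),
  name = c("John", "William", "James", "Lance", "Charles", "John"),
  n    = c(90, 80, 70, 5, 40, 2),
  stringsAsFactors = FALSE
)

out <- toy %>%
  summarise(
    rows  = n(),              # number of rows
    nname = n_distinct(name), # number of distinct names
    total = sum(n),           # sum of the n column
    max   = max(n)            # largest value in the n column
  )

out  # one row: rows = 6, nname = 5, total = 287, max = 90
```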
43 group_by()
44 group_by() Groups cases by common values of one or more columns babynames %>% group_by(sex)
45 group_by() babynames %>% group_by(sex) %>% summarise(total = sum(n)) [Table: a six-row example (Anne, John, Mary as F; John, Mary, Lance as M) collapses to one row per sex, with columns sex and total.]
46 group_by() babynames %>% group_by(year, sex) %>% summarise(total = sum(n)) [Table: the same kind of example collapses to one row per combination of year and sex, with columns year, sex, and total.]
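A minimal sketch of grouped summaries, using a made-up six-row table so the totals can be checked by hand. The `toy` data is hypothetical and dplyr is assumed to be installed; ungroup() is added at the end of the two-variable case so the result carries no leftover grouping.

```r
library(dplyr)

# Made-up toy data (NOT real babynames values).
toy <- data.frame(
  year = c(1899, 1899, 1899, 1900, 1900, 1900),
  sex  = c("F", "M", "M", "F", "F", "M"),
  n    = c(10, 20, 30, 40, 50, 60),
  stringsAsFactors = FALSE
)

# One row per sex.
by_sex <- toy %>%
  group_by(sex) %>%
  summarise(total = sum(n))

# One row per combination of year and sex.
by_year_sex <- toy %>%
  group_by(year, sex) %>%
  summarise(total = sum(n)) %>%
  ungroup()

by_sex$total       # 100 110  (F, M)
by_year_sex$total  # 10 50 90 60  (1899 F, 1899 M, 1900 F, 1900 M)
```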
47 Your Turn 10 Use group_by(), summarise(), and arrange() to display the ten most popular names. Compute popularity as the total number of children of a single gender given a name.
48 babynames %>% group_by(name, sex) %>% summarise(total = sum(n)) %>% arrange(desc(total))
50 babynames %>%
  group_by(name, sex) %>%
  summarise(total = sum(n)) %>%
  arrange(desc(total)) %>%
  ungroup() %>%
  slice(1:10) %>%
  ggplot() +
  geom_col(aes(fct_reorder(name, desc(total)), total / 1000000, fill = sex)) +
  theme_bw() +
  scale_fill_brewer() +
  labs(x = "name", y = "total (in millions)")