Goodness-of-Fit Testing T.Scofield Nov. 16, 2016


We do goodness-of-fit testing with a single categorical variable, to see if the distribution of its sampled values fits a specified probability model. The probability model is stated in the null hypothesis. As the presence of a null hypothesis implies, goodness-of-fit tests are hypothesis tests. For each of the different possible values of the categorical variable, the null hypothesis should assert a population proportion.

Example: A die can produce any of 6 rolls. In the fair die model, we expect these rolls to be equally likely, with each occurring (over the long haul) one-sixth of the time. To see if this model is a good fit for a sample of rolls taken from a particular die, we would presume this null hypothesis:

$H_0: p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = \frac{1}{6}$.

Note that the asserted probabilities sum to 1, a general principle in goodness-of-fit testing. Sample data would consist of n rolls of a die. Assuming the null hypothesis holds, we would expect to see $n \cdot \frac{1}{6}$ rolls which are ones, n/6 rolls which are twos, and so on. That is, the expected count of each different value would be n/6.

Example: In Mendelian genetics, the law of independent assortment asserts that, for dihybrid crosses, combinations of two traits will occur with frequencies in a 9:3:3:1 ratio. A test to see if this Mendelian model applies in the case of two traits would have null hypothesis

$H_0: p_1 = \frac{9}{16},\ p_2 = \frac{3}{16},\ p_3 = \frac{3}{16},\ p_4 = \frac{1}{16}$.

Mendel's work was on peas, and the two traits often used in descriptions of this work are color (Yellow vs. green) and texture (Smooth vs. wrinkled). If the model asserted in the null hypothesis holds then, when observing n peas, one would expect to see $\frac{9}{16}n$ which are Yellow and Smooth, $\frac{3}{16}n$ which are Yellow and wrinkled, $\frac{3}{16}n$ which are green and Smooth, and $\frac{1}{16}n$ which are green and wrinkled.

The sample size and null hypothesis together give us expected counts $E_i$.
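As a small illustration of how expected counts follow from the sample size and the null hypothesis, here is a sketch in base R for the Mendelian model. The sample size n = 160 and the names mendelprobs and expected are assumptions made for this illustration only; they are not taken from Mendel's data or from the notes.

```r
# Hypothetical sample size, chosen only so the counts come out whole
n <- 160

# Proportions asserted by the null hypothesis (the 9:3:3:1 ratio)
mendelprobs <- c(YellowSmooth = 9, YellowWrinkled = 3,
                 GreenSmooth = 3, GreenWrinkled = 1) / 16

# The asserted proportions must sum to 1
stopifnot(isTRUE(all.equal(sum(mendelprobs), 1)))

# Expected counts E_i = n * p_i
expected <- n * mendelprobs
expected
```

With n = 160 the expected counts are 90, 30, 30 and 10.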
The data collected gives us observed counts $O_i$ by way of a frequency table. We use the chi-square (or $\chi^2$) statistic

$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$

as an overall measure of the discrepancy between the frequencies we expected and what we actually observed. This number, which cannot be negative, is zero when observed frequencies match expected ones exactly, and grows as observed counts become increasingly different from expected ones.

It would be convenient to have a function in RStudio to calculate chi-square statistics for us, much as the mean() and sd() functions calculate $\bar{x}$ and s from sample data. The initiation cell below, which you should execute, provides two commands for calculating $\chi^2$. There are two versions: one which is useful when we have the raw data (i.e., one value of the categorical variable for every case), and ftchisqstat(), useful when we have a frequency table of observed counts in the sample. The name of the raw-data version did not survive in this copy of the notes, so the stand-in name rawchisqstat() is used for it throughout.

    require(mosaic)   # provides tally(), do(), resample(), histogram(), plotDist()

    # Chi-square statistic from raw data.
    # NOTE: "rawchisqstat" is a stand-in name; the original name is missing.
    rawchisqstat <- function(datavector, probs) {
      sampsize <- length(datavector)
      observedcounts <- tally(datavector)
      expectedcounts <- probs * sampsize
      chisqstat <- sum( (observedcounts - expectedcounts)^2 / expectedcounts )
      return ( chisqstat )
    }

    # Chi-square statistic from a frequency table of observed counts.
    ftchisqstat <- function(obscounts, probs) {
      sampsize <- sum(obscounts)
      expectedcounts <- probs * sampsize
      chisqstat <- sum( (obscounts - expectedcounts)^2 / expectedcounts )
      return ( chisqstat )
    }

Goodness-of-fit via randomization sampling

An example involving answers for multiple choice tests

Now that we have the calculation that leads to a test statistic, we turn our attention to producing a P-value. Chapter 4 showed us how to produce randomization distributions for other hypothesis test settings. Here we consider how we might produce a simulated distribution for $\chi^2$ values under the null hypothesis. The key is to draw sample data like ours (i.e., with the exact same sample size) with replacement from a bag that is stocked according to the specifications in the null hypothesis.

Consider the Lock data set APMultipleChoice. This is raw data, where each row gives the letter which was the correct answer to a particular multiple choice question. The values of the Answer variable are gathered and stored in optionslist using the command

    optionslist <- names(tally(APMultipleChoice$Answer))

We look at optionslist

    optionslist
    ## [1] "A" "B" "C" "D" "E"

and see that the questions had 5 possible letters. The observed counts, telling us how often each letter was the correct answer, appear in the frequency table:

    tally(APMultipleChoice$Answer)
    ## X
    ## A B C D E

A total of

    sum(tally(APMultipleChoice$Answer))
    ## [1] 400

multiple choice questions appear in our sample. A teacher might consider a good multiple choice test to be one in which each letter is equally likely to be the correct one. This good test model naturally gives rise to the null hypothesis

$H_0: p_A = p_B = p_C = p_D = p_E = \frac{1}{5}$.

This yields a list of expected frequencies

    hypprobs = rep(1,5) / 5
    400 * hypprobs
    ## [1] 80 80 80 80 80

and the corresponding test ($\chi^2$) statistic is computed by the rawchisqstat() command defined above:

    myteststat <- rawchisqstat(APMultipleChoice$Answer, hypprobs)
    myteststat
    ## [1] 3.425

A randomization sample (which takes our null hypothesis and sample size into account) could be produced using the command

    resample(optionslist, size=400, prob=hypprobs)

The corresponding randomization statistic would be the $\chi^2$ statistic of this randomization sample:

    rawchisqstat(resample(optionslist, size=400, prob=hypprobs), hypprobs)

To get a randomization distribution, we want to gather many randomization statistics:

    manycsstats <- do(1000) * rawchisqstat(resample(optionslist, size=400, prob=hypprobs), hypprobs)
    head(manycsstats)
    histogram(~ rawchisqstat, data=manycsstats, groups = rawchisqstat >= myteststat)

Since $\chi^2$ statistics get larger as the observed counts differ more and more from the expected ones, a goodness-of-fit test is always a 1-tailed (upper-tailed) test, one concerned with the area (probability) corresponding to values as high or higher than the test statistic ($\chi^2$ from the actual data). So, the approximate P-value is

    nrow(subset(manycsstats, rawchisqstat >= myteststat)) / 1000   # gives approx. P-value
    ## [1] 0.49

An example involving the breakdown of days on which babies are born

There is another package, abd, one can load to gain access to the DayOfBirth data frame.

    require(abd)
    ## Loading required package: abd
    ## Loading required package: nlme
    ##
    ## Attaching package: 'nlme'
    ## The following object is masked from 'package:dplyr':
    ##
    ##     collapse
    ## Loading required package: grid

    DayOfBirth
    ##         day births
    ## 1    Sunday     33
    ## 2    Monday     41
    ## 3   Tuesday     63
    ## 4 Wednesday     63
    ## 5  Thursday     47
    ## 6    Friday     56
    ## 7  Saturday     47

This is not the raw data, which would have contained one row per birth. The variable in question is the day on which the birth occurred. The data comes to us already summarized in a frequency table. A reasonable null hypothesis is that it is equally likely that a birth would fall on any of the days of the week:

$H_0: p_{Su} = p_M = p_{Tu} = p_W = p_{Th} = p_F = p_{Sa} = \frac{1}{7}$.

We can use the other chi-square computing function defined above to obtain a test statistic:

    hypprobs = rep(1, 7) / 7
    ftchisqstat(DayOfBirth$births, hypprobs)
    ## [1] 15.24

Since there are 350 births in the sample, and the values (Sunday through Saturday) are stored in the day column, we may produce a randomization distribution via

    manychisqs = do(1000) * rawchisqstat(resample(DayOfBirth$day, prob=rep(1,7)/7, size=350), hypprobs)
    histogram(~ rawchisqstat, data=manychisqs)
    nrow(subset(manychisqs, rawchisqstat >= 15.24)) / 1000   # gives approx. P-value

This P-value is small enough to be significant at the 5% level. In rejecting $H_0$, what we can say is that for at least one of the days, the likelihood that a birth occurs on that day is something other than 1/7.

Obtaining a P-value using a chi-square distribution

We should still have randomization distributions stored in manycsstats (good multiple choice test model) and in manychisqs (births equally likely on all days of the week model). When we viewed these distributions above, they did not look normal. They do, however, have shapes which are well-approximated by density curves that come from a known distributional family: the chi-square distributions. Specifically, the null distribution (for $\chi^2$ statistics) in the case of good multiple choice tests with optional answers A through E follows closely a $\chi^2$ distribution with 4 degrees of freedom:

    histogram(~ rawchisqstat, data=manycsstats)
    plotDist("chisq", df=4, add=TRUE)

This means we could have obtained the approximate P-value via the command

    1 - pchisq(3.425, df=4)

instead of obtaining it from a randomization distribution. Similarly, the null distribution for $\chi^2$ statistics under the births model is a close match to the chi-square distribution with df = 6:

    histogram(~ rawchisqstat, data=manychisqs)
    plotDist("chisq", df=6, add=TRUE)

We obtain the approximate P-value, again an upper-tail area, via

    1 - pchisq(15.24, df=6)

Some general remarks:

• When our (single) categorical variable has 5 different possible values, the number of degrees of freedom in the approximating chi-square distribution is 4. When it is a categorical variable with 7 values, we have df = 6. The best approximating chi-square distribution is the one with 1 fewer df than the number of values/categories seen in the categorical variable.

• Even though the previous bullet point indicates the best chi-square distribution (the best choice of df) to use in approximating P-values, it is not necessarily the case that a chi-square distribution gives good approximations. As in the past, there is a rule of thumb (more than one of them, in fact). The Locks indicate that, if you inspect the expected counts and find them all to be at least 5, then P-values obtained from a chi-square distribution (i.e., using the pchisq() command) give reasonably acceptable approximations.
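The births example can be cross-checked in base R alone, without the mosaic helpers. The sketch below is one way to do it under stated assumptions: the variable names (births, expected, chisq, simstats, randp) are made up for this illustration, rmultinom() is swapped in for the resampling step used in the notes, and the seed and the 10000 simulated samples are arbitrary choices. It checks the rule of thumb, computes the statistic and its chi-square P-value, compares these with R's built-in chisq.test(), and estimates the P-value by simulation.

```r
# Observed counts from the DayOfBirth frequency table
births <- c(Sunday = 33, Monday = 41, Tuesday = 63, Wednesday = 63,
            Thursday = 47, Friday = 56, Saturday = 47)
hypprobs <- rep(1, 7) / 7

# Expected counts under H0, and the rule-of-thumb check (all at least 5)
expected <- sum(births) * hypprobs
all(expected >= 5)

# Chi-square statistic, degrees of freedom, and upper-tail P-value
chisq  <- sum((births - expected)^2 / expected)
df     <- length(births) - 1
pvalue <- 1 - pchisq(chisq, df = df)

# R's built-in goodness-of-fit test, for comparison
builtin <- chisq.test(births, p = hypprobs)

# Simulated null distribution: each column of sims is one sample of 350
# births drawn under H0 (seed and number of samples are arbitrary)
set.seed(1)
sims     <- rmultinom(10000, size = sum(births), prob = hypprobs)
simstats <- colSums((sims - expected)^2 / expected)
randp    <- mean(simstats >= chisq)

c(chisq = chisq, df = df, pvalue = pvalue, randp = randp)
```

The statistic comes out to 15.24, matching the value used above, and the chi-square P-value is roughly 0.018; the simulated P-value should land close to it, since every expected count here is 50.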

7.2: Chi-Square Test for Association T.Scofield Nov. 17, 2016

7.2: Chi-Square Test for Association T.Scofield Nov. 17, 2016 72: Chi-Square Test for Association TScofield Nov 17, 2016 The goal of this section is to provide means for investigating whether there is an association between two categorical variables Before proceeding,

More information

Descriptive Statistics, Standard Deviation and Standard Error

Descriptive Statistics, Standard Deviation and Standard Error AP Biology Calculations: Descriptive Statistics, Standard Deviation and Standard Error SBI4UP The Scientific Method & Experimental Design Scientific method is used to explore observations and answer questions.

More information

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown Z-TEST / Z-STATISTIC: used to test hypotheses about µ when the population standard deviation is known and population distribution is normal or sample size is large T-TEST / T-STATISTIC: used to test hypotheses

More information

Chapter 2: The Normal Distribution

Chapter 2: The Normal Distribution Chapter 2: The Normal Distribution 2.1 Density Curves and the Normal Distributions 2.2 Standard Normal Calculations 1 2 Histogram for Strength of Yarn Bobbins 15.60 16.10 16.60 17.10 17.60 18.10 18.60

More information

Introduction to Hypothesis Testing T.Scofield 10/03/2016

Introduction to Hypothesis Testing T.Scofield 10/03/2016 Introduction to Hypothesis Testing T.Scofield 10/03/016 Hypothesis Testing: the steps 1. Identify the research question, along with relevant variables.. Formulate hypotheses (null and alternative) appropriate

More information

The Normal Distribution & z-scores

The Normal Distribution & z-scores & z-scores Distributions: Who needs them? Why are we interested in distributions? Important link between distributions and probabilities of events If we know the distribution of a set of events, then we

More information

The Normal Distribution & z-scores

The Normal Distribution & z-scores & z-scores Distributions: Who needs them? Why are we interested in distributions? Important link between distributions and probabilities of events If we know the distribution of a set of events, then we

More information

Using Variables to Write Pattern Rules

Using Variables to Write Pattern Rules Using Variables to Write Pattern Rules Goal Use numbers and variables to represent mathematical relationships. 1. a) What stays the same and what changes in the pattern below? b) Describe the pattern rule

More information

Ms Nurazrin Jupri. Frequency Distributions

Ms Nurazrin Jupri. Frequency Distributions Frequency Distributions Frequency Distributions After collecting data, the first task for a researcher is to organize and simplify the data so that it is possible to get a general overview of the results.

More information

Unit 5: Estimating with Confidence

Unit 5: Estimating with Confidence Unit 5: Estimating with Confidence Section 8.3 The Practice of Statistics, 4 th edition For AP* STARNES, YATES, MOORE Unit 5 Estimating with Confidence 8.1 8.2 8.3 Confidence Intervals: The Basics Estimating

More information

Frequency Distributions

Frequency Distributions Displaying Data Frequency Distributions After collecting data, the first task for a researcher is to organize and summarize the data so that it is possible to get a general overview of the results. Remember,

More information

Objectives/Outcomes. Introduction: If we have a set "collection" of fruits : Banana, Apple and Grapes.

Objectives/Outcomes. Introduction: If we have a set collection of fruits : Banana, Apple and Grapes. 1 September 26 September One: Sets Introduction to Sets Define a set Introduction: If we have a set "collection" of fruits : Banana, Apple Grapes. 4 F={,, } Banana is member "an element" of the set F.

More information

The Normal Distribution & z-scores

The Normal Distribution & z-scores & z-scores Distributions: Who needs them? Why are we interested in distributions? Important link between distributions and probabilities of events If we know the distribution of a set of events, then we

More information

3. Probability 51. probability A numerical value between 0 and 1 assigned to an event to indicate how often the event occurs (in the long run).

3. Probability 51. probability A numerical value between 0 and 1 assigned to an event to indicate how often the event occurs (in the long run). 3. Probability 51 3 Probability 3.1 Key Definitions and Ideas random process A repeatable process that has multiple unpredictable potential outcomes. Although we sometimes use language that suggests that

More information

Spatial Patterns Point Pattern Analysis Geographic Patterns in Areal Data

Spatial Patterns Point Pattern Analysis Geographic Patterns in Areal Data Spatial Patterns We will examine methods that are used to analyze patterns in two sorts of spatial data: Point Pattern Analysis - These methods concern themselves with the location information associated

More information

Frequency Tables. Chapter 500. Introduction. Frequency Tables. Types of Categorical Variables. Data Structure. Missing Values

Frequency Tables. Chapter 500. Introduction. Frequency Tables. Types of Categorical Variables. Data Structure. Missing Values Chapter 500 Introduction This procedure produces tables of frequency counts and percentages for categorical and continuous variables. This procedure serves as a summary reporting tool and is often used

More information

Private Swimming Lessons

Private Swimming Lessons Private Swimming Lessons Private Lessons Designed for participants who would like a 1:1 ratio. Participants will receive individual attention to improve their swimming technique and have the convenience

More information

Confidence Intervals. Dennis Sun Data 301

Confidence Intervals. Dennis Sun Data 301 Dennis Sun Data 301 Statistical Inference probability Population / Box Sample / Data statistics The goal of statistics is to infer the unknown population from the sample. We ve already seen one mode of

More information

GRADE 1 SUPPLEMENT. March Calendar Pattern C7.1

GRADE 1 SUPPLEMENT. March Calendar Pattern C7.1 GRADE 1 SUPPLEMENT Set C7 Geometry: Describing 3-D Shapes Calendar Pattern Includes March Calendar Pattern C7.1 Skills & Concepts H identify, name, and describe 3-D shapes in isolation and in everyday

More information

Mendel and His Peas Investigating Monhybrid Crosses Using the Graphing Calculator

Mendel and His Peas Investigating Monhybrid Crosses Using the Graphing Calculator 20 Investigating Monhybrid Crosses Using the Graphing Calculator This activity will use the graphing calculator s random number generator to simulate the production of gametes in a monohybrid cross. The

More information

Grade 8 Common Mathematics Assessment Multiple Choice Answer Sheet Name: Mathematics Teacher: Homeroom: Section A No Calculator Permitted

Grade 8 Common Mathematics Assessment Multiple Choice Answer Sheet Name: Mathematics Teacher: Homeroom: Section A No Calculator Permitted Multiple Choice Answer Sheet Name: Mathematics Teacher: Homeroom: Section A No Calculator Permitted Calculator Permitted. A B C D 2. A B C D. A B C D 4. A B C D 5. A B C D 6. A B C D 7. A B C D 8. A B

More information

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 2.1- #

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 2.1- # Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola Chapter 2 Summarizing and Graphing Data 2-1 Review and Preview 2-2 Frequency Distributions 2-3 Histograms

More information

This chapter will show how to organize data and then construct appropriate graphs to represent the data in a concise, easy-to-understand form.

This chapter will show how to organize data and then construct appropriate graphs to represent the data in a concise, easy-to-understand form. CHAPTER 2 Frequency Distributions and Graphs Objectives Organize data using frequency distributions. Represent data in frequency distributions graphically using histograms, frequency polygons, and ogives.

More information

Data 8 Final Review #1

Data 8 Final Review #1 Data 8 Final Review #1 Topics we ll cover: Visualizations Arrays and Table Manipulations Programming constructs (functions, for loops, conditional statements) Chance, Simulation, Sampling and Distributions

More information

Data Mining. 2.4 Data Integration. Fall Instructor: Dr. Masoud Yaghini. Data Integration

Data Mining. 2.4 Data Integration. Fall Instructor: Dr. Masoud Yaghini. Data Integration Data Mining 2.4 Fall 2008 Instructor: Dr. Masoud Yaghini Data integration: Combines data from multiple databases into a coherent store Denormalization tables (often done to improve performance by avoiding

More information

Unit 8 SUPPLEMENT Normal, T, Chi Square, F, and Sums of Normals

Unit 8 SUPPLEMENT Normal, T, Chi Square, F, and Sums of Normals BIOSTATS 540 Fall 017 8. SUPPLEMENT Normal, T, Chi Square, F and Sums of Normals Page 1 of Unit 8 SUPPLEMENT Normal, T, Chi Square, F, and Sums of Normals Topic 1. Normal Distribution.. a. Definition..

More information

Other conditional and loop constructs. Fundamentals of Computer Science Keith Vertanen

Other conditional and loop constructs. Fundamentals of Computer Science Keith Vertanen Other conditional and loop constructs Fundamentals of Computer Science Keith Vertanen Overview Current loop constructs: for, while, do-while New loop constructs Get out of loop early: break Skip rest of

More information

2) familiarize you with a variety of comparative statistics biologists use to evaluate results of experiments;

2) familiarize you with a variety of comparative statistics biologists use to evaluate results of experiments; A. Goals of Exercise Biology 164 Laboratory Using Comparative Statistics in Biology "Statistics" is a mathematical tool for analyzing and making generalizations about a population from a number of individual

More information

AQA Decision 1 Algorithms. Section 1: Communicating an algorithm

AQA Decision 1 Algorithms. Section 1: Communicating an algorithm AQA Decision 1 Algorithms Section 1: Communicating an algorithm Notes and Examples These notes contain subsections on Flow charts Pseudo code Loops in algorithms Programs for the TI-83 graphical calculator

More information

8. MINITAB COMMANDS WEEK-BY-WEEK

8. MINITAB COMMANDS WEEK-BY-WEEK 8. MINITAB COMMANDS WEEK-BY-WEEK In this section of the Study Guide, we give brief information about the Minitab commands that are needed to apply the statistical methods in each week s study. They are

More information

SPSS Basics for Probability Distributions

SPSS Basics for Probability Distributions Built-in Statistical Functions in SPSS Begin by defining some variables in the Variable View of a data file, save this file as Probability_Distributions.sav and save the corresponding output file as Probability_Distributions.spo.

More information

Introductory Applied Statistics: A Variable Approach TI Manual

Introductory Applied Statistics: A Variable Approach TI Manual Introductory Applied Statistics: A Variable Approach TI Manual John Gabrosek and Paul Stephenson Department of Statistics Grand Valley State University Allendale, MI USA Version 1.1 August 2014 2 Copyright

More information

PAF Chapter Prep Section Mathematics Class 6 Worksheets for Intervention Classes

PAF Chapter Prep Section Mathematics Class 6 Worksheets for Intervention Classes The City School PAF Chapter Prep Section Mathematics Class 6 Worksheets for Intervention Classes Topic: Percentage Q1. Convert it into fractions and its lowest term: a) 25% b) 75% c) 37% Q2. Convert the

More information

View a Students Schedule Through Student Services Trigger:

View a Students Schedule Through Student Services Trigger: Department Responsibility/Role File Name Version Document Generation Date 6/10/2007 Date Modified 6/10/2007 Last Changed by Status View a Students Schedule Through Student Services_BUSPROC View a Students

More information

23.2 Normal Distributions

23.2 Normal Distributions 1_ Locker LESSON 23.2 Normal Distributions Common Core Math Standards The student is expected to: S-ID.4 Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate

More information

STA 570 Spring Lecture 5 Tuesday, Feb 1

STA 570 Spring Lecture 5 Tuesday, Feb 1 STA 570 Spring 2011 Lecture 5 Tuesday, Feb 1 Descriptive Statistics Summarizing Univariate Data o Standard Deviation, Empirical Rule, IQR o Boxplots Summarizing Bivariate Data o Contingency Tables o Row

More information

Practical 2: Using Minitab (not assessed, for practice only!)

Practical 2: Using Minitab (not assessed, for practice only!) Practical 2: Using Minitab (not assessed, for practice only!) Instructions 1. Read through the instructions below for Accessing Minitab. 2. Work through all of the exercises on this handout. If you need

More information

Quantitative - One Population

Quantitative - One Population Quantitative - One Population The Quantitative One Population VISA procedures allow the user to perform descriptive and inferential procedures for problems involving one population with quantitative (interval)

More information

TI-83 Users Guide. to accompany. Statistics: Unlocking the Power of Data by Lock, Lock, Lock, Lock, and Lock

TI-83 Users Guide. to accompany. Statistics: Unlocking the Power of Data by Lock, Lock, Lock, Lock, and Lock TI-83 Users Guide to accompany by Lock, Lock, Lock, Lock, and Lock TI-83 Users Guide- 1 Getting Started Entering Data Use the STAT menu, then select EDIT and hit Enter. Enter data for a single variable

More information

- 1 - Fig. A5.1 Missing value analysis dialog box

- 1 - Fig. A5.1 Missing value analysis dialog box WEB APPENDIX Sarstedt, M. & Mooi, E. (2019). A concise guide to market research. The process, data, and methods using SPSS (3 rd ed.). Heidelberg: Springer. Missing Value Analysis and Multiple Imputation

More information

Remotely Test Any Networked Equipment

Remotely Test Any Networked Equipment 1 Remotely Test Any Networked Equipment Universal Test Head Platform includes: Multiple Test Heads Scheduler Resource Balancing Database: Equipment Links Equipment History Test History Test Library Windows

More information

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 3. Chapter 3: Data Preprocessing. Major Tasks in Data Preprocessing

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 3. Chapter 3: Data Preprocessing. Major Tasks in Data Preprocessing Data Mining: Concepts and Techniques (3 rd ed.) Chapter 3 1 Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major Tasks in Data Preprocessing Data Cleaning Data Integration Data

More information

Unit 10: Data Structures CS 101, Fall 2018

Unit 10: Data Structures CS 101, Fall 2018 Unit 10: Data Structures CS 101, Fall 2018 Learning Objectives After completing this unit, you should be able to: Define and give everyday examples of arrays, stacks, queues, and trees. Explain what a

More information

Question. Dinner at the Urquhart House. Data, Statistics, and Spreadsheets. Data. Types of Data. Statistics and Data

Question. Dinner at the Urquhart House. Data, Statistics, and Spreadsheets. Data. Types of Data. Statistics and Data Question What are data and what do they mean to a scientist? Dinner at the Urquhart House Brought to you by the Briggs Multiracial Alliance Sunday night All food provided (probably Chinese) Contact Mimi

More information

Learner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display

Learner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display CURRICULUM MAP TEMPLATE Priority Standards = Approximately 70% Supporting Standards = Approximately 20% Additional Standards = Approximately 10% HONORS PROBABILITY AND STATISTICS Essential Questions &

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 3: Distributions Regression III: Advanced Methods William G. Jacoby Michigan State University Goals of the lecture Examine data in graphical form Graphs for looking at univariate distributions

More information

Two-Stage Least Squares

Two-Stage Least Squares Chapter 316 Two-Stage Least Squares Introduction This procedure calculates the two-stage least squares (2SLS) estimate. This method is used fit models that include instrumental variables. 2SLS includes

More information

Lab 5 - Risk Analysis, Robustness, and Power

Lab 5 - Risk Analysis, Robustness, and Power Type equation here.biology 458 Biometry Lab 5 - Risk Analysis, Robustness, and Power I. Risk Analysis The process of statistical hypothesis testing involves estimating the probability of making errors

More information

Probability and Statistics. Copyright Cengage Learning. All rights reserved.

Probability and Statistics. Copyright Cengage Learning. All rights reserved. Probability and Statistics Copyright Cengage Learning. All rights reserved. 14.6 Descriptive Statistics (Graphical) Copyright Cengage Learning. All rights reserved. Objectives Data in Categories Histograms

More information

STA 313: Topics in Statistics

STA 313: Topics in Statistics Al Nosedal. University of Toronto. Fall 2015 essentially, all models are wrong, but some are useful George E. P. Box (one of the great statistical minds of the 20th century). What is R? R language essentials

More information

1. Estimation equations for strip transect sampling, using notation consistent with that used to

1. Estimation equations for strip transect sampling, using notation consistent with that used to Web-based Supplementary Materials for Line Transect Methods for Plant Surveys by S.T. Buckland, D.L. Borchers, A. Johnston, P.A. Henrys and T.A. Marques Web Appendix A. Introduction In this on-line appendix,

More information

Table of Contents (As covered from textbook)

Table of Contents (As covered from textbook) Table of Contents (As covered from textbook) Ch 1 Data and Decisions Ch 2 Displaying and Describing Categorical Data Ch 3 Displaying and Describing Quantitative Data Ch 4 Correlation and Linear Regression

More information

IQR = number. summary: largest. = 2. Upper half: Q3 =

IQR = number. summary: largest. = 2. Upper half: Q3 = Step by step box plot Height in centimeters of players on the 003 Women s Worldd Cup soccer team. 157 1611 163 163 164 165 165 165 168 168 168 170 170 170 171 173 173 175 180 180 Determine the 5 number

More information

Student Learning Objectives

Student Learning Objectives Student Learning Objectives A. Understand that the overall shape of a distribution of a large number of observations can be summarized by a smooth curve called a density curve. B. Know that an area under

More information

MATH 1070 Introductory Statistics Lecture notes Descriptive Statistics and Graphical Representation

MATH 1070 Introductory Statistics Lecture notes Descriptive Statistics and Graphical Representation MATH 1070 Introductory Statistics Lecture notes Descriptive Statistics and Graphical Representation Objectives: 1. Learn the meaning of descriptive versus inferential statistics 2. Identify bar graphs,

More information

Mean Tests & X 2 Parametric vs Nonparametric Errors Selection of a Statistical Test SW242

Mean Tests & X 2 Parametric vs Nonparametric Errors Selection of a Statistical Test SW242 Mean Tests & X 2 Parametric vs Nonparametric Errors Selection of a Statistical Test SW242 Creation & Description of a Data Set * 4 Levels of Measurement * Nominal, ordinal, interval, ratio * Variable Types

More information
