Elementary Statistics


Introduction

Statistics is the collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions from those data.

Definitions and Statistical Terminology

Population - the data set consisting of all outcomes, measurements, or responses of interest.

Sample - a data set that is a subset of the population data set.

For example, if we are interested in measuring the salaries of Kenyan high-school teachers, the population data set would be a list of the salaries of every high-school teacher in Kenya. A sample data set could be obtained by selecting 100 high-school teachers from across the country and listing their salaries.

Note: There are several reasons why we don't work with populations. They are usually large, and it is often impossible to get data for every object we're studying. Even sampling has a cost, and the more items surveyed, the larger the cost.

Raw Data - Data collected in original form which have not been organised numerically.

Variable - A characteristic or attribute that can assume different values.

Qualitative Variables - Variables which assume non-numerical values, e.g. marital status, hair colour, favourite ice-cream.

Quantitative Variables - Variables which assume numerical values. These can be further divided into two types:

i) Discrete Variables - Variables which assume a finite or countable number of possible values. Usually obtained by counting, e.g. number of children in a family.

ii) Continuous Variables - Variables which can assume any value in an interval, and so have infinitely many possible values. Usually obtained by measurement, e.g. age, weight, income.

Note: Since continuous variables take real-number values, recorded measurements are usually rounded. This implies a boundary that depends on the number of decimal places. For example, a recorded value of 64 really means any value with 63.5 ≤ x < 64.5. Likewise, with two decimal places, 64.03 really means any value with 64.025 ≤ x < 64.035. Boundaries always have one more decimal place than the data and end in a 5.

Frequency Distributions

A frequency distribution is a table used to describe a data set. A frequency table lists intervals or ranges of data values, called data classes, together with the number of data values from the set that fall in each class. This number is called the frequency of the class.

Example 1

Some values occur more than once. Therefore we can form a table showing how many times each value occurs.

Complete a tally diagram and determine the frequency distribution of the values.

Classification of Data

If the range of values of the variable is large, it is often helpful to arrange these values in regular groups or classes. There are two methods of classifying data:

i) The Inclusive Method - Both class limits are included in the class they represent, e.g. 10-19, 20-29 etc.

ii) The Exclusive Method (left end-point convention) - The upper limit of a class actually belongs to the next class, e.g. with classes 10-20, 20-30 etc., a value of 20 is placed in the class 20-30.

With discrete data classified by the inclusive method, there is no difficulty in allocating any given value to its appropriate group since, for example, there is no value between 29 and 30. With continuous data, however, the value is measured on a continuous scale and may lie in between, e.g. 29.7, 29.8 etc. In such cases we use the exclusive method of data classification.

Example 2

Statistics exam grades. Suppose that the scores of 20 statistics students on an exam are as follows:

97, 92, 88, 75, 83, 67, 89, 55, 72, 78, 81, 91, 57, 63, 67, 74, 87, 84, 98, 46

We can construct a frequency table with classes 90-99, 80-89, 70-79 etc. by counting the number of grades in each grade range.

Class     Frequency (f)
90-99     4
80-89     6
70-79     4
60-69     3
50-59     2
40-49     1

Note that the sum of the frequency column is equal to 20, the number of test scores.

In practice, where the values of the variable are all given to the same number of significant figures or decimal places, there is no trouble and we form the groups accordingly, as in the following example.

Example 3

The following data represent the record high temperatures for each of the 50 counties. Construct a grouped frequency distribution for the data using 7 classes.

112 100 127 120 134 118 105 110 109 112
110 118 117 116 118 122 114 114 105 109
107 112 114 115 118 117 118 122 106 110
116 108 110 121 113 120 119 111 104 111
120 113 120 117 105 110 118 112 114 114

Exercise

The lengths (in mm) of 40 spindles were measured with the following results:

Additional Terminology

Class Limits - Separate one class in a grouped frequency distribution from another. The limits are values that could actually appear in the data, and there is a gap between the upper limit of one class and the lower limit of the next.

Class Width - The difference between the upper (or lower) class limits of consecutive classes. It is also the positive difference between two consecutive class marks. It is not the difference between the upper and lower limits of the same class.

Class Mark - The middle value of each data class. Also called the midpoint or central value. To find the class mark, average the upper and lower class limits:

class mark = (upper class limit + lower class limit) / 2

Class Boundaries - The boundaries have one more decimal place than the raw data and therefore do not appear in the data. There is no gap between the upper boundary of one class and the lower boundary of the next class. The lower class boundary is found by subtracting half a unit from the lower class limit, and the upper class boundary is found by adding half a unit to the upper class limit, where one unit is the precision to which the data are recorded (so 0.5 for whole-number data).

Example: From the frequency table of statistics grades above.
The upper class limits are 99, 89, 79, 69, 59, and 49.
The lower class limits are 90, 80, 70, 60, 50, and 40.
The class marks are 94.5, 84.5, 74.5, 64.5, 54.5, and 44.5.
The width of each class is 10.
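The terminology above can be checked against the exam scores of Example 2 with a short Python sketch. The function name describe_classes and the unit parameter are illustrative choices, not part of the notes; unit is assumed to be 1 because the scores are whole numbers.

```python
# Exam scores from Example 2.
scores = [97, 92, 88, 75, 83, 67, 89, 55, 72, 78,
          81, 91, 57, 63, 67, 74, 87, 84, 98, 46]

# Inclusive classes, each given as (lower class limit, upper class limit).
class_limits = [(90, 99), (80, 89), (70, 79), (60, 69), (50, 59), (40, 49)]


def describe_classes(data, limits, unit=1):
    """Return one dict per class: frequency, class mark and class boundaries.

    `unit` is the precision of the recorded data (assumed 1 for whole numbers).
    """
    rows = []
    for lower, upper in limits:
        rows.append({
            "class": f"{lower}-{upper}",
            # Inclusive method: both limits belong to the class.
            "frequency": sum(lower <= x <= upper for x in data),
            # Class mark: average of the two class limits.
            "mark": (lower + upper) / 2,
            # Boundaries: half a unit beyond each limit.
            "boundaries": (lower - unit / 2, upper + unit / 2),
        })
    return rows


grade_table = describe_classes(scores, class_limits)

# Class width: difference between consecutive lower class limits.
class_width = class_limits[0][0] - class_limits[1][0]

for row in grade_table:
    print(row["class"], row["frequency"], row["mark"], row["boundaries"])
print("class width:", class_width)
```

The boundaries of consecutive classes coincide (for instance 89.5 closes the class 80-89 and opens 90-99), which is why a histogram drawn on boundaries has no gaps between bars.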

Creating a Frequency Table
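As an illustration, here is a sketch of one common textbook procedure applied to the temperature data of Example 3 with 7 classes. It is not necessarily the procedure intended under this heading. The assumptions are: the data are whole numbers, the first class starts at the smallest value, and the class width is the range divided by the number of classes, always rounded up to the next whole number. The function names build_classes and count_frequencies are hypothetical.

```python
# Record high temperatures from Example 3 (50 values).
temperatures = [
    112, 100, 127, 120, 134, 118, 105, 110, 109, 112,
    110, 118, 117, 116, 118, 122, 114, 114, 105, 109,
    107, 112, 114, 115, 118, 117, 118, 122, 106, 110,
    116, 108, 110, 121, 113, 120, 119, 111, 104, 111,
    120, 113, 120, 117, 105, 110, 118, 112, 114, 114,
]


def build_classes(data, num_classes):
    """Return inclusive (lower limit, upper limit) pairs for whole-number data."""
    low, high = min(data), max(data)
    # Assumed rule: round range / num_classes up; floor division plus one
    # also moves an exact quotient up to the next whole number.
    width = (high - low) // num_classes + 1
    return [(low + i * width, low + (i + 1) * width - 1)
            for i in range(num_classes)]


def count_frequencies(data, classes):
    """Count how many data values fall inside each inclusive class."""
    return [sum(lower <= x <= upper for x in data) for lower, upper in classes]


temp_classes = build_classes(temperatures, 7)
temp_freqs = count_frequencies(temperatures, temp_classes)

for (lower, upper), f in zip(temp_classes, temp_freqs):
    print(f"{lower}-{upper}  {f}")
```

Here the range is 134 - 100 = 34, and 34 / 7 is about 4.86, so the width becomes 5 and the seven classes run from 100-104 to 130-134. A different starting point or width would give a different, equally valid table.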

Exercises

1) The thicknesses of 20 samples of steel plate are measured and the results (in mm, to 2 significant figures) are as follows:

7.3 7.1 6.6 7.0 7.8 7.3 7.5 6.2 6.9 6.7
6.5 6.8 7.2 7.4 6.5 6.9 7.2 7.6 7.0 6.8

Complete a table showing the frequency distribution for regular classes of class width 0.3 mm.

2) Construct a frequency table with 6 data classes from the following data set. Amount of gasoline purchased by 28 drivers:

7, 4, 18, 4, 9, 8, 8, 7, 6, 2, 9, 5, 9, 12, 4, 14, 15, 7, 10, 2, 3, 11, 4, 4, 9, 12, 5, 3

Mathematical Notation

The following symbols and variables have the meanings given below.

x = data value
n = number of values in a sample data set
N = number of values in a population data set
f = frequency of a data class

The symbol Σ (capital sigma) indicates the sum of all values for the following variable or expression.

Cumulative Frequency

The cumulative frequency of a data class is the number of data elements in that class and all previous classes. The classes can be accumulated in either ascending or descending order.

Example:

Class     Frequency (f)   Cumulative Frequency
90-99     4               4
80-89     6               10
70-79     4               14
60-69     3               17
50-59     2               19
40-49     1               20

Notice that the last entry in the cumulative frequency column is n = 20.

Exercise: Add a cumulative frequency column to the table of gasoline purchases.

Relative Frequency

The relative frequency is the frequency of any one data class compared with the sum of the frequencies of all classes (i.e. the total frequency). The result is generally expressed as a percentage. We can calculate the relative frequency for each class as follows:

relative frequency = f / n

Example:

Class     Frequency (f)   Cumulative Frequency   Relative Frequency (%)
90-99     4               4                      20
80-89     6               10                     30
70-79     4               14                     20
60-69     3               17                     15
50-59     2               19                     10
40-49     1               20                     5

Note: The sum of the relative frequencies should be 1, or 100 per cent:

Σ (f / n) = 1

Exercise: Add a relative frequency column to the table of gasoline purchases.
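Both derived columns follow mechanically from the frequency column. The Python sketch below reproduces them for the statistics grades table; the variable names are illustrative only.

```python
from itertools import accumulate

# Frequency table of statistics grades, listed from the top class down.
classes = ["90-99", "80-89", "70-79", "60-69", "50-59", "40-49"]
freqs = [4, 6, 4, 3, 2, 1]

n = sum(freqs)  # total number of data values

# Running total of the frequencies, in the order the classes are listed.
cumulative = list(accumulate(freqs))

# Relative frequency f / n, expressed as a percentage.
relative_percent = [100 * f / n for f in freqs]

for row in zip(classes, freqs, cumulative, relative_percent):
    print("{:<8}{:>4}{:>6}{:>8.0f}".format(*row))
```

Reversing the two lists before accumulating gives the cumulative frequencies in the opposite direction, starting from the class 40-49.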

Graphical Representation of Data: Descriptive Statistics

1) Histograms

A histogram is a graphical representation of the information in a frequency table in which vertical rectangular blocks are drawn so that:

a) The centre of the base indicates the class mark, and
b) The area of the rectangle represents the percentage frequency.

If the class intervals are regular (all of the same width), the frequency is then denoted by the height of the rectangle.

How to draw a histogram

i) Convert each count to a percent (if necessary).
ii) For each bar, find the width and height, using height = area / class width, where the area is the percent frequency.
iii) Draw and label the axes.
iv) Draw the bars.

Example: A histogram to represent the data for the record high temperatures for each of the 50 counties.

Notice that the bar for each class is centred at the class midpoint (class mark), and the bars for successive classes touch.

Exercise: Construct a histogram for the frequency table of gasoline purchases.
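The bar geometry in steps i) and ii) can be sketched in Python for the temperature data. The class choice (width 5, starting at 100) and the resulting frequencies are an assumption made for this illustration, obtained by counting the 50 values of Example 3; the notes do not state which classes their figure uses. The function name histogram_bars is hypothetical, and the text bars are only a rough stand-in for a drawn figure.

```python
# (lower limit, upper limit, frequency) for the Example 3 temperatures,
# assuming seven classes of width 5 starting at 100.
temp_table = [
    (100, 104, 2), (105, 109, 8), (110, 114, 18), (115, 119, 13),
    (120, 124, 7), (125, 129, 1), (130, 134, 1),
]


def histogram_bars(table, unit=1):
    """Return the position and size of each bar, with area = percent frequency."""
    total = sum(f for _, _, f in table)
    bars = []
    for lower, upper, f in table:
        # Bars are drawn on class boundaries so that neighbours touch.
        left, right = lower - unit / 2, upper + unit / 2
        width = right - left
        percent = 100 * f / total            # step i: count -> percent
        bars.append({
            "left": left,
            "right": right,
            "mark": (left + right) / 2,      # centre of the base
            "percent": percent,
            "height": percent / width,       # step ii: height = area / width
        })
    return bars


bars = histogram_bars(temp_table)

# Crude text rendering: one '#' per 2 percent.
for bar in bars:
    print(f"{bar['left']:6.1f}-{bar['right']:6.1f} | "
          + "#" * round(bar["percent"] / 2))
```

Because every class here has the same width, the heights are proportional to the frequencies; with unequal widths the division by the class width is what keeps the areas comparable.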

2) Frequency Polygon

A frequency polygon is a line graph representation of the information in a frequency table. As in a histogram, the vertical axis represents the frequency (or percent frequency) and the horizontal axis represents the variable being measured in the data set. To construct the graph, a point is plotted for each class at its midpoint, with height given by the frequency (or percent frequency) of the class. The points are then connected by straight lines.

If the centre points of the tops of the bars of a histogram are joined, the resulting figure is a frequency polygon. If the polygon is extended to include the midpoints of the zero-frequency classes at each end of the histogram, then the area of the complete polygon is equal to the area of the histogram and therefore represents the total frequency of the variable.

Example: using the same data for high temperatures as in the previous example.

3) Ogive

An ogive is a line graph that represents the cumulative frequencies for the classes in a frequency distribution.

x-axis: class boundaries
y-axis: cumulative frequency
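The points to plot for both graphs can be computed directly, and the area claim for the extended polygon can be checked numerically. The Python sketch below again assumes seven temperature classes of width 5 starting at 100, with frequencies counted from the Example 3 data; the helper trapezoid_area is a hypothetical name.

```python
from itertools import accumulate

# (lower limit, upper limit, frequency), assuming classes of width 5 from 100.
temp_table = [
    (100, 104, 2), (105, 109, 8), (110, 114, 18), (115, 119, 13),
    (120, 124, 7), (125, 129, 1), (130, 134, 1),
]
width = temp_table[1][0] - temp_table[0][0]   # class width
freqs = [f for _, _, f in temp_table]
marks = [(lower + upper) / 2 for lower, upper, _ in temp_table]

# Frequency polygon: (class mark, frequency), extended by one
# zero-frequency class at each end.
polygon = ([(marks[0] - width, 0)]
           + list(zip(marks, freqs))
           + [(marks[-1] + width, 0)])


def trapezoid_area(points):
    """Area under straight-line segments joining consecutive points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


polygon_area = trapezoid_area(polygon)
histogram_area = sum(f * width for f in freqs)   # bars of height f

# Less-than ogive: cumulative frequency plotted at each upper class boundary,
# starting from zero at the first lower boundary.
total = sum(freqs)
upper_boundaries = [upper + 0.5 for _, upper, _ in temp_table]
less_than = ([(temp_table[0][0] - 0.5, 0)]
             + list(zip(upper_boundaries, accumulate(freqs))))

# More-than ogive: how many values lie above each boundary.
more_than = [(x, total - c) for x, c in less_than]

print(polygon_area, histogram_area)
print(less_than)
print(more_than)
```

The two areas agree because each sloping segment of the polygon cuts off a triangle from one bar and adds an equal triangle beside it.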

Example: an ogive for the high temperatures.

See the illustration of less-than and more-than ogives on the following page.

Other types of graphs include bar charts, Pareto charts, pie charts and scatter plots.

Exercises

1) Construct
a) a histogram,
b) a frequency polygon,
c) a less-than and a more-than ogive
for the lengths of spindles in the previous exercise.

2) The table below shows a frequency distribution of the monthly wages in pounds sterling of 70 employees at a company.

Wages (£)        Frequency (f)
50.00-59.99      8
60.00-69.99      10
70.00-79.99      16
80.00-89.99      15
90.00-99.99      10
100.00-119.99    8
120.00-179.99    3

Construct a histogram for this frequency distribution.
