MTH 3210: PROBABILITY AND STATISTICS DESCRIPTIVE STATISTICS WORKSHEET

Before you work on the practice problems (Section 3), please make sure that you read the supplementary notes (Section 1) and work through the examples (Section 2). Solutions to the examples and practice problems are posted in the appendices (Appendices A and B, respectively). It is important that you understand how to work through these problems before Exam 1.

1. Types of Data

Descriptive statistics is the organization and description of data sets using tables, charts, and numerical measures calculated from the data. Descriptive statistics can be computed for both sample data and population data, although some definitions (such as the standard deviation) depend on whether you are dealing with sample data or population data. It is important to classify the data you are describing, because different types of data require different descriptive statistics:

Categorical or Qualitative Data - Categorical data is non-numerical data (for example, the eye color of each Metro State student). We use pie charts and bar charts to graph distributions of categorical data.

Discrete Data - Discrete data is numerical data whose possible values can be counted (for example, the number of siblings of each Metro State student). We use histograms with midpoint labels to graph discrete data.

Continuous Data - Continuous data is numerical data that can take any value in a continuous range of numbers (for example, the height in mm of a Metro State student). We use histograms with cut points to graph continuous data.

This worksheet provides examples and exercises that require us to organize these data sets clearly and to describe them using the tables, graphs, and numerical measures we have defined in class. For further explanations and more detailed definitions of special terms and concepts, please refer to Chapter 1 of our textbook.

2. Examples (Solutions are in Appendix A)

2.1. Example (Categorical Data).
The type of food advertised during after-school programming is observed and recorded for 24 randomly sampled ads. The categories of food type used are Cereal (Ce), Candy (Ca), Fast Food (FF), Drinks (D) and Snacks (S).

{ Ce, FF, FF, FF, D, Ca, Ce, FF, FF, D, Ce, S,
  D, FF, Ca, D, FF, FF, D, D, FF, Ca, Ca, D }

(1) Construct a table that gives the frequency distribution of this data.
(2) Construct a table that gives the relative frequency distribution of this data.
(3) Construct a pie chart of this data that displays the percentage of ads in the sample that advertise each food type.
(4) Construct a bar chart of this data that displays the frequency of ads in the sample that advertise each food type.
(5) Construct a bar chart of this data that displays the relative frequency of ads in the sample that advertise each food type.
(6) Identify the mode of this data set.

2.2. Example (Discrete Data). The clutch size of a bird is the number of eggs in its nest during a single incubation period. A sample of first clutch sizes for Hawaiian Red Junglefowl is given below:

{ 6, 5, 5, 7, 5, 6, 5, 3, 4, 5,
  6, 7, 5, 6, 5, 5, 5, 4, 5, 6 }

(1) Construct a table that gives the frequency distribution of this data.
(2) Construct a table that gives the relative frequency distribution of this data.
(3) Construct a frequency histogram of this data.
(4) Construct a relative frequency histogram of this data.
(5) Construct a boxplot of this data.
(6) Construct a stem and leaf plot for this data set.
(7) Find the sample mean for this data set.
(8) Find the median of this data set.
(9) Find the sample standard deviation of this data set.
(10) Describe the shape of this distribution using vocabulary presented in the textbook.

2.3. Example (Continuous Data). The farm incomes per acre of soybeans reported by 18 American soybean farms randomly sampled in 2011 are given below (dollars).

{ 435, 358, 383, 396, 591, 403, 411, 492, 438,
  437, 510, 548, 470, 459, 493, 383, 412, 512 }

(1) Construct a table that gives the frequency distribution of this data.
(2) Construct a table that gives the relative frequency distribution of this data.
(3) Construct a frequency histogram of this data.
(4) Construct a relative frequency histogram of this data.
(5) Construct a boxplot of this data.
(6) Construct a stem and leaf plot for this data set.
(7) Find the sample mean for this data set.
(8) Find the median of this data set.
(9) Find the sample standard deviation of this data set.
(10) Describe the shape of this distribution using vocabulary presented in the textbook.

3. Practice Problems

3.1. Practice Problem (Categorical Data). A sample of major networks viewed on a given night by 14 randomly selected households is given as follows.

{ ABC, NBC, FOX, CBS, NBC, NBC, ABC, NBC, NBC, ABC, NBC, FOX, ABC, CBS }

(1) Construct a table that gives the frequency distribution of this data.
(2) Construct a table that gives the relative frequency distribution of this data.
(3) Construct a pie chart of this data that displays the percentage of households in the sample that viewed each network.
(4) Construct a bar chart of this data that displays the frequency of each network.
(5) Construct a bar chart of this data that displays the relative frequency of each network.
(6) Identify the mode of this data set.

3.2. Practice Problem (Discrete Data). The clutch size of a bird is the number of eggs in its nest during a single incubation period. A sample of second clutch sizes for Hawaiian Red Junglefowl is given below:

{ 3, 3, 2, 5, 2, 4, 4, 3, 4, 2,
  3, 3, 3, 4, 4, 5, 4, 3, 4, 5 }

(1) Construct a table that gives the frequency distribution of this data.
(2) Construct a table that gives the relative frequency distribution of this data.
(3) Construct a frequency histogram of this data.
(4) Construct a relative frequency histogram of this data.
(5) Use your calculator to compute the five-number summary of this data set.
(6) Construct a boxplot of this data.
(7) Use your calculator to compute the mean of this data set.
(8) Use your calculator to compute the sample standard deviation of this data set.
(9) Compare this data distribution to the data distribution in Example 2.2 by sketching both of the boxplots over the same horizontal axis of possible values.

3.3. Practice Problem (Continuous Data).
The farm incomes per acre of soybeans reported by 16 American soybean farms randomly sampled in 2014 are given below (dollars).

{ 445, 286, 260, 280, 284, 334, 292, 250, 330, 385, 319, 218, 371, 263, 327, 286 }
(1) Construct a table that gives the frequency distribution of this data.
(2) Construct a table that gives the relative frequency distribution of this data.
(3) Construct a frequency histogram of this data.
(4) Construct a relative frequency histogram of this data.
(5) Use your calculator to compute the five-number summary of this data set.
(6) Construct a boxplot of this data.
(7) Use your calculator to compute the mean of this data set.
(8) Use your calculator to compute the sample standard deviation of this data set.
(9) Compare this data distribution to the data distribution in Example 2.3 by sketching both of the boxplots over the same horizontal axis of possible values.

Appendix A. Solutions to Examples

Solution to Example 2.1:

(1) The frequency distribution is summarized by the following table:

Class    Frequency
Ca       5
Ce       2
D        7
FF       9
S        1

(2) The relative frequency distribution is summarized by the following table:

Class    Relative Frequency
Ca       .208
Ce       .083
D        .292
FF       .375
S        .042

(3) The following pie chart was created using Minitab (saved as a .jpg using Copy Graph).
(4) The following frequency distribution bar graph was created using Minitab (saved as a .jpg using Copy Graph).
(5) The following relative frequency distribution bar graph was created using Minitab (saved as a .jpg using Copy Graph).
(6) The mode of the data set is FF (Fast Food).

Solution to Example 2.2:

(1) The frequency distribution is summarized by the following table:

Clutch Size    Frequency
3              1
4              2
5              10
6              5
7              2
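As a check on the frequency table for Example 2.2, the counts can also be tallied in a few lines of code. The following is a minimal Python sketch, not part of the Minitab workflow used in this worksheet; the names `clutch_sizes` and `frequency_table` are our own. It counts each clutch size and divides by the sample size n = 20 to get the relative frequency.

```python
from collections import Counter

# First clutch sizes from Example 2.2 (20 observations).
clutch_sizes = [6, 5, 5, 7, 5, 6, 5, 3, 4, 5,
                6, 7, 5, 6, 5, 5, 5, 4, 5, 6]


def frequency_table(data):
    """Return {value: (frequency, relative frequency)}, ordered by value."""
    counts = Counter(data)
    n = len(data)
    return {value: (counts[value], counts[value] / n)
            for value in sorted(counts)}


table = frequency_table(clutch_sizes)

print(f"{'Clutch Size':>11} {'Frequency':>9} {'Relative Frequency':>18}")
for value, (freq, rel) in table.items():
    print(f"{value:>11} {freq:>9} {rel:>18.2f}")
```

The same function works unchanged for categorical data such as the network names in Practice Problem 3.1, as long as the class labels can be sorted.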
(2) The relative frequency distribution is summarized by the following table:

Clutch Size    Relative Frequency
3              .05
4              .1
5              .5
6              .25
7              .1

(3) The following frequency histogram was created using Minitab (saved as a .jpg using Copy Graph).
(4) The following relative frequency histogram was created using Minitab (saved as a .jpg using Copy Graph).
(5) The following boxplot was created using Minitab (saved as a .jpg using Copy Graph).
(6) The following stem and leaf plot uses enough stems to reveal the distribution of the data:

3 | 0
4 | 00
5 | 0000000000
6 | 00000
7 | 00

(7) The mean is x̄ = 5.25.
(8) The median is 5.
(9) The standard deviation is s = .967.
(10) The distribution is roughly bell-shaped with a slight right skew.

Solution to Example 2.3:

(1) The frequency distribution is summarized by the following table:

Income per acre (dollars)    Frequency
340 ≤ x < 380                1
380 ≤ x < 420                6
420 ≤ x < 460                4
460 ≤ x < 500                3
500 ≤ x < 540                2
540 ≤ x < 580                1
580 ≤ x < 620                1
(2) The relative frequency distribution is summarized by the following table:

Income per acre (dollars)    Relative Frequency
340 ≤ x < 380                .056
380 ≤ x < 420                .333
420 ≤ x < 460                .222
460 ≤ x < 500                .167
500 ≤ x < 540                .111
540 ≤ x < 580                .056
580 ≤ x < 620                .056

(3) The following frequency histogram was created using Minitab (saved as a .jpg using Copy Graph).
(4) The following relative frequency histogram was created using Minitab (saved as a .jpg using Copy Graph).
(5) The following boxplot was created using Minitab (saved as a .jpg using Copy Graph).
(6) The following stem and leaf plot uses split hundreds digits for the stems (each leaf is the tens digit):

3 | 5
3 | 889
4 | 011333
4 | 5799
5 | 114
5 | 9

(7) The mean is x̄ = 451.72.
(8) The median is 437.50.
(9) The standard deviation is s = 62.80.
(10) The distribution is reverse J-shaped and right-skewed.

Appendix B. Solutions to Practice Problems

Solution to Practice Problem 3.1:

3.1(1): The frequency distribution is summarized by the following table:

Class    Frequency
ABC      4
CBS      2
NBC      6
FOX      2

3.1(2): The relative frequency distribution is summarized by the following table:

Class    Relative Frequency
ABC      .286
CBS      .143
NBC      .429
FOX      .143

3.1(3): The following pie chart was created using Minitab, but it is not difficult to sketch the pie chart by hand.
3.1(4): The following frequency distribution bar graph was created using Minitab, but it is not difficult to sketch the frequency distribution by hand.

3.1(5): The following relative frequency distribution bar graph was created using Minitab, but it is not difficult to sketch the relative frequency distribution by hand.

3.1(6): The mode is NBC.

Solution to Practice Problem 3.2:
3.2(1): The frequency distribution is summarized by the following table:

Clutch Size    Frequency
2              3
3              7
4              7
5              3

3.2(2): The relative frequency distribution is summarized by the following table:

Clutch Size    Relative Frequency
2              .15
3              .35
4              .35
5              .15

3.2(3): The following frequency histogram was created using Minitab, but it is not difficult to sketch the frequency distribution by hand.

3.2(4): The following relative frequency histogram was created using Minitab, but it is not difficult to sketch the relative frequency distribution by hand.

3.2(5): The five-number summary is (2, 3, 3.5, 4, 5).

3.2(6): The following boxplot was created using Minitab, but it is not difficult to sketch the boxplot by hand.
3.2(7): The mean is 3.5.

3.2(8): The standard deviation is 0.946.

3.2(9): The following double boxplot was created using Minitab, but it is not difficult to sketch the boxplot by hand.

Solution to Practice Problem 3.3:

3.3(1): The frequency distribution is summarized by the following table:

Income per acre (dollars)    Frequency
180 ≤ x < 220                1
220 ≤ x < 260                1
260 ≤ x < 300                7
300 ≤ x < 340                4
340 ≤ x < 380                1
380 ≤ x < 420                1
420 ≤ x < 460                1
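The class counts in the 3.3(1) table depend on treating each class as closed on the left and open on the right, so an income of exactly 260 belongs to the class starting at 260, not the one ending there. The following Python sketch illustrates that rule under our own assumptions: the function name `class_frequencies` and its parameters are hypothetical, and the cut points (starting at 180, width 40, seven classes) are simply read off the table above.

```python
# Farm incomes per acre (dollars) from Practice Problem 3.3.
incomes = [445, 286, 260, 280, 284, 334, 292, 250,
           330, 385, 319, 218, 371, 263, 327, 286]


def class_frequencies(data, start, width, n_classes):
    """Count observations in the classes start + k*width <= x < start + (k+1)*width."""
    counts = [0] * n_classes
    for x in data:
        index = (x - start) // width  # which class x falls into
        if not 0 <= index < n_classes:
            raise ValueError(f"{x} falls outside the chosen classes")
        counts[index] += 1
    return counts


counts = class_frequencies(incomes, start=180, width=40, n_classes=7)

for k, count in enumerate(counts):
    lower = 180 + 40 * k
    print(f"{lower} <= x < {lower + 40}: {count}")
```

Dividing each count by the sample size n = 16 gives the corresponding relative frequency.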
3.3(2): The relative frequency distribution is summarized by the following table:

Income per acre (dollars)    Relative Frequency
180 ≤ x < 220                .0625
220 ≤ x < 260                .0625
260 ≤ x < 300                .4375
300 ≤ x < 340                .25
340 ≤ x < 380                .0625
380 ≤ x < 420                .0625
420 ≤ x < 460                .0625

3.3(3): The following frequency histogram was created using Minitab, but it is not difficult to sketch the frequency distribution by hand.

3.3(4): The following relative frequency histogram was created using Minitab, but it is not difficult to sketch the relative frequency distribution by hand.

3.3(5): The five-number summary is (218, 271.5, 289, 332, 445).

3.3(6): The following boxplot was created using Minitab, but it is not difficult to sketch the boxplot by hand.
3.3(7): The mean is 308.1.

3.3(8): The standard deviation is 57.

3.3(9): The following boxplot was created using Minitab, but it is not difficult to sketch the boxplot by hand.
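The numerical answers to Practice Problem 3.3 can be double-checked without a calculator. The following is a minimal Python sketch using only the standard `statistics` module; the helper `five_number_summary` is our own, and it assumes the convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data. Other software may use a different quartile rule and report slightly different quartiles.

```python
import statistics

# Farm incomes per acre (dollars) from Practice Problem 3.3.
incomes = [445, 286, 260, 280, 284, 334, 292, 250,
           330, 385, 319, 218, 371, 263, 327, 286]


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max).

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves of the sorted data; the overall median is left out of both
    halves when the number of observations is odd.
    """
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]
    upper = xs[half:] if len(xs) % 2 == 0 else xs[half + 1:]
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])


summary = five_number_summary(incomes)
mean = statistics.mean(incomes)
s = statistics.stdev(incomes)  # sample standard deviation (divides by n - 1)

print("five-number summary:", summary)
print("mean:", round(mean, 1))
print("sample standard deviation:", round(s, 1))
```

Under this quartile convention the sketch agrees with the values reported in 3.3(5), 3.3(7), and 3.3(8). Note that `statistics.stdev` is the sample standard deviation; `statistics.pstdev` would give the population version instead.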