Chapter 2. Descriptive Statistics: Organizing, Displaying and Summarizing Data


1 Chapter 2 Descriptive Statistics: Organizing, Displaying and Summarizing Data

2 Objectives Student should be able to Organize data Tabulate data into frequency/relative frequency tables Display data graphically Qualitative data pie charts, bar charts, Pareto Charts. Quantitative data Histograms, Stemplots, Dot plots and Boxplots. Describe the shape of the plot. Summarize data numerically Quantitative data only Measure of center mean, median, midrange, and mode. Measure of position quartiles and percentiles. Measure of spread/variation range, variance, standard deviation, and inter-quartile range. Use TI graphing calculator to obtain statistics.

3 Organize Data Tabulate data into frequency and relative frequency Tables

4 Tabulate Qualitative Data Qualitative data values can be organized by a frequency distribution A frequency distribution lists Each of the categories The frequency/counts for each category

5 Frequency Table A simple data set is blue, blue, green, red, red, blue, red, blue. A frequency table for this qualitative data is:

Color   Frequency
Blue    4
Green   1
Red     3

The most commonly occurring color is blue.

6 What Is A Relative Frequency? The relative frequencies are the proportions (or percents) of the observations out of the total. A relative frequency distribution lists each of the categories and the relative frequency for each category. Relative frequency = Frequency / Total

7 Relative Frequency Table A relative frequency table for this qualitative data is:

Color   Relative Frequency
Blue    0.500 (= 4/8)
Green   0.125 (= 1/8)
Red     0.375 (= 3/8)

A relative frequency table can also be constructed with percents (50%, 12.5%, and 37.5% for the above table).
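The two tables above can be reproduced with a short Python sketch. This is only an illustration: the variable names are our own, and the slides themselves use Excel and a TI calculator rather than Python.

```python
from collections import Counter

# The simple data set from the frequency-table slide.
colors = ["blue", "blue", "green", "red", "red", "blue", "red", "blue"]

# Frequency: the count of observations in each category.
frequency = Counter(colors)

# Relative frequency: frequency divided by the total number of observations.
total = len(colors)
relative_frequency = {color: count / total for color, count in frequency.items()}

# Print one row per category: name, frequency, relative frequency.
for color in sorted(frequency):
    print(f"{color:<6} {frequency[color]:>2} {relative_frequency[color]:.3f}")
```

The relative frequencies always sum to 1, which is a quick check that no category was missed.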

8 Tabulate Quantitative Data Suppose we recorded number of customers served each day for total of 40 days as below: We would like to compute the frequencies and the relative frequencies

9 Frequency/Relative Frequency Table The resulting frequencies and the relative frequencies:

10 Display Data graphically Qualitative data Bar, Pareto, Pie Charts Quantitative data Histograms, Stemplots, Dot plots

11 Graphic Display for Qualitative Data Bar Charts, Pareto Charts, Pie Charts

12 Bar and Pie Charts for Qualitative Data Bar charts for our simple data (generated with the Chart command in Excel): a frequency bar chart and a relative frequency bar chart. Note: Always label the axes, provide category and numeric scales, and give a title when you present graphs. [Figure: Frequency Bar Chart and Relative Frequency Bar Chart; horizontal axis: Color (Blue, Green, Red); vertical axis: Frequency or Relative Frequency]

13 Pareto Charts A Pareto chart is a particular type of bar graph A Pareto differs from a bar chart only in that the categories are arranged in order The category with the highest frequency is placed first (on the extreme left) The second highest category is placed second Etc. Pareto charts are often used when there are many categories but only the top few are of interest

14 Pareto Charts Here is a Pareto chart for the simple data set, with the categories ordered from highest to lowest relative frequency: Blue (0.5), Red (0.375), Green (0.125). [Figure: Pareto Chart; horizontal axis: Color (Blue, Red, Green); vertical axis: Relative Frequency, 0% to 60%]
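The ordering step that distinguishes a Pareto chart from a plain bar chart can be sketched in Python as follows. The text-based bars are only a stand-in for a real chart, and the names are illustrative.

```python
from collections import Counter

colors = ["blue", "blue", "green", "red", "red", "blue", "red", "blue"]

# most_common() returns (category, count) pairs sorted from the highest
# count to the lowest, which is exactly the Pareto ordering.
pareto_order = Counter(colors).most_common()

# Draw a crude text bar chart, highest-frequency category first.
for color, count in pareto_order:
    print(f"{color:<6} {'#' * count} ({count / len(colors):.1%})")
```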

15 Side-by-Side Bar Charts Use it to compare multiple bar charts. An example side-by-side bar chart comparing educational attainment in 1990 versus 2003

16 Pie Charts Pie charts are used to display qualitative data. A pie chart shows the amount of data that belongs to each category as a proportional part of a circle. [Figure: Pie Chart; Blue, 50%; Red, 38%; Green, 13% (rounded)] Notice that bar charts show the amount of data that belongs to each category as a proportionally sized rectangular area.

17 Pie Charts Another example of a pie chart

18 Summary Qualitative data can be organized in several ways Tables are useful for listing the data, its frequencies, and its relative frequencies Charts such as bar charts, Pareto charts, and pie charts are useful visual methods for organizing data Side-by-side bar charts are useful for comparing multiple sets of qualitative data

19 Graphic Display Quantitative Data Histograms, Stemplots, Dot Plots

20 Histogram Histogram is a bar graph which represents a frequency distribution of a quantitative variable. It is a term used only for a bar graph of quantitative data. A histogram is made up of the following components: 1. A title, which identifies the population of interest 2. A vertical scale, which identifies the frequencies or relative frequency in the various classes 3. A horizontal scale, which identifies the variable x. Values or ranges of values may be labeled along the x-axis. Use whichever method of labeling the axis best presents the variable. When you make a graph, make sure you label (give descriptions to) both axes clearly, and give a title for the graph too.

21 Histogram for discrete Quantitative data Example of histograms for discrete data Frequencies Relative frequencies Note: The term histogram is used only for a bar graph to summarize quantitative data. The bar chart for qualitative data can not be called a histogram. Also, there are no gaps between bars in a histogram.

22 Categorize/Group Continuous Quantitative Data Continuous type of quantitative data cannot be put directly into frequency tables since they do not have any obvious categories Categories are created using classes, or intervals/ranges of numbers The continuous data is then put into the classes

23 Categorize/Group Continuous Quantitative Data For ages of adults, a possible set of classes is ..., 30-39, 40-49, 50-59, 60 and older. For the class 30-39, 30 is the lower class limit and 39 is the upper class limit. The class width is the difference between the lower class limits of two adjacent classes. For the class 30-39, the class width is 40 - 30 = 10. The class midpoint is the average of the lower limits of two adjacent classes; for the class 30-39 it is (30 + 40) / 2 = 35.

24 Categorize/Group Continuous Quantitative Data All the classes should have the same width, except for the last class. The class 60 and older is an open-ended class because it has no upper limit. Classes with no lower limit are also called open-ended classes.

25 Categorize/Group Continuous Quantitative Data The classes and the number of values in each can be put into a frequency table with the columns Age and Number (frequency). In this table, there are 1147 subjects between 30 and 39 years old, and 110 subjects in the class 60 and older.

26 Categorize/Group Continuous Quantitative Data Good practices for constructing tables for continuous variables The classes should not overlap The classes should not have any gaps between them The classes should have the same width (except for possible open-ended classes at the extreme low or extreme high ends) The class boundaries should be reasonable numbers The class width should be a reasonable number
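As a rough sketch of how continuous data are put into classes, the Python below counts values per class given the lower class limits. The ages and the starting limit of 20 are hypothetical, chosen only for illustration, and the last class is treated as open-ended.

```python
from bisect import bisect_right

def tabulate_classes(values, lower_limits):
    """Count how many values fall in each class.

    Class k runs from lower_limits[k] up to (but not including)
    lower_limits[k + 1]; the last class is open-ended.
    """
    counts = [0] * len(lower_limits)
    for value in values:
        # Index of the last lower limit that is <= value.
        k = bisect_right(lower_limits, value) - 1
        if k < 0:
            raise ValueError(f"{value} is below the first class")
        counts[k] += 1
    return counts

# Hypothetical ages of adults (not from the slides).
ages = [23, 35, 31, 47, 62, 58, 39, 40, 71, 29]
lower_limits = [20, 30, 40, 50, 60]  # classes 20-29, ..., 60 and older

counts = tabulate_classes(ages, lower_limits)
for limit, count in zip(lower_limits, counts):
    print(f"{limit}+ : {count}")
```

Because each value is assigned to exactly one class, the classes cannot overlap or leave gaps, which matches the good practices listed above.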

27 Histogram for continuous Quantitative data Just as for discrete data, a histogram can be created from the frequency table Instead of individual data values, the categories are the classes the intervals of data You can label/scale the bars with the lower class limits or class midpoints.

28 Stemplots A stem-and-leaf plot ( or simply Stemplot) is a different way to represent data that is similar to a histogram To draw a stem-and-leaf plot, each data value must be broken up into two components The stem consists of all the digits except for the right most one The leaf consists of the right most digit For the number 173, for example, the stem would be 17 and the leaf would be 3

29 Example of a Stemplot In the stem-and-leaf plot below The smallest value is 56 The largest value is 180 The second largest value is 178

30 Stemplots Construction To draw a stem-and-leaf plot Write all the values in ascending order Find the stems and write them vertically in ascending order For each data value, write its leaf in the row next to its stem The resulting leaves will also be in ascending order The list of stems with their corresponding leaves is the stem-and-leaf plot
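The construction steps above can be sketched in Python. The data values here are hypothetical, and the sketch assumes non-negative integers so that the leaf is simply the last digit.

```python
from collections import defaultdict

def stemplot(values):
    """Return (stem, leaves) rows for non-negative integer data."""
    rows = defaultdict(list)
    # Sorting first guarantees the leaves in each row come out ascending.
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 173 -> stem 17, leaf 3
        rows[stem].append(leaf)
    # List every stem from smallest to largest, including empty ones,
    # so that gaps in the data stay visible.
    return [(stem, rows.get(stem, [])) for stem in range(min(rows), max(rows) + 1)]

# Hypothetical data set.
data = [73, 56, 84, 61, 79, 68, 73, 102]
plot = stemplot(data)
for stem, leaves in plot:
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}")
```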

31 Modification to Stemplots Modifications to stem-and-leaf plots: Sometimes there are too many values with the same stem, so we would need to split the stems (placing the lower values in one stem and the higher values in another). If we wanted to compare two sets of data, we could draw two stem-and-leaf plots using the same stems, with leaves going left (for one set of data) and right (for the other set): a side-by-side stemplot.

32 Dot Plots A dot plot is a graph where a dot is placed over the observation each time it is observed The following is an example of a dot plot

33 Shapes of Plots for Quantitative Data The pattern of variability displayed by the data of a variable is called its distribution. The distribution displays how frequently each value of the variable occurs. A useful way to describe a quantitative variable is by the shape of its distribution. Some common distribution shapes are: Uniform, Bell-shaped (or normal), Skewed right, Skewed left, Bimodal. Note: We are not concerned about the shapes of the plots for qualitative data, because there is no particular order for the categories of nominal data. Once we change the order, the shape of the graph changes.

34 Uniform Distribution A variable has a uniform distribution when Each of the values tends to occur with the same frequency The histogram looks flat

35 Normal Distribution A variable has a bell-shaped (normal) distribution when Most of the values fall in the middle The frequencies tail off to the left and to the right It is symmetric

36 Right-skewed Distribution A variable has a skewed right distribution when: The distribution is not symmetric. The tail to the right is longer than the tail to the left. The arrow from the middle to the long tail points right. In other words: The direction of skewness is determined by the side of the distribution with the longer tail. That is, if a distribution has a longer tail on its right side, it is called a right-skewed distribution.

37 Left-skewed Distribution A variable has a skewed left distribution when: The distribution is not symmetric. The tail to the left is longer than the tail to the right. The arrow from the middle to the long tail points left.

38 Bimodal Distribution There are two peaks/humps or highest points in the distribution. This often implies that two populations are sampled. The graph below shows a bimodal distribution for body mass. It implies that the data come from two populations, each with its own separate average. Here, one group has an average body mass of 147 grams and the other has an average body mass of 178 grams.

39 Summary Quantitative data can be organized in several ways Histogram is the most used graphical tool. Histograms based on data values are good for discrete data Histograms based on classes (intervals) are good for continuous data The shape of a distribution describes a variable histograms are useful for identifying the shapes

40 Summarize data numerically Measure of Center, Spread, and Position

41 Measure of Center Mean, Median, Mode, Midrange

42 Measures of Center Numerical values used to locate the middle of a set of data, or where the data is most clustered The term mean/average is often associated with the measure of center of a distribution.

43 Mean The arithmetic mean. For a population, the population mean: is computed using all the observations in the population; is denoted by the Greek letter $\mu$ (called mu); is a parameter. For a sample, the sample mean: is computed using only the observations in the sample; is denoted $\bar{x}$ (called x bar); is a statistic. Note: We usually cannot measure $\mu$ (due to the size of the population) but would like to estimate its value with a sample mean $\bar{x}$.

44 Formula for Means The sample mean is the sum of all the values divided by the size of the sample, n: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}(x_1 + x_2 + \cdots + x_n)$. The population mean is the sum of all the values divided by the size of the population, N: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{1}{N}(x_1 + x_2 + \cdots + x_N)$. Note: $\sum$ is called summation and means summing all values. It is a shortcut notation for adding a set of numbers.

45 Example The following sample data represent the number of accidents in each of the last 8 years at a dangerous intersection: 8, 9, 3, 5, 2, 6, 4, 5. Find the mean number of accidents. Solution: $\bar{x} = \frac{1}{8}(8 + 9 + 3 + 5 + 2 + 6 + 4 + 5) = \frac{42}{8} = 5.25$. In the data above, change 6 to 26. Solution: $\bar{x} = \frac{1}{8}(8 + 9 + 3 + 5 + 2 + 26 + 4 + 5) = \frac{62}{8} = 7.75$. Note: The mean can be greatly influenced by outliers (extremely large or small values).
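The effect of a single extreme value on the mean can be checked with a few lines of Python. The helper name `mean` is our own; the data are the eight accident counts listed in the example.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)

accidents = [8, 9, 3, 5, 2, 6, 4, 5]

# Replace the value 6 with the extreme value 26, as in the example.
with_outlier = [26 if value == 6 else value for value in accidents]

original_mean = mean(accidents)
outlier_mean = mean(with_outlier)
print(original_mean, outlier_mean)
```

Changing one value out of eight moves the mean by 2.5, which illustrates why the mean is not considered a resistant measure.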

46 Median The median denoted by M of a variable is the center. The median splits the data into halves When the data is sorted in order, the median is the middle value The calculation of the median of a variable is slightly different depending on If there are an odd number of points, or If there are an even number of points

47 How to Obtain a Median? To calculate the median of a data set: Arrange the data in order. Count the number of observations, n. If n is odd, there is a value that is exactly in the middle; that value is the median. If n is even, there are two values on either side of the exact middle; take their mean to be the median.

48 Example An example with an odd number of observations (5 observations): Compute the median of 6, 1, 11, 2, 11. Sort them in order: 1, 2, 6, 11, 11. The middle number is 6, so the median is 6.

49 Example An example with an even number of observations (4 observations) Compute the median of 6, 1, 11, 2 Sort them in order 1, 2, 6, 11 Take the mean of the two middle values (2 + 6) / 2 = 4 The median is 4

50 Quick Way to Locate Median 1. Rank the data (suppose the sample size is n). 2. Find the position of the median (counting from either end) using the formula $i = \frac{n + 1}{2}$. Then the median is the ith smallest value.

51 Example 1 Suppose we want to find the median of the data set 4, 8, 3, 8, 2, 9, 2, 11, 3. 1. Rank the data: 2, 2, 3, 3, 4, 8, 8, 9, 11. 2. Find the position of the median using the formula $\frac{n + 1}{2}$. For the data given, n is 9 (because the size of the sample is 9, that is, there are 9 data values given), so the median position is $\frac{9 + 1}{2} = 5$. The median is the 5th smallest or 5th largest value, which is 4.

52 Example 2 Consider this data set of ten values: 4, 8, 3, 8, 2, 9, 2, 11, 3, ... 1. Rank the data: 2, 2, 3, 3, 4, 8, 8, 9, 11, ... 2. Find the position of the median using the formula $\frac{n + 1}{2}$. For the data given, n is 10 (because the size of the sample is 10, that is, there are 10 data values given), so the median position is $\frac{10 + 1}{2} = 5.5$. The median is the 5.5th smallest or largest value. In other words, it is in the middle of the 5th and 6th smallest or largest values. Since the 5th value is 4 and the 6th value is 8, we average 4 and 8, so the median is 6.
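The position rule $i = (n + 1)/2$ translates directly into code. The sketch below uses our own function name and checks it against the odd-sized and even-sized examples given earlier in the slides.

```python
def median(values):
    """Median via the position rule i = (n + 1) / 2 on the ranked data."""
    ranked = sorted(values)
    n = len(ranked)
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 is a whole number (1-based).
        position = (n + 1) // 2
        return ranked[position - 1]
    # Even n: position ends in .5, so average the two neighbouring values.
    lower = ranked[n // 2 - 1]
    upper = ranked[n // 2]
    return (lower + upper) / 2

print(median([4, 8, 3, 8, 2, 9, 2, 11, 3]))  # nine values from Example 1
print(median([6, 1, 11, 2]))                 # four values from the even-n example
```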

53 Mode The mode of a variable is the most frequently occurring value. For instance, Find the mode of the data 6, 1, 2, 6, 11, 7, 3 Since the data contain 6 distinct values: 1, 2, 3, 6, 7, 11 and, the value 6 occurs twice, all the other values occur only once, so the mode is 6 Note: If two or more values in a sample are tied for the highest frequency (number of occurrences), there is no mode
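A small Python sketch of the mode, following the convention stated in the note above (a tie for the highest frequency means there is no mode). Other textbooks report all tied values instead, so treat the `None` return as this deck's convention rather than a universal rule.

```python
from collections import Counter

def mode(values):
    """Return the most frequent value, or None when the top frequency is tied."""
    if not values:
        return None
    ranked = Counter(values).most_common()  # sorted by count, highest first
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # two or more values share the highest frequency
    return ranked[0][0]

print(mode([6, 1, 2, 6, 11, 7, 3]))
print(mode([1, 1, 2, 2, 3]))
```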

54 Midrange Another useful measure of the center of the distribution is the midrange, which is the number exactly midway between the lowest data value L and the highest data value H. It is found by averaging the low and the high values: $\text{midrange} = \frac{L + H}{2}$

55 Comparing mean and Median The mean and the median are often different This difference gives us clues about the shape of the distribution Is it symmetric? Is it skewed left? Is it skewed right? Are there any extreme values?

56 Mean and Median Symmetric the mean will usually be close to the median Skewed left the mean will usually be smaller than the median Skewed right the mean will usually be larger than the median

57 Symmetric Distribution If a distribution is symmetric, the data values above and below the mean will balance The mean will be in the middle The median will be in the middle Thus the mean will be close to the median, in general, for a distribution that is symmetric

58 Left-skewed Distribution If a distribution is skewed left, there will be some data values that are smaller than the others. The mean will decrease. The median will not decrease as much. Thus the mean will be smaller than the median, in general, for a distribution that is skewed left.

59 Right-skewed Distribution If a distribution is skewed right, there will be some data values that are larger than the others The mean will increase The median will not increase as much Thus the mean will be larger than the median, in general, for a distribution that is skewed right

60 Mean and Median What if one value in a data set is extremely different from the others? For instance, if we made a mistake and 6, 1, 2 was recorded as 6000, 1, 2: The mean is now (6000 + 1 + 2) / 3 = 2001. The median is still 2. The median is more resistant to extreme values than the mean.

61 Round-off Rule When rounding off an answer, a common rule-of-thumb is to keep one more decimal place in the answer than was present in the original data To avoid round-off buildup, round off only the final answer, not intermediate steps

62 Measure of Spread Range, Variance, Standard Deviation

63 Measures of Spread/Dispersion Measures of central tendency alone cannot completely characterize a set of data. Two very different data sets may have similar measures of central tendency. Measures of dispersion are used to describe the spread, or variability, of a distribution Common measures of dispersion: range, variance, and standard deviation

64 Range The range of a variable is the largest data value minus the smallest data value. Compute the range of 6, 1, 2, 6, 11, 7, 3, 3. The largest value is 11. The smallest value is 1. Subtracting the two, 11 - 1 = 10, so the range is 10. Note: Please do not confuse the range with the midrange, which is a measure of the center of the data distribution.

65 Range The range only uses two values in the data set: the largest value and the smallest value. The range is easily affected by extreme values in the data (i.e., it is not resistant to outliers). If we made a mistake and 6, 1, 2 was recorded as 6000, 1, 2, the range is now (6000 - 1) = 5999.

66 Deviations From The Mean The variance is based on the deviations from the mean: $(x_i - \mu)$ for populations, $(x_i - \bar{x})$ for samples. A deviation may be positive or negative depending on whether the value is above or below the mean, and the sum of all deviations is always zero. To avoid the cancellation of the positive deviations and the negative deviations when we add them up, we square the deviations first: $(x_i - \mu)^2$ for populations, $(x_i - \bar{x})^2$ for samples.

67 Population Variance The population variance of a variable is the average of these squared deviations, i.e., the sum of these squared deviations divided by the number in the population: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{(x_1 - \mu)^2 + (x_2 - \mu)^2 + \cdots + (x_N - \mu)^2}{N}$. The population variance is represented by $\sigma^2$ (sigma squared). Note: For accuracy, use as many decimal places as allowed by your calculator during the calculation of the squared deviations, if the average is not a whole number.

68 Example Compute the population variance of 6, 1, 2, 11. Compute the population mean first: $\mu = (6 + 1 + 2 + 11) / 4 = 5$. Now compute the squared deviations: $(1 - 5)^2 = 16$, $(2 - 5)^2 = 9$, $(6 - 5)^2 = 1$, $(11 - 5)^2 = 36$. Average the squared deviations: $(16 + 9 + 1 + 36) / 4 = 15.5$. The population variance $\sigma^2$ is 15.5.

69 Sample Variance The sample variance of a variable is the sum of the squared deviations of the sample data from the sample mean, divided by one less than the number in the sample: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$. The sample variance is represented by $s^2$. Note: we use $n - 1$ as the divisor.

70 Example Compute the sample variance of 6, 1, 2, 11. Compute the sample mean first: $\bar{x} = (6 + 1 + 2 + 11) / 4 = 5$. Now compute the squared deviations: $(1 - 5)^2 = 16$, $(2 - 5)^2 = 9$, $(6 - 5)^2 = 1$, $(11 - 5)^2 = 36$. Sum the squared deviations and divide by $n - 1$: $(16 + 9 + 1 + 36) / 3 \approx 20.7$. The sample variance $s^2$ is 20.7.

71 Computational Formulas for the Sample Variance A shortcut formula for the sample variance (quicker because you do not need to compute all the deviations from the mean): $s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$. Here $\sum x^2$ is the sum of the squares of each data value, and $(\sum x)^2$ is the square of the sum of all data values. For the above example, $\sum x^2 = 6^2 + 1^2 + 2^2 + 11^2 = 162$ and $(\sum x)^2 = (6 + 1 + 2 + 11)^2 = 20^2 = 400$, so $s^2 = \frac{162 - 400/4}{4 - 1} = \frac{62}{3} \approx 20.7$.
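To see that the definitional and shortcut formulas agree, here is a minimal Python sketch on the data set 6, 1, 2, 11. The function names are our own.

```python
def population_variance(values):
    """Average squared deviation from the mean (divide by N)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

def sample_variance_shortcut(values):
    """Shortcut form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    sum_of_squares = sum(x * x for x in values)
    square_of_sum = sum(values) ** 2
    return (sum_of_squares - square_of_sum / n) / (n - 1)

data = [6, 1, 2, 11]
print(population_variance(data))
print(round(sample_variance(data), 1))
print(round(sample_variance_shortcut(data), 1))
```

The shortcut form avoids computing each deviation, but it can lose precision in floating-point arithmetic when the values are large relative to their spread, so the definitional form is usually the safer choice in software.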

72 Compare Population and Sample Variances Why are the population variance (15.5) and the sample variance (20.7) different for the same set of numbers? In the first case, { 6, 1, 2, 11 } was the entire population (divide by N) In the second case, { 6, 1, 2, 11 } was just a sample from the population (divide by n 1) These are two different situations

73 Why Population and Sample Variances are different? Why do we use different formulas? The reason is that using the sample mean is not quite as accurate as using the population mean If we used n in the denominator for the sample variance calculation, we would get a biased result Bias here means that we would tend to underestimate the true variance

74 Standard Deviation The standard deviation is the square root of the variance. The population standard deviation is the square root of the population variance ($\sigma^2$) and is represented by $\sigma$. The sample standard deviation is the square root of the sample variance ($s^2$) and is represented by $s$. Note: The standard deviation can be interpreted roughly as a typical distance of the data values from the mean. It has the same measuring unit as the original data (e.g., inches). The variance has a squared unit (e.g., inches squared).

75 Example If the population is { 6, 1, 2, 11 }: The population variance is $\sigma^2 = 15.5$. The population standard deviation is $\sigma = \sqrt{15.5} \approx 3.9$. If the sample is { 6, 1, 2, 11 }: The sample variance is $s^2 = 20.7$. The sample standard deviation is $s = \sqrt{20.7} \approx 4.5$. The population standard deviation and the sample standard deviation apply in different situations.
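Python's standard `statistics` module provides both versions directly: `pstdev` divides by N and `stdev` divides by n - 1. A short check on the same four values:

```python
from statistics import pstdev, stdev

data = [6, 1, 2, 11]

# Population standard deviation: square root of the population variance.
sigma = pstdev(data)

# Sample standard deviation: square root of the sample variance.
s = stdev(data)

print(round(sigma, 1), round(s, 1))
```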

76 Compute Mean and Variance for A Frequency Distribution To calculate the mean and variance for a set of sample data: In a grouped frequency distribution, we use the frequency of occurrence associated with each class midpoint. In an ungrouped frequency distribution, we use the frequency of occurrence, f, of each observation x. The formulas are $\bar{x} = \frac{\sum xf}{\sum f}$ and $s^2 = \frac{\sum x^2 f - \frac{(\sum xf)^2}{\sum f}}{\sum f - 1}$.

77 Grouped Data To compute the mean, variance, and standard deviation for grouped data: Assume that, within each class, the mean of the data is equal to the class midpoint (which is the average of two adjacent lower class limits). Use the class midpoint as an approximate value for all data in the same class, since their actual values are not provided. The number of times the class midpoint value is used is equal to the frequency of the class. For instance, if 6 values are in the interval [8, 10], then we assume that all 6 values are equal to 9 (the midpoint of [8, 10]).

78 Example of Grouped Data As an example, for a frequency table with the following class midpoints and frequencies,

Midpoint   Frequency
1          3
3          7
5          6
7          1

we calculate the mean as if the value 1 occurred 3 times, the value 3 occurred 7 times, the value 5 occurred 6 times, and the value 7 occurred 1 time.

79 Example of Grouped Data The calculation for the mean would be $\bar{x} = \frac{(1 \cdot 3) + (3 \cdot 7) + (5 \cdot 6) + (7 \cdot 1)}{17} = \frac{61}{17} \approx 3.6$, which follows the formula $\bar{x} = \frac{\sum xf}{\sum f}$.

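80 Example of Grouped Data The sample size is $n = \sum f = 3 + 7 + 6 + 1 = 17$, the sum of squared values is $\sum x^2 f = (1^2 \cdot 3) + (3^2 \cdot 7) + (5^2 \cdot 6) + (7^2 \cdot 1) = 265$, and the square of the sum is $(\sum xf)^2 = 61^2 = 3721$. Following the shortcut formula for the sample variance, we obtain $s^2 = \frac{265 - 3721/17}{17 - 1} \approx 2.88$, and the sample standard deviation is $s = \sqrt{2.88} \approx 1.7$.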
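The grouped-data calculation can be verified with a short Python sketch that applies the frequency-weighted formulas to the midpoints 1, 3, 5, 7 and frequencies 3, 7, 6, 1. Variable names are illustrative.

```python
import math

midpoints = [1, 3, 5, 7]
frequencies = [3, 7, 6, 1]

# Sample size and the two weighted sums used by the formulas.
n = sum(frequencies)
sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))
sum_x2f = sum(x * x * f for x, f in zip(midpoints, frequencies))

# Approximate mean, sample variance, and sample standard deviation.
grouped_mean = sum_xf / n
grouped_variance = (sum_x2f - sum_xf ** 2 / n) / (n - 1)
grouped_std = math.sqrt(grouped_variance)

print(n, sum_xf, sum_x2f)
print(round(grouped_mean, 1), round(grouped_variance, 2), round(grouped_std, 1))
```

These are approximations, since every value in a class is replaced by the class midpoint.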

81 Summary The mean for grouped data Use the class midpoints Obtain an approximation for the mean The variance and standard deviation for grouped data Use the class midpoints Obtain an approximation for the variance and standard deviation

82 Example of Ungrouped Data Example: A survey of students in the first grade at a local school asked for the number of brothers and/or sisters for each child. The results are summarized in the table below. Here, we see 15 students responded 0 siblings, 17 students responded 1 sibling, etc. The total number of students in this survey is 62, which is $n = \sum f$. Find 1) the mean, 2) the variance, and 3) the standard deviation. Solutions: First compute the column sums $\sum f = 62$, $\sum xf = 93$, and $\sum x^2 f$. 1) $\bar{x} = 93/62 = 1.5$ 2) $s^2 = \frac{\sum x^2 f - (93)^2/62}{62 - 1} = 1.63$ 3) $s = \sqrt{1.63} = 1.28$

83 Measure of Position Percentiles, Quartiles

84 Measures of Position Measures of position are used to describe the relative location of an observation within a data set. Quartiles and percentiles are two of the most popular measures of position Quartiles are part of the 5-number summary

85 Percentile The median divides the lower 50% of the data from the upper 50% The median is the 50 th percentile If a number divides the lower 34% of the data from the upper 66%, that number is the 34 th percentile

86 Quartiles Quartiles divide the data set into four equal parts The quartiles are the 25 th, 50 th, and 75 th percentiles Q 1 = 25 th percentile Q 2 = 50 th percentile = median Q 3 = 75 th percentile Quartiles are the most commonly used percentiles The 50 th percentile and the second quartile Q 2 are both other ways of defining the median

87 How to Find Quartiles? 1. Order the data from smallest to largest. 2. Find the median $Q_2$. 3. The first quartile ($Q_1$) is then the median of the lower half of the data; that is, it is the median of the data falling below the median ($Q_2$) position (and not including $Q_2$). 4. The third quartile ($Q_3$) is the median of the upper half of the data; that is, it is the median of the data falling above the $Q_2$ position (not including $Q_2$). Note: Excel uses a different set of rules to compute these quartiles than the TI graphing calculator, which follows the rules stated above. So, different software may give different quartiles, particularly if the sample size is an odd number. However, for a large data set, the values are often not much different. In our class, we will only follow the rules stated here.
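The four steps above can be sketched in Python as follows. The function name and the two small data sets are hypothetical; the sketch implements only the median-of-halves rule stated here, so library routines that interpolate percentiles may return different values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule.

    When n is odd, the middle value (Q2) is excluded from both halves.
    Assumes at least two data values.
    """
    ranked = sorted(values)
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]            # values below the Q2 position
    upper = ranked[half + n % 2:]    # values above the Q2 position
    return median(lower), median(ranked), median(upper)

print(quartiles([1, 2, 3, 4, 5]))      # odd n: halves are [1, 2] and [4, 5]
print(quartiles([1, 2, 3, 4, 5, 6]))   # even n: halves are [1, 2, 3] and [4, 5, 6]
```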

88 Example The following data represent the pH levels of a random sample of 20 swimming pools in a California town. Find the three quartiles. Solutions: 1) Median = $Q_2$ = the average of the 10th and 11th smallest values = 6.55 2) The first quartile = $Q_1$ = the median of the 10 values below the median = the average of the 5th and 6th smallest values = 6.0 3) The third quartile = $Q_3$ = the median of the 10 values above the median = the average of the 15th and 16th smallest values = 6.95

89 Outliers Extreme observations in the data are referred to as outliers Outliers should be investigated Outliers could be Chance occurrences Measurement errors Data entry errors Sampling errors Outliers are not necessarily invalid data

90 How To Detect Outliers? One way to check for outliers uses the quartiles. Outliers can be detected as values that are significantly too high or too low, based on the known spread. The fences used to identify outliers are: Lower fence = LF = $Q_1 - 1.5 \times \text{IQR}$ and Upper fence = UF = $Q_3 + 1.5 \times \text{IQR}$, where $\text{IQR} = Q_3 - Q_1$. Values less than the lower fence or more than the upper fence could be considered outliers.

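91 Example Is the value 54 an outlier? 1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54 Calculations: With 14 values, the lower half is the 7 smallest values and the upper half is the 7 largest values, so $Q_1 = 7$ and $Q_3 = 27$. IQR = 27 - 7 = 20. UF = $Q_3 + 1.5 \times \text{IQR}$ = 27 + 30 = 57. Using the fence rule, the value 54 is not an outlier.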
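A minimal Python sketch of the fence rule, assuming the usual 1.5 x IQR fences and quartiles computed as the medians of the lower and upper halves. The data set below is hypothetical and was chosen so that one value actually falls outside the fences.

```python
from statistics import median

def quartiles(values):
    """(Q1, Q2, Q3) by the median-of-halves rule (Q2 excluded when n is odd)."""
    ranked = sorted(values)
    n = len(ranked)
    half = n // 2
    return median(ranked[:half]), median(ranked), median(ranked[half + n % 2:])

def find_outliers(values):
    """Return (lower fence, upper fence, list of values outside the fences)."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return lower_fence, upper_fence, outliers

# Hypothetical data with one suspiciously large value.
data = [2, 4, 5, 6, 7, 8, 9, 30]
lower_fence, upper_fence, outliers = find_outliers(data)
print(lower_fence, upper_fence, outliers)
```

A flagged value is only a candidate for investigation; as noted earlier, outliers are not necessarily invalid data.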

92 Another Measure of the Spread Inter-quartile range (IQR)

93 Inter-quartile Range (IQR) The inter-quartile range (IQR) is the difference between the third and first quartiles: $\text{IQR} = Q_3 - Q_1$. The IQR is a resistant measurement of spread. Its value will not be affected easily by extremely large or small values in a data set, since the IQR covers only the middle 50% of values.

94 Another Graphical Tool to Summarize Data Five-number Summary & Boxplot

95 Five-number Summary The five-number summary is the collection of The smallest value The first quartile (Q 1 or P 25 ) The median (M or Q 2 or P 50 ) The third quartile (Q 3 or P 75 ) The largest value These five numbers give a concise description of the distribution of a variable

96 Why These Five Numbers? The median Information about the center of the data Resistant measure of a center The first quartile and the third quartile Information about the spread of the data Resistant measure of a spread The smallest value and the largest value Information about the tails of the data

97 Example Compute the five-number summary for the ordered data: 1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54 Calculations: The minimum = 1. $Q_1 = P_{25} = 7$. $M = Q_2 = P_{50} = (16 + 19) / 2 = 17.5$. $Q_3 = P_{75} = 27$. The maximum = 54. The five-number summary is 1, 7, 17.5, 27, 54.
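The five-number summary for this data set can be reproduced with the sketch below, which again assumes the median-of-halves rule for the quartiles. The function name is our own.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ranked = sorted(values)
    n = len(ranked)
    half = n // 2
    q1 = median(ranked[:half])           # median of the lower half
    q3 = median(ranked[half + n % 2:])   # median of the upper half
    return ranked[0], q1, median(ranked), q3, ranked[-1]

data = [1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54]
summary = five_number_summary(data)
iqr = summary[3] - summary[1]  # inter-quartile range, Q3 - Q1
print(summary, iqr)
```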

98 Boxplot The five-number summary can be illustrated using a graph called the boxplot An example of a (basic) boxplot is The middle box shows Q 1, Q 2, and Q 3 The horizontal lines (sometimes called whiskers ) show the minimum and maximum

99 How to Draw A Boxplot? To draw a (basic) boxplot: 1. Calculate the five-number summary. 2. Draw and scale a horizontal number line which will cover all the data from the minimum to the maximum. 3. Mark the 5 numbers on the number line according to the scale. 4. Mark the same five points again some distance above the number line. 5. Draw a box with the left edge at $Q_1$ and the right edge at $Q_3$. 6. Draw a line inside the box at $M = Q_2$. 7. Draw a horizontal line from the $Q_1$ edge of the box to the minimum and one from the $Q_3$ edge of the box to the maximum.

100 Example To draw a (basic) boxplot Draw the middle box Draw in the median Draw the minimum and maximum Voila!

101 A Modified Boxplot An example of a more sophisticated boxplot is The middle box shows Q 1, Q 2, and Q 3 The horizontal lines (sometimes called whiskers ) show the minimum and maximum The asterisk on the right shows an outlier (determined by using the upper fence)

102 How To Draw A Modified Boxplot? To draw a modified boxplot 1. Draw the center box and mark the median, as before 2. Compute the upper fence and the lower fence 3. Temporarily remove the outliers as identified by the upper fence and the lower fence (but we will add them back later with asterisks) 4. Draw the horizontal lines to the new minimum and new maximum (These are the minimum and maximum within the fence) 5. Mark each of the outliers with an asterisk Note: Sometimes, data contain no outliers. You will obtain a basic boxplot.

103 Example To draw this boxplot Draw the middle box and the median Draw in the fences, remove the outliers (temporarily) Draw the minimum and maximum Draw the outliers as asterisks

104 Interpret a Boxplot The distribution shape and boxplot are related Symmetry (or lack of symmetry) Quartiles Maximum and minimum Relate the distribution shape to the boxplot for Symmetric distributions Skewed left distributions Skewed right distributions

105 Symmetric Distribution Distribution: $Q_1$ is equally far from the median as $Q_3$ is. The min is equally far from the median as the max is. Boxplot: The median line is in the center of the box. The left whisker is equal to the right whisker.

106 Left-skewed Distribution Distribution: $Q_1$ is further from the median than $Q_3$ is. The min is further from the median than the max is. Boxplot: The median line is to the right of center in the box. The left whisker is longer than the right whisker.

107 Right-skewed Distribution Distribution: $Q_1$ is closer to the median than $Q_3$ is. The min is closer to the median than the max is. Boxplot: The median line is to the left of center in the box. The left whisker is shorter than the right whisker.

108 Side-by-side Boxplot We can compare two distributions by examining their boxplots We draw the boxplots on the same horizontal scale We can visually compare the centers We can visually compare the spreads We can visually compare the extremes

109 Example Comparing the flight with the control samples Center Spread

110 Summary 5-number summary: minimum, first quartile, median, third quartile, maximum. Resistant measures of center (median) and spread (interquartile range). Boxplots: a visual representation of the 5-number summary; related to the shape of the distribution; can be used to compare multiple distributions.

111 Using Technology for Statistics Instructions for the TI Graphing Calculator

112 Entering Data into TI Calculator Enter data in lists: Press STAT, then choose the EDIT menu. (We'll denote this sequence of keystrokes by STAT EDIT.) Enter the data one value at a time (press ENTER after each entry) in a blank column, which represents a variable (a list). Notes: 1. Clear a list: on the EDIT screen, use the up arrow to place the cursor on the list name, press CLEAR, then ENTER (that is, CLEAR ENTER). Always clear a list before entering a new set of data into it. Warning! Pressing the DEL key instead of CLEAR deletes the list from the calculator. You can get it back with the INS key; see "Insert a new list" below. 2. List name: there are six built-in lists, L1 through L6, and you can add more with your own names. You get the L1 symbol by pressing the 2ND key, then the 1 key [2ND 1]. (The brackets show the sequence of keys to press; here, press the 2ND key, then the 1 key.) 3. Insert a new list (optional): STAT EDIT, use the up arrow to place the cursor on a list name, then press INS [2ND DEL]. Type the name of the list using the alpha character keys; the ALPHA key is locked down for you. Press ENTER. The new list is placed just before the position where the cursor was. To obtain quick statistics, simply enter the data in one of the built-in lists L1 through L6; you do not need to create a new named list.

113 Obtain Numeric Measures from TI Calculator 1. After entering the data, return to the home screen by pressing QUIT [2ND MODE]. 2. Press the STAT key, select the CALC menu, choose operation number 1 (1-Var Stats), then press ENTER. Enter the name of the list, say L1. That is, STAT CALC 1 ENTER L1. Note: L1 is the default list. You do not need to enter it if the data are in L1.

114 Obtain Statistics from a Frequency Distribution Enter the values in one list, say L1, and their corresponding frequencies in another list, say L2. Then, STAT CALC 1 ENTER L1, L2. Note: You need to enter a comma and L2 after L1. The calculator uses the second list as the frequencies of the values in the first list when it calculates the statistics.

115 Example 1 Example: A random sample of students in a sixth grade class was selected. Their weights are given in the table below. Find the mean, variance, standard deviation, and 5-number summary for these data using the TI calculator. The output lists x̄, Σx = 2261, Σx², Sx, σx, n = 25, minX = 63, Q1 = 84, Med = 92, Q3 = 99, maxX = 112. Note: 1. Since these are sample data, we take Sx as the standard deviation. 2. You may need to press the arrow key on the calculator several times to view all of these statistics.

116 Example 2 Consider the grouped data we considered previously (columns: Class, Midpoint, Frequency). Use the TI calculator to obtain the statistics. The output lists x̄, Σx = 61, Σx² = 265, Sx, σx, n = 17, minX = 1, Q1 = 3, Med = 3, Q3 = 5, maxX = 7. Note: Here, the notations used in the calculator correspond to the notations used in the formulas for computing the mean, variance and standard deviation of a frequency distribution: n = Σf, Σx = Σxf, Σx² = Σx²f.
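The correspondence between the calculator output and the frequency-distribution formulas can be checked with a short Python sketch. The midpoints and frequencies below are hypothetical: they are not the table from the slide, only one made-up set that reproduces the totals n = 17, Σx = 61 and Σx² = 265 reported in the output. The function name `frequency_stats` is also an assumption for illustration.

```python
from math import sqrt


def frequency_stats(values, freqs):
    """Summary statistics for values weighted by their frequencies."""
    n = sum(freqs)                                           # n = sum of f
    sum_x = sum(x * f for x, f in zip(values, freqs))        # sum of x*f
    sum_x2 = sum(x * x * f for x, f in zip(values, freqs))   # sum of x^2*f
    mean = sum_x / n
    # Shortcut formulas: sample variance divides by n - 1,
    # population variance divides by n.
    sample_var = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    pop_var = (sum_x2 - sum_x ** 2 / n) / n
    return {
        "n": n,
        "sum_x": sum_x,
        "sum_x2": sum_x2,
        "mean": mean,
        "sample_sd": sqrt(sample_var),   # plays the role of Sx
        "pop_sd": sqrt(pop_var),         # plays the role of sigma x
    }


# Hypothetical midpoints and frequencies (not the original table).
midpoints = [1, 3, 5, 7]
frequencies = [3, 7, 6, 1]
stats = frequency_stats(midpoints, frequencies)
print(stats["n"], stats["sum_x"], stats["sum_x2"])  # 17 61 265
print(stats["mean"], stats["sample_sd"], stats["pop_sd"])
```

Passing the two lists together plays the same role as entering L1, L2 after 1-Var Stats: the second list tells the computation how many times each value in the first list occurs.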

Chapter 2 Describing, Exploring, and Comparing Data

Chapter 2 Describing, Exploring, and Comparing Data Slide 1 Chapter 2 Describing, Exploring, and Comparing Data Slide 2 2-1 Overview 2-2 Frequency Distributions 2-3 Visualizing Data 2-4 Measures of Center 2-5 Measures of Variation 2-6 Measures of Relative

More information

Vocabulary. 5-number summary Rule. Area principle. Bar chart. Boxplot. Categorical data condition. Categorical variable.

Vocabulary. 5-number summary Rule. Area principle. Bar chart. Boxplot. Categorical data condition. Categorical variable. 5-number summary 68-95-99.7 Rule Area principle Bar chart Bimodal Boxplot Case Categorical data Categorical variable Center Changing center and spread Conditional distribution Context Contingency table

More information

Chapter 3 - Displaying and Summarizing Quantitative Data

Chapter 3 - Displaying and Summarizing Quantitative Data Chapter 3 - Displaying and Summarizing Quantitative Data 3.1 Graphs for Quantitative Data (LABEL GRAPHS) August 25, 2014 Histogram (p. 44) - Graph that uses bars to represent different frequencies or relative

More information

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order. Chapter 2 2.1 Descriptive Statistics A stem-and-leaf graph, also called a stemplot, allows for a nice overview of quantitative data without losing information on individual observations. It can be a good

More information

CHAPTER 2: SAMPLING AND DATA

CHAPTER 2: SAMPLING AND DATA CHAPTER 2: SAMPLING AND DATA This presentation is based on material and graphs from Open Stax and is copyrighted by Open Stax and Georgia Highlands College. OUTLINE 2.1 Stem-and-Leaf Graphs (Stemplots),

More information

Averages and Variation

Averages and Variation Averages and Variation 3 Copyright Cengage Learning. All rights reserved. 3.1-1 Section 3.1 Measures of Central Tendency: Mode, Median, and Mean Copyright Cengage Learning. All rights reserved. 3.1-2 Focus

More information

UNIT 1A EXPLORING UNIVARIATE DATA

UNIT 1A EXPLORING UNIVARIATE DATA A.P. STATISTICS E. Villarreal Lincoln HS Math Department UNIT 1A EXPLORING UNIVARIATE DATA LESSON 1: TYPES OF DATA Here is a list of important terms that we must understand as we begin our study of statistics

More information

STP 226 ELEMENTARY STATISTICS NOTES PART 2 - DESCRIPTIVE STATISTICS CHAPTER 3 DESCRIPTIVE MEASURES

STP 226 ELEMENTARY STATISTICS NOTES PART 2 - DESCRIPTIVE STATISTICS CHAPTER 3 DESCRIPTIVE MEASURES STP 6 ELEMENTARY STATISTICS NOTES PART - DESCRIPTIVE STATISTICS CHAPTER 3 DESCRIPTIVE MEASURES Chapter covered organizing data into tables, and summarizing data with graphical displays. We will now use

More information

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures STA 2023 Module 3 Descriptive Measures Learning Objectives Upon completing this module, you should be able to: 1. Explain the purpose of a measure of center. 2. Obtain and interpret the mean, median, and

More information

15 Wyner Statistics Fall 2013

15 Wyner Statistics Fall 2013 15 Wyner Statistics Fall 2013 CHAPTER THREE: CENTRAL TENDENCY AND VARIATION Summary, Terms, and Objectives The two most important aspects of a numerical data set are its central tendencies and its variation.

More information

Chapter 6: DESCRIPTIVE STATISTICS

Chapter 6: DESCRIPTIVE STATISTICS Chapter 6: DESCRIPTIVE STATISTICS Random Sampling Numerical Summaries Stem-n-Leaf plots Histograms, and Box plots Time Sequence Plots Normal Probability Plots Sections 6-1 to 6-5, and 6-7 Random Sampling

More information

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency Math 1 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency lowest value + highest value midrange The word average: is very ambiguous and can actually refer to the mean,

More information

2.1: Frequency Distributions and Their Graphs

2.1: Frequency Distributions and Their Graphs 2.1: Frequency Distributions and Their Graphs Frequency Distribution - way to display data that has many entries - table that shows classes or intervals of data entries and the number of entries in each

More information

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES Chapter 2 2.1 Objectives 2.1 What Are the Types of Data? www.managementscientist.org 1. Know the definitions of a. Variable b. Categorical versus quantitative

More information

STA Module 2B Organizing Data and Comparing Distributions (Part II)

STA Module 2B Organizing Data and Comparing Distributions (Part II) STA 2023 Module 2B Organizing Data and Comparing Distributions (Part II) Learning Objectives Upon completing this module, you should be able to 1 Explain the purpose of a measure of center 2 Obtain and

More information

STA Learning Objectives. Learning Objectives (cont.) Module 2B Organizing Data and Comparing Distributions (Part II)

STA Learning Objectives. Learning Objectives (cont.) Module 2B Organizing Data and Comparing Distributions (Part II) STA 2023 Module 2B Organizing Data and Comparing Distributions (Part II) Learning Objectives Upon completing this module, you should be able to 1 Explain the purpose of a measure of center 2 Obtain and

More information

Table of Contents (As covered from textbook)

Table of Contents (As covered from textbook) Table of Contents (As covered from textbook) Ch 1 Data and Decisions Ch 2 Displaying and Describing Categorical Data Ch 3 Displaying and Describing Quantitative Data Ch 4 Correlation and Linear Regression

More information

Unit 7 Statistics. AFM Mrs. Valentine. 7.1 Samples and Surveys

Unit 7 Statistics. AFM Mrs. Valentine. 7.1 Samples and Surveys Unit 7 Statistics AFM Mrs. Valentine 7.1 Samples and Surveys v Obj.: I will understand the different methods of sampling and studying data. I will be able to determine the type used in an example, and

More information

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data. 1 CHAPTER 1 Introduction Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data. Variable: Any characteristic of a person or thing that can be expressed

More information

Measures of Central Tendency

Measures of Central Tendency Page of 6 Measures of Central Tendency A measure of central tendency is a value used to represent the typical or average value in a data set. The Mean The sum of all data values divided by the number of

More information

AND NUMERICAL SUMMARIES. Chapter 2

AND NUMERICAL SUMMARIES. Chapter 2 EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES Chapter 2 2.1 What Are the Types of Data? 2.1 Objectives www.managementscientist.org 1. Know the definitions of a. Variable b. Categorical versus quantitative

More information

Descriptive Statistics

Descriptive Statistics Chapter 2 Descriptive Statistics 2.1 Descriptive Statistics 1 2.1.1 Student Learning Objectives By the end of this chapter, the student should be able to: Display data graphically and interpret graphs:

More information

CHAPTER 3: Data Description

CHAPTER 3: Data Description CHAPTER 3: Data Description You ve tabulated and made pretty pictures. Now what numbers do you use to summarize your data? Ch3: Data Description Santorico Page 68 You ll find a link on our website to a

More information

To calculate the arithmetic mean, sum all the values and divide by n (equivalently, multiple 1/n): 1 n. = 29 years.

To calculate the arithmetic mean, sum all the values and divide by n (equivalently, multiple 1/n): 1 n. = 29 years. 3: Summary Statistics Notation Consider these 10 ages (in years): 1 4 5 11 30 50 8 7 4 5 The symbol n represents the sample size (n = 10). The capital letter X denotes the variable. x i represents the

More information

a. divided by the. 1) Always round!! a) Even if class width comes out to a, go up one.

a. divided by the. 1) Always round!! a) Even if class width comes out to a, go up one. Probability and Statistics Chapter 2 Notes I Section 2-1 A Steps to Constructing Frequency Distributions 1 Determine number of (may be given to you) a Should be between and classes 2 Find the Range a The

More information

Name Date Types of Graphs and Creating Graphs Notes

Name Date Types of Graphs and Creating Graphs Notes Name Date Types of Graphs and Creating Graphs Notes Graphs are helpful visual representations of data. Different graphs display data in different ways. Some graphs show individual data, but many do not.

More information

No. of blue jelly beans No. of bags

No. of blue jelly beans No. of bags Math 167 Ch5 Review 1 (c) Janice Epstein CHAPTER 5 EXPLORING DATA DISTRIBUTIONS A sample of jelly bean bags is chosen and the number of blue jelly beans in each bag is counted. The results are shown in

More information

TMTH 3360 NOTES ON COMMON GRAPHS AND CHARTS

TMTH 3360 NOTES ON COMMON GRAPHS AND CHARTS To Describe Data, consider: Symmetry Skewness TMTH 3360 NOTES ON COMMON GRAPHS AND CHARTS Unimodal or bimodal or uniform Extreme values Range of Values and mid-range Most frequently occurring values In

More information

Measures of Central Tendency

Measures of Central Tendency Measures of Central Tendency MATH 130, Elements of Statistics I J. Robert Buchanan Department of Mathematics Fall 2017 Introduction Measures of central tendency are designed to provide one number which

More information

Univariate Statistics Summary

Univariate Statistics Summary Further Maths Univariate Statistics Summary Types of Data Data can be classified as categorical or numerical. Categorical data are observations or records that are arranged according to category. For example:

More information

CHAPTER 2 DESCRIPTIVE STATISTICS

CHAPTER 2 DESCRIPTIVE STATISTICS CHAPTER 2 DESCRIPTIVE STATISTICS 1. Stem-and-Leaf Graphs, Line Graphs, and Bar Graphs The distribution of data is how the data is spread or distributed over the range of the data values. This is one of

More information

AP Statistics Summer Assignment:

AP Statistics Summer Assignment: AP Statistics Summer Assignment: Read the following and use the information to help answer your summer assignment questions. You will be responsible for knowing all of the information contained in this

More information

Measures of Central Tendency. A measure of central tendency is a value used to represent the typical or average value in a data set.

Measures of Central Tendency. A measure of central tendency is a value used to represent the typical or average value in a data set. Measures of Central Tendency A measure of central tendency is a value used to represent the typical or average value in a data set. The Mean the sum of all data values divided by the number of values in

More information

Data can be in the form of numbers, words, measurements, observations or even just descriptions of things.

Data can be in the form of numbers, words, measurements, observations or even just descriptions of things. + What is Data? Data is a collection of facts. Data can be in the form of numbers, words, measurements, observations or even just descriptions of things. In most cases, data needs to be interpreted and

More information

Exploratory Data Analysis

Exploratory Data Analysis Chapter 10 Exploratory Data Analysis Definition of Exploratory Data Analysis (page 410) Definition 12.1. Exploratory data analysis (EDA) is a subfield of applied statistics that is concerned with the investigation

More information

Frequency Distributions

Frequency Distributions Displaying Data Frequency Distributions After collecting data, the first task for a researcher is to organize and summarize the data so that it is possible to get a general overview of the results. Remember,

More information

Chpt 3. Data Description. 3-2 Measures of Central Tendency /40

Chpt 3. Data Description. 3-2 Measures of Central Tendency /40 Chpt 3 Data Description 3-2 Measures of Central Tendency 1 /40 Chpt 3 Homework 3-2 Read pages 96-109 p109 Applying the Concepts p110 1, 8, 11, 15, 27, 33 2 /40 Chpt 3 3.2 Objectives l Summarize data using

More information

Section 2-2 Frequency Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc

Section 2-2 Frequency Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc Section 2-2 Frequency Distributions Copyright 2010, 2007, 2004 Pearson Education, Inc. 2.1-1 Frequency Distribution Frequency Distribution (or Frequency Table) It shows how a data set is partitioned among

More information

Chapter 2 Modeling Distributions of Data

Chapter 2 Modeling Distributions of Data Chapter 2 Modeling Distributions of Data Section 2.1 Describing Location in a Distribution Describing Location in a Distribution Learning Objectives After this section, you should be able to: FIND and

More information

Chapter 2: Descriptive Statistics

Chapter 2: Descriptive Statistics Chapter 2: Descriptive Statistics Student Learning Outcomes By the end of this chapter, you should be able to: Display data graphically and interpret graphs: stemplots, histograms and boxplots. Recognize,

More information

Chapter 3: Describing, Exploring & Comparing Data

Chapter 3: Describing, Exploring & Comparing Data Chapter 3: Describing, Exploring & Comparing Data Section Title Notes Pages 1 Overview 1 2 Measures of Center 2 5 3 Measures of Variation 6 12 4 Measures of Relative Standing & Boxplots 13 16 3.1 Overview

More information

1.2. Pictorial and Tabular Methods in Descriptive Statistics

1.2. Pictorial and Tabular Methods in Descriptive Statistics 1.2. Pictorial and Tabular Methods in Descriptive Statistics Section Objectives. 1. Stem-and-Leaf displays. 2. Dotplots. 3. Histogram. Types of histogram shapes. Common notation. Sample size n : the number

More information

September 11, Unit 2 Day 1 Notes Measures of Central Tendency.notebook

September 11, Unit 2 Day 1 Notes Measures of Central Tendency.notebook Measures of Central Tendency: Mean, Median, Mode and Midrange A Measure of Central Tendency is a value that represents a typical or central entry of a data set. Four most commonly used measures of central

More information

Chapter 2 - Graphical Summaries of Data

Chapter 2 - Graphical Summaries of Data Chapter 2 - Graphical Summaries of Data Data recorded in the sequence in which they are collected and before they are processed or ranked are called raw data. Raw data is often difficult to make sense

More information

+ Statistical Methods in

+ Statistical Methods in 9/4/013 Statistical Methods in Practice STA/MTH 379 Dr. A. B. W. Manage Associate Professor of Mathematics & Statistics Department of Mathematics & Statistics Sam Houston State University Discovering Statistics

More information

Basic Statistical Terms and Definitions

Basic Statistical Terms and Definitions I. Basics Basic Statistical Terms and Definitions Statistics is a collection of methods for planning experiments, and obtaining data. The data is then organized and summarized so that professionals can

More information

Slide Copyright 2005 Pearson Education, Inc. SEVENTH EDITION and EXPANDED SEVENTH EDITION. Chapter 13. Statistics Sampling Techniques

Slide Copyright 2005 Pearson Education, Inc. SEVENTH EDITION and EXPANDED SEVENTH EDITION. Chapter 13. Statistics Sampling Techniques SEVENTH EDITION and EXPANDED SEVENTH EDITION Slide - Chapter Statistics. Sampling Techniques Statistics Statistics is the art and science of gathering, analyzing, and making inferences from numerical information

More information

Organizing and Summarizing Data

Organizing and Summarizing Data 1 Organizing and Summarizing Data Key Definitions Frequency Distribution: This lists each category of data and how often they occur. : The percent of observations within the one of the categories. This

More information

AP Statistics Prerequisite Packet

AP Statistics Prerequisite Packet Types of Data Quantitative (or measurement) Data These are data that take on numerical values that actually represent a measurement such as size, weight, how many, how long, score on a test, etc. For these

More information

CHAPTER 2: DESCRIPTIVE STATISTICS Lecture Notes for Introductory Statistics 1. Daphne Skipper, Augusta University (2016)

CHAPTER 2: DESCRIPTIVE STATISTICS Lecture Notes for Introductory Statistics 1. Daphne Skipper, Augusta University (2016) CHAPTER 2: DESCRIPTIVE STATISTICS Lecture Notes for Introductory Statistics 1 Daphne Skipper, Augusta University (2016) 1. Stem-and-Leaf Graphs, Line Graphs, and Bar Graphs The distribution of data is

More information

Measures of Position. 1. Determine which student did better

Measures of Position. 1. Determine which student did better Measures of Position z-score (standard score) = number of standard deviations that a given value is above or below the mean (Round z to two decimal places) Sample z -score x x z = s Population z - score

More information

Measures of Dispersion

Measures of Dispersion Measures of Dispersion 6-3 I Will... Find measures of dispersion of sets of data. Find standard deviation and analyze normal distribution. Day 1: Dispersion Vocabulary Measures of Variation (Dispersion

More information

Understanding and Comparing Distributions. Chapter 4

Understanding and Comparing Distributions. Chapter 4 Understanding and Comparing Distributions Chapter 4 Objectives: Boxplot Calculate Outliers Comparing Distributions Timeplot The Big Picture We can answer much more interesting questions about variables

More information

MAT 142 College Mathematics. Module ST. Statistics. Terri Miller revised July 14, 2015

MAT 142 College Mathematics. Module ST. Statistics. Terri Miller revised July 14, 2015 MAT 142 College Mathematics Statistics Module ST Terri Miller revised July 14, 2015 2 Statistics Data Organization and Visualization Basic Terms. A population is the set of all objects under study, a sample

More information

Chapter2 Description of samples and populations. 2.1 Introduction.

Chapter2 Description of samples and populations. 2.1 Introduction. Chapter2 Description of samples and populations. 2.1 Introduction. Statistics=science of analyzing data. Information collected (data) is gathered in terms of variables (characteristics of a subject that

More information

Name: Date: Period: Chapter 2. Section 1: Describing Location in a Distribution

Name: Date: Period: Chapter 2. Section 1: Describing Location in a Distribution Name: Date: Period: Chapter 2 Section 1: Describing Location in a Distribution Suppose you earned an 86 on a statistics quiz. The question is: should you be satisfied with this score? What if it is the

More information

This chapter will show how to organize data and then construct appropriate graphs to represent the data in a concise, easy-to-understand form.

This chapter will show how to organize data and then construct appropriate graphs to represent the data in a concise, easy-to-understand form. CHAPTER 2 Frequency Distributions and Graphs Objectives Organize data using frequency distributions. Represent data in frequency distributions graphically using histograms, frequency polygons, and ogives.

More information

1 Overview of Statistics; Essential Vocabulary

1 Overview of Statistics; Essential Vocabulary 1 Overview of Statistics; Essential Vocabulary Statistics: the science of collecting, organizing, analyzing, and interpreting data in order to make decisions Population and sample Population: the entire

More information

Statistical Methods. Instructor: Lingsong Zhang. Any questions, ask me during the office hour, or me, I will answer promptly.

Statistical Methods. Instructor: Lingsong Zhang. Any questions, ask me during the office hour, or  me, I will answer promptly. Statistical Methods Instructor: Lingsong Zhang 1 Issues before Class Statistical Methods Lingsong Zhang Office: Math 544 Email: lingsong@purdue.edu Phone: 765-494-7913 Office Hour: Monday 1:00 pm - 2:00

More information

CHAPTER-13. Mining Class Comparisons: Discrimination between DifferentClasses: 13.4 Class Description: Presentation of Both Characterization and

CHAPTER-13. Mining Class Comparisons: Discrimination between DifferentClasses: 13.4 Class Description: Presentation of Both Characterization and CHAPTER-13 Mining Class Comparisons: Discrimination between DifferentClasses: 13.1 Introduction 13.2 Class Comparison Methods and Implementation 13.3 Presentation of Class Comparison Descriptions 13.4

More information

STA 570 Spring Lecture 5 Tuesday, Feb 1

STA 570 Spring Lecture 5 Tuesday, Feb 1 STA 570 Spring 2011 Lecture 5 Tuesday, Feb 1 Descriptive Statistics Summarizing Univariate Data o Standard Deviation, Empirical Rule, IQR o Boxplots Summarizing Bivariate Data o Contingency Tables o Row

More information

LESSON 3: CENTRAL TENDENCY

LESSON 3: CENTRAL TENDENCY LESSON 3: CENTRAL TENDENCY Outline Arithmetic mean, median and mode Ungrouped data Grouped data Percentiles, fractiles, and quartiles Ungrouped data Grouped data 1 MEAN Mean is defined as follows: Sum

More information

Overview. Frequency Distributions. Chapter 2 Summarizing & Graphing Data. Descriptive Statistics. Inferential Statistics. Frequency Distribution

Overview. Frequency Distributions. Chapter 2 Summarizing & Graphing Data. Descriptive Statistics. Inferential Statistics. Frequency Distribution Chapter 2 Summarizing & Graphing Data Slide 1 Overview Descriptive Statistics Slide 2 A) Overview B) Frequency Distributions C) Visualizing Data summarize or describe the important characteristics of a

More information

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 2.1- #

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 2.1- # Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola Chapter 2 Summarizing and Graphing Data 2-1 Review and Preview 2-2 Frequency Distributions 2-3 Histograms

More information

Chapter 1. Looking at Data-Distribution

Chapter 1. Looking at Data-Distribution Chapter 1. Looking at Data-Distribution Statistics is the scientific discipline that provides methods to draw right conclusions: 1)Collecting the data 2)Describing the data 3)Drawing the conclusions Raw

More information

STP 226 ELEMENTARY STATISTICS NOTES

STP 226 ELEMENTARY STATISTICS NOTES ELEMENTARY STATISTICS NOTES PART 2 - DESCRIPTIVE STATISTICS CHAPTER 2 ORGANIZING DATA Descriptive Statistics - include methods for organizing and summarizing information clearly and effectively. - classify

More information

Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs.

Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs. 1 2 Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs. 2. How to construct (in your head!) and interpret confidence intervals.

More information

MATH 1070 Introductory Statistics Lecture notes Descriptive Statistics and Graphical Representation

MATH 1070 Introductory Statistics Lecture notes Descriptive Statistics and Graphical Representation MATH 1070 Introductory Statistics Lecture notes Descriptive Statistics and Graphical Representation Objectives: 1. Learn the meaning of descriptive versus inferential statistics 2. Identify bar graphs,

More information

1.3 Graphical Summaries of Data

1.3 Graphical Summaries of Data Arkansas Tech University MATH 3513: Applied Statistics I Dr. Marcel B. Finan 1.3 Graphical Summaries of Data In the previous section we discussed numerical summaries of either a sample or a data. In this

More information

The basic arrangement of numeric data is called an ARRAY. Array is the derived data from fundamental data Example :- To store marks of 50 student

The basic arrangement of numeric data is called an ARRAY. Array is the derived data from fundamental data Example :- To store marks of 50 student Organizing data Learning Outcome 1. make an array 2. divide the array into class intervals 3. describe the characteristics of a table 4. construct a frequency distribution table 5. constructing a composite

More information

Chapter Two: Descriptive Methods 1/50

Chapter Two: Descriptive Methods 1/50 Chapter Two: Descriptive Methods 1/50 2.1 Introduction 2/50 2.1 Introduction We previously said that descriptive statistics is made up of various techniques used to summarize the information contained

More information

Unit I Supplement OpenIntro Statistics 3rd ed., Ch. 1

Unit I Supplement OpenIntro Statistics 3rd ed., Ch. 1 Unit I Supplement OpenIntro Statistics 3rd ed., Ch. 1 KEY SKILLS: Organize a data set into a frequency distribution. Construct a histogram to summarize a data set. Compute the percentile for a particular

More information

CHAPTER 2 Modeling Distributions of Data

CHAPTER 2 Modeling Distributions of Data CHAPTER 2 Modeling Distributions of Data 2.2 Density Curves and Normal Distributions The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers Density Curves

More information

MATH& 146 Lesson 10. Section 1.6 Graphing Numerical Data

MATH& 146 Lesson 10. Section 1.6 Graphing Numerical Data MATH& 146 Lesson 10 Section 1.6 Graphing Numerical Data 1 Graphs of Numerical Data One major reason for constructing a graph of numerical data is to display its distribution, or the pattern of variability

More information

Chapter 3: Data Description - Part 3. Homework: Exercises 1-21 odd, odd, odd, 107, 109, 118, 119, 120, odd

Chapter 3: Data Description - Part 3. Homework: Exercises 1-21 odd, odd, odd, 107, 109, 118, 119, 120, odd Chapter 3: Data Description - Part 3 Read: Sections 1 through 5 pp 92-149 Work the following text examples: Section 3.2, 3-1 through 3-17 Section 3.3, 3-22 through 3.28, 3-42 through 3.82 Section 3.4,

More information

Downloaded from

Downloaded from UNIT 2 WHAT IS STATISTICS? Researchers deal with a large amount of data and have to draw dependable conclusions on the basis of data collected for the purpose. Statistics help the researchers in making

More information

Numerical Summaries of Data Section 14.3

Numerical Summaries of Data Section 14.3 MATH 11008: Numerical Summaries of Data Section 14.3 MEAN mean: The mean (or average) of a set of numbers is computed by determining the sum of all the numbers and dividing by the total number of observations.

More information

DAY 52 BOX-AND-WHISKER

DAY 52 BOX-AND-WHISKER DAY 52 BOX-AND-WHISKER VOCABULARY The Median is the middle number of a set of data when the numbers are arranged in numerical order. The Range of a set of data is the difference between the highest and

More information

MATH NATION SECTION 9 H.M.H. RESOURCES

MATH NATION SECTION 9 H.M.H. RESOURCES MATH NATION SECTION 9 H.M.H. RESOURCES SPECIAL NOTE: These resources were assembled to assist in student readiness for their upcoming Algebra 1 EOC. Although these resources have been compiled for your

More information

Raw Data is data before it has been arranged in a useful manner or analyzed using statistical techniques.

Raw Data is data before it has been arranged in a useful manner or analyzed using statistical techniques. Section 2.1 - Introduction Graphs are commonly used to organize, summarize, and analyze collections of data. Using a graph to visually present a data set makes it easy to comprehend and to describe the

More information

Lecture Notes 3: Data summarization

Lecture Notes 3: Data summarization Lecture Notes 3: Data summarization Highlights: Average Median Quartiles 5-number summary (and relation to boxplots) Outliers Range & IQR Variance and standard deviation Determining shape using mean &

More information
