Chapter 2: Descriptive Statistics: Tabular and Graphical Presentations
Frequency Distributions

Frequency distribution: a tabular summary of data showing the number of items that appear in each of several non-overlapping classes.

Ex: red, red, blue, red, orange, purple, green, red, blue, red, blue, blue

x        f
Blue     4
Green    1
Orange   1
Purple   1
Red      5

Ex2: 0, 1, 2, 1, 1, 1, 2, 1, 2, 0, 0, 4, 1, 0, 2, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1

x   f
0   10
1   10
2   4
3   0
4   1

Ex3: 17, 19, 18, 22, 26, 41, 21, 18, 18, 24, 30, 21, 18, 18, 19, 21, 18, 20

class   f
17-21   13
22-26   3
27-31   1
32-36   0
37-41   1
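The counting behind these tables can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the helper name `class_frequencies` and its parameters are assumptions, and it uses the Ex and Ex3 data exactly as listed above.

```python
from collections import Counter

# Ex: categorical data, one frequency per distinct category.
colors = ["red", "red", "blue", "red", "orange", "purple",
          "green", "red", "blue", "red", "blue", "blue"]
color_freq = Counter(colors)

# Ex3: quantitative data grouped into equal-width, non-overlapping classes.
ex3 = [17, 19, 18, 22, 26, 41, 21, 18, 18, 24, 30, 21, 18, 18, 19, 21, 18, 20]


def class_frequencies(values, lowest_limit, width, n_classes):
    """Count integer values in consecutive classes of equal width.

    Hypothetical helper: each class runs from its lower limit to
    lower limit + width - 1, inclusive.
    """
    freq = {}
    for k in range(n_classes):
        lower = lowest_limit + k * width
        upper = lower + width - 1
        freq[(lower, upper)] = sum(lower <= v <= upper for v in values)
    return freq


ex3_freq = class_frequencies(ex3, lowest_limit=17, width=5, n_classes=5)

for category in sorted(color_freq):
    print(category, color_freq[category])
for (lower, upper), f in ex3_freq.items():
    print(f"{lower}-{upper}", f)
```

Note that the empty class 32-36 still gets a row with frequency 0, as in the table above.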
Relative and Percent Frequency Distributions

Relative frequency = frequency / number of values
Percent frequency = relative frequency expressed as a percent

Ex:
x        f   Relative frequency   Percent frequency
Blue     4   4/12                 33.3
Green    1   1/12                 8.3
Orange   1   1/12                 8.3
Purple   1   1/12                 8.3
Red      5   5/12                 41.7

Ex2:
x   f    Relative frequency   Percent frequency
0   10   0.4                  40
1   10   0.4                  40
2   4    0.16                 16
3   0    0                    0
4   1    0.04                 4
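As a rough check of the two definitions, the sketch below recomputes relative and percent frequencies for the Ex and Ex2 data. The function names are illustrative assumptions, not anything from the textbook.

```python
from collections import Counter


def relative_frequencies(values):
    """Relative frequency = frequency / number of values."""
    counts = Counter(values)
    n = len(values)
    return {x: f / n for x, f in counts.items()}


def percent_frequencies(values):
    """Percent frequency = relative frequency expressed as a percent."""
    return {x: 100 * rel for x, rel in relative_frequencies(values).items()}


colors = ["red", "red", "blue", "red", "orange", "purple",
          "green", "red", "blue", "red", "blue", "blue"]
ex2 = [0, 1, 2, 1, 1, 1, 2, 1, 2, 0, 0, 4, 1, 0, 2, 1, 1, 0, 0, 0, 0,
       1, 0, 0, 1]

rel_colors = relative_frequencies(colors)
pct_colors = percent_frequencies(colors)
rel_ex2 = relative_frequencies(ex2)
pct_ex2 = percent_frequencies(ex2)

# Values that never occur (such as 3 in Ex2) have no entry here;
# their relative frequency is 0.
for x in sorted(rel_ex2):
    print(x, round(rel_ex2[x], 2), round(pct_ex2[x], 1))
```

The relative frequencies of a data set always sum to 1, and the percent frequencies to 100, which is a quick way to catch a counting mistake.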
Graphical Representations for Categorical Data

Bar chart: the bars don't touch (start the vertical axis at 0).

x        f
Blue     4
Green    1
Orange   1
Purple   1
Red      5
Graphical Representations for Categorical Data

Pie chart

x        Relative frequency
Blue     33.3%
Green    8.3%
Orange   8.3%
Purple   8.3%
Red      41.7%
Creating Classes for Quantitative Data

Should have 5-20 classes.
Each class should be the same width.
Class width = difference between the lower limits of consecutive classes.
To determine what class width to use: (largest value - smallest value + 1) / number of classes you want
Class midpoint = (lower class limit + upper class limit) / 2
Frequency Distributions

Ex3: 17, 19, 18, 22, 26, 41, 21, 18, 18, 24, 30, 21, 18, 18, 19, 21, 18, 20

Class width? Class midpoints?

class midpoint     class   f
(17+21)/2 = 19     17-21   13
24                 22-26   3
29                 27-31   1
34                 32-36   0
39                 37-41   1
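The class-width and midpoint rules can be tried out on Ex3 with the sketch below. The function names are hypothetical, and rounding the suggested width up to a whole number is an assumption added here (the slides give only the quotient).

```python
import math


def suggested_class_width(values, n_classes):
    """(largest value - smallest value + 1) / number of classes.

    Assumption: round up so the classes cover every value.
    """
    return math.ceil((max(values) - min(values) + 1) / n_classes)


def class_limits(values, n_classes):
    """Lower and upper limits of equal-width classes starting at the minimum."""
    width = suggested_class_width(values, n_classes)
    lowest = min(values)
    return [(lowest + k * width, lowest + (k + 1) * width - 1)
            for k in range(n_classes)]


def class_midpoints(limits):
    """Class midpoint = (lower class limit + upper class limit) / 2."""
    return [(lower + upper) / 2 for lower, upper in limits]


ex3 = [17, 19, 18, 22, 26, 41, 21, 18, 18, 24, 30, 21, 18, 18, 19, 21, 18, 20]

width = suggested_class_width(ex3, n_classes=5)   # (41 - 17 + 1) / 5
limits = class_limits(ex3, n_classes=5)
midpoints = class_midpoints(limits)

print("class width:", width)
for (lower, upper), mid in zip(limits, midpoints):
    print(f"{lower}-{upper}", mid)
```

With five classes requested, this reproduces the classes and midpoints in the Ex3 table.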
Frequency Distributions

Ex4: 1, 1, 3, 5, 7, 7, 10, 15, 16, 16, 17, 21, 25, 31, 46, 47, 53, 62, 68, 77

Class widths? Class midpoints?
Graphical Representations for Quantitative Data

Histogram: like a bar chart, but the bars touch (start the vertical axis at 0).
Frequency histogram: graphical representation of a frequency distribution.
Relative frequency histogram: graphical representation of a relative frequency distribution.

Ex2:
x   f    Relative frequency
0   10   0.4
1   10   0.4
2   4    0.16
3   0    0
4   1    0.04

Ex3:
class midpoint   class   f
19               17-21   13
24               22-26   3
29               27-31   1
34               32-36   0
39               37-41   1
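For a quick look at the shape without graphing software, a frequency distribution can be printed as rows of symbols. This is a plain-text stand-in for a histogram, not a true one: the bars run sideways and the function name `text_histogram` is an assumption made for this sketch.

```python
def text_histogram(freq, symbol="*"):
    """Render {class label: frequency} as one row of symbols per class."""
    label_width = max(len(str(label)) for label in freq)
    lines = []
    for label, f in freq.items():
        # One symbol per observation; an empty class keeps its row.
        lines.append(f"{str(label):>{label_width}} | {symbol * f}".rstrip())
    return "\n".join(lines)


# Frequency distribution for Ex3, copied from the table above.
ex3_freq = {"17-21": 13, "22-26": 3, "27-31": 1, "32-36": 0, "37-41": 1}

print(text_histogram(ex3_freq))
```

Keeping the empty 32-36 row matters: dropping it would hide the gap between the bulk of the data and the value 41.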
Distribution Shapes

Uniform, symmetric, skewed (more to come)
Another Graphical Representation Option for Quantitative Data: Stem and Leaf

Stems are what the values in each class have in common (typically the 10s place, 100s place, etc.).
Leaves are what distinguish the values within a class (typically the last one or two digits).

Ex4: 1, 1, 3, 5, 7, 7, 10, 15, 16, 16, 17, 21, 25, 31, 46, 47, 53, 62, 68, 77

0 | 1 1 3 5 7 7
1 | 0 5 6 6 7
2 | 1 5
3 | 1
4 | 6 7
5 | 3
6 | 2 8
7 | 7

How does the shape of the stem and leaf plot compare to the frequency histogram?

Class   f
0-9     6
10-19   5
20-29   2
30-39   1
40-49   2
50-59   1
60-69   2
70-79   1

When is using a stem and leaf plot preferable to using a frequency histogram? Why?
Stem and Leaf

Ex5: 60, 67, 68, 70, 71, 71, 73, 74, 75, 76, 77, 77, 82, 84, 87, 93

6 | 0 7 8
7 | 0 1 1 3 4 5 6 7 7
8 | 2 4 7
9 | 3
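A stem and leaf plot for two-digit data can be built by splitting each value into its tens (stem) and ones (leaf). The sketch below assumes non-negative integers and a stem unit of 10; the function names are illustrative only.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return {stem: [leaves]} using stem = value // 10, leaf = value % 10.

    Assumes non-negative integers. Stems with no values are kept
    (with an empty leaf list) so gaps in the data stay visible.
    """
    leaves_by_stem = defaultdict(list)
    for v in sorted(values):
        leaves_by_stem[v // 10].append(v % 10)
    stems = range(min(leaves_by_stem), max(leaves_by_stem) + 1)
    return {stem: leaves_by_stem.get(stem, []) for stem in stems}


def format_stem_and_leaf(plot):
    """One 'stem | leaves' row per stem."""
    rows = []
    for stem, leaves in plot.items():
        rows.append(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}".rstrip())
    return "\n".join(rows)


ex4 = [1, 1, 3, 5, 7, 7, 10, 15, 16, 16, 17, 21, 25, 31, 46, 47, 53, 62,
       68, 77]
ex5 = [60, 67, 68, 70, 71, 71, 73, 74, 75, 76, 77, 77, 82, 84, 87, 93]

ex4_plot = stem_and_leaf(ex4)
ex5_plot = stem_and_leaf(ex5)

print(format_stem_and_leaf(ex4_plot))
print()
print(format_stem_and_leaf(ex5_plot))
```

The number of leaves on each stem equals the class frequency, so the rows trace out the same shape as a histogram with classes of width 10 while still showing every data value.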
When There Are Two Variables

Crosstabulation (pivot table)
Example from pages 54-55 in your textbook

Restaurant   Quality Rating   Meal Price ($)
1            Good             18
2            Very Good        22
3            Good             28
4            Excellent        38
5            Very Good        33
6            Good             28
7            Very Good        19
8            Very Good        11
9            Very Good        23
10           Good             13
11           Very Good        33
12           Very Good        44
13           Excellent        42
14           Excellent        34
15           Good             25

Quality Rating   Relative Frequency
Good             0.28
Very Good        0.50
Excellent        0.22

                 Meal Price ($)
Quality Rating   10-19   20-29   30-39   40-49   Total
Excellent        2       14      28      22      66
Very Good        34      64      46      6       150
Good             42      40      2       0       84
Total            78      118     76      28      300
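To see how a crosstabulation is assembled, the sketch below tallies only the 15 restaurants listed above, so its counts are much smaller than those in the 300-restaurant table. The helper `price_class` and the variable names are assumptions made for illustration.

```python
from collections import Counter

# (quality rating, meal price in $) for restaurants 1-15 as listed above.
restaurants = [
    ("Good", 18), ("Very Good", 22), ("Good", 28), ("Excellent", 38),
    ("Very Good", 33), ("Good", 28), ("Very Good", 19), ("Very Good", 11),
    ("Very Good", 23), ("Good", 13), ("Very Good", 33), ("Very Good", 44),
    ("Excellent", 42), ("Excellent", 34), ("Good", 25),
]


def price_class(price, width=10):
    """Label the $10-wide class a price falls in, e.g. 18 -> '10-19'."""
    lower = (price // width) * width
    return f"{lower}-{lower + width - 1}"


# Each cell counts restaurants with a given (rating, price class) pair.
crosstab = Counter((rating, price_class(price)) for rating, price in restaurants)
row_totals = Counter(rating for rating, _ in restaurants)
col_totals = Counter(price_class(price) for _, price in restaurants)

price_classes = ["10-19", "20-29", "30-39", "40-49"]
for rating in ["Excellent", "Very Good", "Good"]:
    cells = [crosstab[(rating, pc)] for pc in price_classes]
    print(rating, cells, row_totals[rating])
print("Total", [col_totals[pc] for pc in price_classes], len(restaurants))
```

The row totals and column totals are the frequency distributions of each variable on its own, which is why they appear in the margins of the table.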
With Cross Tabulations, Beware of Simpson's Paradox

Example below from p. 56 in the textbook (when there are more than two variables)

           Judge
Verdict    Luckett      Kendall      Total
Upheld     129 (86%)    110 (88%)    239
Reversed   21 (14%)     15 (12%)     36
Total      150 (100%)   125 (100%)   275

Judge Luckett
Verdict    Common Pleas   Municipal Court   Total
Upheld     29 (91%)       100 (85%)         129
Reversed   3 (9%)         18 (15%)          21
Total      32 (100%)      118 (100%)        150

Judge Kendall
Verdict    Common Pleas   Municipal Court   Total
Upheld     90 (90%)       20 (80%)          110
Reversed   10 (10%)       5 (20%)           15
Total      100 (100%)     25 (100%)         125
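The reversal in these tables can be checked directly from the counts. The sketch below is one way to do that; the dictionary layout and function names are assumptions made for illustration.

```python
# (upheld, reversed) counts per court, copied from the tables above.
luckett = {"Common Pleas": (29, 3), "Municipal Court": (100, 18)}
kendall = {"Common Pleas": (90, 10), "Municipal Court": (20, 5)}


def upheld_rate(upheld, reversed_):
    """Fraction of verdicts upheld."""
    return upheld / (upheld + reversed_)


def overall_rate(judge):
    """Upheld rate after aggregating over both courts."""
    upheld = sum(u for u, _ in judge.values())
    reversed_ = sum(r for _, r in judge.values())
    return upheld_rate(upheld, reversed_)


# Within each court, compare Luckett's rate with Kendall's.
by_court = {
    court: (upheld_rate(*luckett[court]), upheld_rate(*kendall[court]))
    for court in luckett
}
luckett_better_in_every_court = all(l > k for l, k in by_court.values())
kendall_better_overall = overall_rate(kendall) > overall_rate(luckett)

for court, (l, k) in by_court.items():
    print(f"{court}: Luckett {l:.0%}, Kendall {k:.0%}")
print(f"Overall: Luckett {overall_rate(luckett):.0%}, "
      f"Kendall {overall_rate(kendall):.0%}")
```

Both flags come out true: Luckett has the higher upheld rate in each court, yet Kendall has the higher rate in the aggregated table. The courts differ in upheld rate, and the two judges handled very different mixes of cases from them, so conclusions drawn from the aggregated crosstabulation alone can be misleading.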
When There Are Two Variables and Both Are Quantitative

Scatter diagram (with trendline): plot the ordered pairs (variable one value, variable two value), with quantitative variable one on the horizontal axis and quantitative variable two on the vertical axis.
Does there appear to be a relationship between the variables?
Correlation

Correlation: co-relation. The variables move together.
[Figures: strong positive linear correlation; strong negative linear correlation]