12 Statistics. Exercise Set 12-1


Exercise Set 12-1

1. Measurements or observations that are gathered for an event under study are called data.
2. The branch of mathematics that involves collecting, organizing, summarizing, and presenting data, and then drawing general conclusions from the data, is called statistics.
3. A population consists of all subjects under study; a sample is a representative subgroup, or subset, of the population.
4. Subjects in the population are numbered, and then they are selected according to corresponding random numbers.
5. Number each member of the population, and then select every kth member. The starting number must be selected at random.
6. Divide the population into groups whose members have similar characteristics. Select members from each group at random.
7. Use an existing group of subjects that represents the population.
8. They both group data into ranges of values.
9. Descriptive statistics are used to describe a set of data. These statistics are not used to draw conclusions about anything other than the data at hand. Inferential statistics involve techniques that are used to describe a population when you only have data from a sample of that population.
10. Descriptive statistics uses deductive reasoning, and inferential statistics uses inductive reasoning.
11. It is a cluster sample since an existing group of subjects that represents the population is used for a sample.
12. It is a systematic sample since every seventh customer is selected.
13. It is a random sample since each subject of the population has an equal chance of being selected.
14. It is a systematic sample since every hundredth hamburger is checked.
15. It is a stratified sample since the population is divided into groups and members from each group are randomly selected.
16. It is a random sample since every day of the year has an equal chance of being selected.
17. No; students who are well-off might be underrepresented.
18. Yes; the IDs are randomly chosen, and prison IDs are unlikely to be based on specific characteristics of the prisoner.
19. Yes; the target group is everyone who has a phone, and everyone surveyed obviously has a phone.
20. No; poor or homeless people are far less likely to have a phone.
21. No; the first five products on a shelf are likely to be older so that the store can get rid of older food first.
22. Yes; the color of an M&M shouldn't affect its weight.
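The four sampling methods described in the answers above can be sketched in Python. This is a minimal illustration, not part of the original solutions: the function names, the fixed seed, and the example population are all assumptions made for the sketch.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Random sample: every subject has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, k, seed=0):
    """Systematic sample: random starting point, then every kth member."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, per_group, seed=0):
    """Stratified sample: draw members at random from every group."""
    rng = random.Random(seed)
    picked = []
    for members in strata.values():
        picked.extend(rng.sample(members, per_group))
    return picked

def cluster_sample(clusters, seed=0):
    """Cluster sample: use one existing group as the whole sample."""
    rng = random.Random(seed)
    name = rng.choice(sorted(clusters))
    return clusters[name]

# Hypothetical population of 100 numbered subjects.
population = list(range(1, 101))
every_seventh = systematic_sample(population, 7)
```

Numbering the subjects first, as in answers 4 and 5, is what makes both the random and the systematic selection possible.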

23.
Rank    Frequency
Fr      18
So      1
Jr      6
Se      4

24.
Source  Frequency
I       13
N       5
R       3
T       4

25.
Show    Frequency
S       6
D       5
B       7
A       7

26. Class width = (highest value - lowest value) / (number of classes). Round this up to 66. Start with the lowest value and add 66 to get the lower class limits: 21, 87, 153, 219, 285, 351. Set up the classes by subtracting one from each lower class limit except the first; each result is the upper limit of the preceding class.
(Frequency table: Class, Tally, Frequency.)

27. Class width = (highest value - lowest value) / (number of classes), which leads to a width of 7. Start with the lowest value and add 7 to get the lower class limits: 27, 34, 41, 48, 55, 62, 69. Set up the classes by subtracting one from each lower class limit except the first; each result is the upper limit of the preceding class.
(Frequency table: Class, Tally, Frequency.)
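The class-building procedure described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the data are whole numbers, and the starting value 21 with width 66 is reconstructed from the list of lower class limits, all of which differ by 66. The highest and lowest values passed to `class_width` are made up, because the original ones did not survive.

```python
import math

def class_width(highest, lowest, num_classes):
    """Range divided by the number of classes, rounded up to a whole number."""
    return math.ceil((highest - lowest) / num_classes)

def build_classes(lowest, width, num_classes):
    """Return (lower limit, upper limit) pairs for whole-number data."""
    lowers = [lowest + i * width for i in range(num_classes)]
    # Each upper limit is one less than the next class's lower limit.
    return [(lo, lo + width - 1) for lo in lowers]

# Lower limits 21, 87, 153, 219, 285, 351, as in the first answer above.
classes = build_classes(21, 66, 6)

# Hypothetical data range, only to show the rounding-up step.
example_width = class_width(100, 21, 6)
```

For data recorded to one decimal place, the same idea applies with 0.1 subtracted instead of 1.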

28. Class width = (highest value - lowest value) / (number of classes) = difference / 6, which rounds up to 3.9. Start with the lowest value and add 3.9 to get the lower class limits: 24.5, 28.4, 32.3, 36.2, 40.1, and 44.0. Set up the classes by subtracting 0.1 from each lower class limit except the first; each result is the upper limit of the preceding class.
(Frequency table: Class, Tally, Frequency.)

29. The lower limit for the next class will be 60, so the class width is 60. Successively add 60 to get the lower limits: 0, 60, 120, 180, 240, 300, 360, and 420.
(Frequency table: Class, Tally, Frequency.)

30. Dividing (highest value - lowest value) by the number of classes leads to a class width of 97. Start with the lowest value and add 97 to get the lower limits: 5, 102, 199, 296, 393, 490, 587, 684. Set up the classes by subtracting one from each lower class limit except the first; each result is the upper limit of the preceding class.
(Frequency table: Class, Tally, Frequency.)

31. Start with the lowest value, 150, and add the class width, 1,127, to get the lower class limits: 150, 1,277, 2,404, 3,531, 4,658, 5,785, 6,912, 8,039, 9,166, 10,293. Set up the classes by subtracting one from each lower class limit except the first; each result is the upper limit of the preceding class.
Classes: 150-1,276; 1,277-2,403; 2,404-3,530; 3,531-4,657; 4,658-5,784; 5,785-6,911; 6,912-8,038; 8,039-9,165; 9,166-10,292; 10,293-11,419.

32. Class width = (combined highest - combined lowest) / (number of classes) = difference / 8, which rounds up to 31. Start with the lowest value and add 31 to get the lower class limits: 306, 337, 368, 399, 430, 461, 492, 523. Set up the classes by subtracting one from each lower class limit except the first; each result is the upper limit of the preceding class.
(Frequency table: Class; McGwire: Tally & Frequency; Sosa: Tally & Frequency.)

33. Class width = (highest value - lowest value) / (number of classes) = difference / 6, which rounds up to 188. Start with the lowest value and add 188 to get the lower class limits: 725, 913, 1,101, 1,289, 1,477, and 1,665. Set up the classes by subtracting one from each lower class limit except the first; each result is the upper limit of the preceding class.
Classes: 725-912; 913-1,100; 1,101-1,288; 1,289-1,476; 1,477-1,664; 1,665-1,852.

34. Arrange the data in order. (Note 3, 8, and 9 are written as 03, 08, and 09.) Separate the data according to the first digit. Use the first digits as stems and the second digits as leaves.
(Stem-and-leaf plot: Stems, Leaves.)
Analysis: The most number of calls were made by ten executives in the interval. The least were made by two executives in the group. The most common numbers were 1 calls and 14 calls made by three executives each.

8 35. Arrange the data in order. Separate the data according to the first digit. Use the first digits as stems and the second digits as leaves. Stems Leaves Analysis: Most registered vehicles per car stolen are in the range 80 89, while the least are in the 0 49 range and the most common are 84 and Arrange the data in order. Separate the data according to the first digit. Use the first digits as stems and the second digits as leaves. Stems Leaves Arrange the data in order. Separate the data according to the whole number. Stems Leaves 38. Arrange the data in order. Separate the data according to the whole number. Stems Leaves Inferential statistics is used since an inference is made. 40. Descriptive statistics is used since the average describes the data. 41. Inferential statistics is used since a prediction or inference is made. 4. Descriptive statistics is used since the total attendance describes the data. 43. Inferential statistics is used since an inference is made. 593

44. Inferential statistics is used since a prediction or inference is made.
45. Answers may vary.
46. Answers may vary.
47. Answers may vary.
48. Answers may vary.
49. Answers may vary.
50. Answers may vary.
51. The majority of states have taxes below $1.80, and just a handful are over $3.00. Explanations may vary.
52. The vast majority of cities have theft rates between 1 in and 1 in . Explanations may vary.

Exercise Set 12-2

1. Bar Graph: Label the horizontal axis with data values and the vertical axis with frequencies. Draw a bar the height of the corresponding frequency over the given data value. Pie Chart: Find the degree measures by dividing each frequency by the total number of data values and multiplying by 360. Then divide the pie chart up accordingly.
2. Histograms are continuous and represent the data without gaps. This is perfect for grouped data since the lower class boundary of one class is the upper class boundary of the previous class. Bar graphs and pie charts are best for categorical data since we are typically comparing individual items, which can be easily visualized with a bar or pie chart.
3. Histograms and frequency polygons both represent the frequency of grouped data. A histogram is similar to a bar graph, whereas a frequency polygon is similar to a line graph.
4. A time series graph is used to see how something changes over time.
5. Draw the bars with heights corresponding to the number of transplants. By far the most common type of transplant is kidney, while pancreas is easily the least common.
6. Draw the bars with heights corresponding to the number of students. While flu was the most common condition, the ailments were fairly spread out, with all but flu and ear infection having between 10 and 40 patients.
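The pie chart rule in the first answer of this exercise set (degrees = f / n × 360, percent = f / n × 100%) can be sketched as below. The function name and the three categories are hypothetical; they are not data from the exercises.

```python
def pie_slices(frequencies):
    """Map each category to its central angle in degrees and its percent."""
    n = sum(frequencies.values())
    return {
        category: {"degrees": f / n * 360, "percent": f / n * 100}
        for category, f in frequencies.items()
    }

# Hypothetical frequencies for three categories.
slices = pie_slices({"A": 10, "B": 30, "C": 60})
total_degrees = sum(s["degrees"] for s in slices.values())
```

The angles always add up to 360 and the percents to 100, which is a quick check on a hand-built table.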

10 7. Draw the bars with heights corresponding to the number of taxicabs. 8. Draw the bars with heights corresponding to the number of unemployed people. 9. Rank Frequency, f degrees f 360 n percent f n 100% Freshman 1 Sophomore 5 Junior 36 Senior % 13% % 8% % 40% % 19% 90 n Cause of death Number of deaths, f Heart disease 43 Cancer 7 Stroke 93 Accidents 4 Other 4 degrees f 360 n percent f n 100% , % 43.% 1, , %.7% 1, , % 9.3% 1, , %.4% 1, , %.4% 1,000 n 1,

11 f 11. Reason Number, f degrees 360 percent f 100% n n Interest in subject % 6% 100 Future earning Potential 18% Future earning potential Pressure from parents % 18% % 1% 100 Pressure from Parents 1% Good Job Prospects 8% Interest In Subject 6% Good job prospects % 8% n Major field Number, f degrees f 360 n percent f n 100% Preschool 893 Elementary 605 Middle 45 Secondary 1, , % 31.5%, , % 1.3%, , % 8.6%,839 1, ,839 1, % 38.6%,839 n, For the histogram, draw vertical bars corresponding to the frequencies for each class. Connect adjacent midpoints with straight lines. Finish the graph by drawing lines back to the horizontal at the beginning and end of the graph. For the frequency polygon, find the midpoints for each class: 94, 103, 11, 11, and 130. Label the horizontal axis with the midpoints. 596

14. For the histogram, draw vertical bars corresponding to the frequencies for each class.
(Histogram: Frequency vs. Years of Service.)
For the frequency polygon, find the midpoints for each class: 3, 8, 13, 18, 23, and 28. Label the horizontal axis with the midpoints. Connect adjacent midpoints with straight lines. Finish the graph by drawing lines back to the horizontal axis at the beginning and end of the graph.
(Frequency polygon: Frequency vs. Years of Service.)

15. For the histogram, draw vertical bars corresponding to the frequencies for each class.
(Histogram: Frequency vs. Miles per Gallon.)
For the frequency polygon, find the midpoints for each class: 10, 15, 20, 25, and 30. Label the horizontal axis with the midpoints. Connect adjacent midpoints with straight lines. Finish the graph by drawing lines back to the horizontal axis at the beginning and end of the graph.
(Frequency polygon: Frequency vs. Miles per Gallon.)

16. For the histogram, draw vertical bars corresponding to the frequencies for each class.
(Histogram: Frequency vs. Seconds.)

For the frequency polygon, find the midpoints for each class: 2.6, 3.3, 4.0, 4.7, 5.4, and 6.1. Label the horizontal axis with the midpoints. Connect adjacent midpoints with straight lines. Finish the graph by drawing lines back to the horizontal axis at the beginning and end of the graph.
(Frequency polygon: Frequency vs. Seconds.)
Analysis: The most-occurring times are from 2.3 to 3.6 seconds. Except for a neighborhood of 4.7 seconds, the frequency decreases fast from a neighborhood of 3.3 seconds to greater times.

17. Represent the years on the x axis and the number of people on the y axis, and then draw lines connecting the points.

18. Represent the years on the x axis and the number sold on the y axis, and then draw lines connecting the points.

19. Represent the years on the x axis and the downloads on the y axis, and then draw lines connecting the points. Downloads increased very steadily from 2004 to 2010, then decreased after 2010.

20. Represent the years on the x axis and the sales on the y axis, and then draw lines connecting the points. Restaurant sales grew dramatically from 1970 to 2012, with the rate of growth getting faster.

14 1. a) Frequency charts and histograms may vary. One with 7 classes is shown below b) Most states have less than 50,000 employed registered nurses, and very few have more than 100, a) Frequency charts and histograms may vary. One with 6 classes is shown below. b) The most likely high temperature in May is between 80 and 85 degrees; highs less than 75 are very unusual, and the 90s occur occasionally.. a) Frequency charts and histograms will vary. One with 7 classes is shown below. b) The average high for Honolulu in May is remarkably consistent, especially compared to Las Vegas. It was between 73 and 80 every day, with most days falling between 75 and a) Frequency charts and histograms may vary. One with 6 classes is shown below. b) More than half of all states have less than 3,000 employed LPNs, and very few have more than 6,000. The histogram shape is strikingly similar to the one for RNs; the major difference is the labeling. 5. Answers may vary. 6. Answers may vary. 7. Time series graph 8. Pie chart 9. Bar graph 30. Time series graph 31. Bar graph 3. Pie chart 33. Answers may vary. 34. The scale makes the increase look greater. 35. Answers may vary. 36. Answers may vary.. 599

Exercise Set 12-3

1. Answers may vary.
2. The mean is calculated by adding the data values and dividing by the total number of values.
3. The median is the middle value (or the mean of the middle two values) in a set of data when the data are in either ascending or descending order.
4. The mode of a data set is the value that occurs most often.
5. The midrange is found by averaging the highest and lowest data values.
6. The mode can be used to measure the average of categorical data since it doesn't depend on numerical values.
7. Answers may vary.
8. Answers may vary.
9. Answers may vary.
10. Answers may vary.

X 11. mean X n Arrange the data in order The median is the middle value: median 7. The value that occurs most often is 3: mode 3. L+ H midrange 31 X 1. mean X n Arrange the data in order The median is the middle value median 98. The value that occurs most often is 80. mode 80. L+ H midrange 139 X 13. mean X n , , , Since the data are in thousands, the mean is 61,600. Arrange the data in order ,350 1,380 The median is the middle value median 475. Since each value occurs only once, there is no mode. L+ H , 380 midrange
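The four measures defined in the answers above (mean, median, mode, midrange) can be sketched in Python. The helper names and the data set are hypothetical; the solutions' own data did not survive transcription. Returning an empty list when every value occurs once mirrors the "there is no mode" convention used in these solutions.

```python
from collections import Counter

def mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)

def median(values):
    """Middle value, or the mean of the two middle values."""
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

def modes(values):
    """Values that occur most often; empty if every value occurs once."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

def midrange(values):
    """Average of the lowest and highest values: (L + H) / 2."""
    return (min(values) + max(values)) / 2

# Hypothetical data set with two modes.
data = [3, 5, 5, 7, 9, 9, 11]
summary = (mean(data), median(data), modes(data), midrange(data))
```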

16 X mean X 9.54 n 35 Arrange the data in order The median is the middle value: median 10. There are 4 values that occur 6 times mode 7, 8, 10, and 11 L+ H midrange 10 X 9, 077 mean X, n 10 The data already is in order. The median is the average of the middle values,779 and,668.,779 +,668 median, 73.5 Since each data value occurs only once there is no mode. L+ H 4,313 +,394 midrange 3, X 8.65 mean X 4.35 n 19 Arrange the data in order The median is the middle value: median No data value occurs most often so there is no mode. L+ H midrange X 948 mean X n 5 Arrange the data in order The median is the middle value. median 151 Each data value occurs only once so there is no mode. L+ H midrange 07.5 X 175, 908 mean X 9, 58.3 n 19 Arrange the data in order. 1,364 1,976 1,99 3,15 3,31 3,56 3,831 3,916 5,599 6,908 9,010 9,90 10,901 11,900 1,817 13,807 14,134 7,400 30,45 The median is the middle value median 6,908 Each data value occurs only once so there is no mode. L+ H midrange 1, ,45 15,894.5 X 11 mean X 7.5 n 15 Arrange the data in order The median is the middle value. median 7 The data values 5 and 9 occurs most often. modes are 5 and 9 L+ H midrange

17 X 101, 476 mean X 14, n 7 Arrange the data in order. 11,047 11,970 14,009 15,105 16,111 16,1 17,11 The median is the middle value. median 15,105 Each data value occurs only once so there is no mode. L+ H midrange 11, ,11 14, X 69, 70 mean X 16, n 16 Arrange the data in order. 14,748 15,399 15,5 15,586 16,037 16,148 16,9 16,44 16,58 16,740 16,914 16,99 17,030 18,08 19,650 1,610 The median is the average of the two middle 16, , 58 values, median 16, 485. Each data value occurs only once so there is no mode. L+ H 14, ,610 midrange 18,179 X mean X 9.48 n 1 Arrange the data in order The median is the average of the two middle values, median The data value 8.6 occurs most often, so the mode is 8.6. L+ H midrange Class Frequency Midpoint Frequency Midpoint ,305 X ,305 Class Frequency Midpoint Frequency Midpoint ,005 X 5.15 mpg ,005 60

18 5. Class Frequency Midpoint Frequency Midpoint 7. Class Frequency Midpoint Frequency Midpoint X 4.4 seconds ,99 1,99 X 4.87 or $4.87 million Class Frequency Midpoint Frequency Midpoint 8. Class Frequency Midpoint Frequency Midpoint , , ,05.5 5, , , , , , , ,590 X 1, hours , ,183 1,183 X

19 Class Frequency Midpoint Frequency Midpoint , , , , , , 08 X $ ,08 Class Frequency Midpoint Frequency Midpoint ,779 X 3.7 days Answers may vary. 3. (a) Mode (b) Mode (c) Median 33. Median 34. Mean 35. Mode 36. Mode 37. Mode 38. Mean 75 1, Answers may vary. 40. Answers may vary. 41. Mean and median tend to be close when the data set doesn t have one or two terms that are unusually high or low compared to the others. But when there are outliers like that, it skews the average up or down without affecting the median much. 4. Answers may vary. 43. Estimating that the values in a class will average out to be in the middle of the class. 44. Answers may vary It must be in the range of The 9 th value is 4.9% of the way 1 through the class. The 10 th value is % through the class. If the values 1 are equally spread out, then the first value would be 177 and the 9 th value would be (8) 180 and the 10 th value would be (8) 181. The average of these two values is
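The grouped-data solutions above multiply each class frequency by its class midpoint, add the products, and divide by the total frequency. A minimal sketch of that calculation follows; the function name and the three classes are hypothetical, not taken from the exercises. It relies on the assumption stated in answer 43: the values in each class average out to the middle of the class.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower limit, upper limit, frequency) triples."""
    total_f = sum(f for _, _, f in classes)
    # Frequency times midpoint, summed over all classes.
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fx / total_f

# Hypothetical frequency distribution: midpoints 3, 8, 13.
example = [(1, 5, 2), (6, 10, 5), (11, 15, 3)]
estimate = grouped_mean(example)
```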

There are 35 data values, so the median is the 18th data value. The first data value in the class containing the median is the 10th data value, so the 18th data value is the 9th data value in this class, or 9/12 = 75% of the way through the class. If the 12 values in the class are equally spread out, the position of the 9th value gives the estimate of the median.

Exercise Set 12-4

1. Range, variance, and standard deviation.
2. The range is the difference between the highest value and the lowest value in a data set.
3. Because (a) the range uses only two of the values in the data set, and (b) an extremely large and/or extremely low value can make the range very large, giving the impression of more variability than is actually the case.
4. Standard deviation = square root of variance.
5. Find the mean and subtract it from each value in the data set. Square each difference, and find the sum of the squares. Divide this sum by (n - 1), where n = number of values in the data set. Take the square root of the quotient to obtain the standard deviation.
6. For a given data set, its standard deviation provides an indication of how far from the mean the individual values (members of the set) are. If two data sets D1 and D2 have the same mean but different standard deviations s1 and s2 with, say, s1 < s2, it can be concluded that the members of D2 are more variable than those of D1.
7. In data set 1 the data are fairly close together with a range of 6; in data set 2 they are more spread out with a range of 31; consequently, the data in set 1 will have a smaller standard deviation than that of set 2.
8. To be an NFL running back you must be pretty fast, so the times to complete the 40-yard dash will be more consistent. Therefore data set 1 will have a smaller standard deviation than data set 2.
9. Since winters in Chicago are much colder than summers, and winters in Los Angeles are not as cold, the standard deviation in data set 1 will be greater than that of data set 2.
10. Weights of dogs of the same breed will tend to be more consistent than weights of dogs of varying breeds; therefore data set 1 will have a smaller standard deviation than data set 2.
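The procedure in answer 5 above (subtract the mean, square, sum, divide by n - 1, take the square root) can be written out step by step. The function names and the data are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
import math

def value_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / (n - 1)

def sample_std(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical data: mean 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
r, var, s = value_range(data), sample_variance(data), sample_std(data)
```

Here the variance is 32 / 7, so the standard deviation is a little over 2.1 even though the range is 7, which illustrates why the range alone can overstate variability.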

21 11. R highest value lowest value X X 14 n 9 9 X X X ( X X) 61 47, ,54 ( X X) 3,54 variance n 1 8 s s The number of junk s varies pretty widely. 1. R highest value lowest value X X n X X X ( X X) , , , ( X X) variance n 1 46, , 63.8 s s 11, The number of hospitals varies quite widely. 13. R highest value lowest value 1, ,799 X 1, ,166 X n X X X ( X X) 1,90 1, ,405, , , ,901 1, ,40, , , , , , ,67.36 ( X X) variance n 1 3, 943, , ,943,0.4 s s 438, The odometer readings vary pretty widely , , ,

22 14. R highest value lowest value X X 4. n X X X ( X X) R highest value lowest value X X n 9, X X X ( X X) , , , , , , ( X X) variance n 1 14 s s The number of hours varies pretty widely. ( X X) 10, variance 1, n 1 8 s s 1, The weights don t vary all that much. 607

23 16. R highest value lowest value X X 9.8 n R highest value lowest value X X 7.7 n 9 9 X X X ( X X) X X X ( X X) ( X X) variance n 1 10 s s The number of stories is fairly uniform ( X X) 7.01 variance 9 n 1 8 s s 9 3 The heights are pretty uniform. 608

24 18. R highest value lowest value 1, X X n 1 8, R highest value lowest value $9.40 $0.7 $8.68 X X n X X X ( X X) X X X ( X X) , , , , , , , , , , , ( X X) variance n 1 5 s s , ,494.9 ( X X) variance n 1 441, , s s 40, The number of calories varies pretty widely. 609

25 0. R highest value lowest value X ,168 X n X X X ( X X) 1. R highest value lowest value $48.84 $9.34 $39.50 X X n X X X ( X X) , , , , , , , ,97.6 ( X X) 6, 97.6 variance, n 1 9 s s, , ( X X) 1, variance 4.59 n 1 7 s s

26 . R highest value lowest value X , 418 X n 9 9 X X X ( X X) , , , , , , , , , variance ( X X) 560,354.4 n 1 560, , R highest value lowest value X , 533 X n 9 9 X X X ( X X) , , , , , , , , , , ( X X) 46, variance 57, n 1 8 s s 57, s s 70,

27 4. R highest value lowest value X X 84 n 5 5 X X X ( X X) ( X X) 558 variance n 1 4 s s R highest value lowest value 5,840 4, 551 1,89 X X n 4, ,357 X X X 5, , ,690 8 ( X X) 5, ,54 5, ,649 5, ,19 ( X X) 1,70,76 variance 45,818 n 1 7 s s 45, R highest value lowest value X X 6.4 n 1 1 X X X ( X X) ( X X) variance 3.7 n 1 11 s s , ,89 5, ,161 5, ,449 4, ,636 4, ,889 1,70,76 61

28 7. The variation is not the same. Find the for each data set. (a) R 1; X 11 X X X ( X X) variance 18.7; s (b) R 1; X 11 X X X ( X X) (c) R 1; X (a) X 30 X X X ( X X) variance 36; s X X X ( X X) ,000 1,000 variance 50; s variance 5.7; s

29 (b) X 35 (e) X 6 X X X ( X X) X X X ( X X) ,000 1,000 variance 50; s (c) X 5 X X X ( X X) ,000 variance 50; s (d) X 150 X X X ( X X) , , , , variance 10; s (f) The remains unchanged when a constant is added or subtracted from the data values. The is multiplied (or divided) by the same constant that is multiplied (or divided) by the data values. 9. a) Average would be close to the hole for Pat, close to the center of the green for Ron. 30. a) b) Pat has a large variation, Ron has a very small variation. c) Answers may vary, but this example shows that variation can sometimes be more meaningful than average. Helena: Juanita: X X.8 n X X 3.0 n b) The variation for Juanita is much greater. c) Answers may vary. 31. Answers may vary. 3. Answers may vary. 5,000 5, 000 variance 6, 50; 4 s 6,
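The conclusion in part (f) above, that adding a constant leaves the standard deviation unchanged while multiplying by a constant multiplies it by the same constant, can be checked numerically. This sketch uses the standard library's `statistics.stdev` (sample standard deviation, n - 1 divisor); the data values and the constant 5 are illustrative assumptions.

```python
import statistics

# Illustrative data set.
data = [10, 20, 30, 40, 50]

s_original = statistics.stdev(data)
s_shifted = statistics.stdev([x + 5 for x in data])   # add a constant
s_scaled = statistics.stdev([x * 5 for x in data])    # multiply by a constant

unchanged_by_shift = abs(s_shifted - s_original) < 1e-9
scaled_by_factor = abs(s_scaled - 5 * s_original) < 1e-9
```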

Exercise Set 12-5

1. If your score is in the 60th percentile, that means you scored higher than 60% of the students in the class.
2. A score in the 90th percentile does not mean you got 90% of the questions right; it means you scored better than 90% of the people who took the test.
3. Quartiles are the 25th, 50th, and 75th percentiles.
4. The second quartile and the median are the same. Since the second quartile is the 50th percentile, that makes it the middle value in the data set, which is how the median is defined.
5. The portion inside the box in a box plot represents the data values between the 1st and 3rd quartiles. It is the middle portion of all of the data.
6. An outlier is a data point that doesn't fit well with the other data points. You can spot outliers casually, but there is also a specific definition when you look at a box plot: an outlier is any data point that is at least a specific distance from the box. That distance is 1.5 times the difference between the 1st and 3rd quartiles.
7. Arrange the 20 data values in order.
(a) There are 4 values below the given score: 4/20 × 100% = 20%, so it is equivalent to the 20th percentile.
(b) There are 15 values below 44: 15/20 × 100% = 75%. A score of 44 is equivalent to the 75th percentile.
(c) There are 7 values below 36: 7/20 × 100% = 35%. A score of 36 is equivalent to the 35th percentile.
(d) There is 1 value below the given score: 1/20 × 100% = 5%, so it is equivalent to the 5th percentile.
(e) There are 18 values below 49: 18/20 × 100% = 90%. A score of 49 is equivalent to the 90th percentile.
8. Arrange the 12 data values in order.
(a) There are 6 values below 67: 6/12 × 100% = 50%. A height of 67 in. is equivalent to the 50th percentile.
(b) There are 8 values below 70: 8/12 × 100% ≈ 67%. A height of 70 in. is equivalent to the 67th percentile.
(c) There are 2 values below 63: 2/12 × 100% ≈ 17%. A height of 63 in. is equivalent to the 17th percentile.
(d) There are 7 values below 68: 7/12 × 100% ≈ 58%. A height of 68 in. is equivalent to the 58th percentile.
(e) There are 5 values below 66: 5/12 × 100% ≈ 42%. A height of 66 in. is equivalent to the 42nd percentile.
9. Dividing the number of values below Carveta's rank by the total gives 75%, or the 75th percentile.
10. Dividing the number of values below John's rank by the total gives 80%, or the 80th percentile.
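The percentile calculations above all follow one pattern: count the values below the given value, divide by the total number of values, and multiply by 100%. A minimal sketch is below. The function name and the list of twelve heights are hypothetical; the actual data list did not survive transcription, so the list was simply chosen to have the same counts below 63, 67, and 70 as the heights answer.

```python
def percentile_rank(values, x):
    """Percent of the data values that fall below x, rounded to a whole number."""
    below = sum(1 for v in values if v < x)
    return round(below / len(values) * 100)

# Hypothetical heights in inches (12 values).
heights = [60, 62, 63, 64, 65, 66, 67, 68, 70, 71, 72, 74]

rank_67 = percentile_rank(heights, 67)  # 6 values below
rank_70 = percentile_rank(heights, 70)  # 8 values below
rank_63 = percentile_rank(heights, 63)  # 2 values below
```

Note that some textbooks add 0.5 to the count before dividing; the worked answers here use the plain count, and the sketch follows them.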

31 11. There are places after her % or 79%, or 79th 00 percentile 1. There are 1, ,861 stocks with lower dividends. 1, % or 96th percentile; so yes, 1,941 the stock qualifies for her portfolio % of So, 10 students scored lower than Angela % of So, 150 scored below her % of So, scored above him % of So, applicants scored above her, and she will not get an offer. 17. Find Maurice s percentile rank and compare it to Lea s rank of % 600 Maurice is ranked higher because he is in the 63rd percentile. 18. Find Maranda s percentile rank and compare it to Audrelia s rank of % 30 Maranda s rank is higher because she is in the 33rd percentile. 19. Find the basketball team s percentile rank and compare it to the football team s rank. Basketball team: % 344 Football team: % 47 The football team had a better ranking. 0. Find the brother s percentile rank and compare it to the sister s rank. Brother: % 30 Sister: % 193 The brother has a better ranking. 1. Arrange the data in order (a) There are 1 values below % 0 The age of 33 is equivalent to the 60th percentile. (b) 8, since 33 is the 8th value from the top. (c) 0% of The age of 3 corresponds to the 0th percentile because there are 4 values below it.. Arrange the data in order (a) There are 5 values below 96. (b) % 0 The IQ score of 96 is in the 5th percentile. 15, since 96 is the 15th value from the top. (c) 40% of The IQ score of 101 corresponds to the 40th percentile since there are 8 values below it. 3. Arrange the data in order The median is the middle value. Q 34 Find the median of the values less than Q Q1.5 Find the median of the values above Q Q

32 4. Arrange the data in order The median is the middle point Q.03 Find the median of the values less than Q. Q.00 Find the median of the values above Q. Q Arrange the data in order The median is the middle point Q 88.5 Find the median of the values less than Q. Q 1 78 Find the median of the values above Q. Q Arrange the data in order The median is the middle point Q Find the median of the values less than Q Q Find the median of the values above Q Q Arrange the data in order The median is the middle point Q Find the median of the values less than Q. Q Find the median of the values above Q. Q Arrange the data in order The median is the middle point Q.85 Find the median of the values less than Q. Q 1.00 Find the median of the values above Q. Q Find the quartiles: Q 55 Q 1 4 Q Draw a number line ranging from the least to the greatest value and label Q 1, Q, and Q 3 on it. Draw in the rectangular box and vertical line. a) The majority of the values are on the low end of the distribution. b) and 45. are outliers; there are two countries that have far more Internet users than all others. 30. Q Q Q Draw a number line ranging from the least to the greatest value and label Q 1, Q, and Q 3 on it. Draw in the rectangular box and vertical line a) The data set is distributed fairly evenly. b) No; none of the years had unusually high or low prices. 617
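The quartile procedure used in these answers (the median is Q2, the median of the values below Q2 is Q1, and the median of the values above Q2 is Q3) can be sketched together with the box plot outlier rule of 1.5 times the difference between Q1 and Q3. The function names and the data are hypothetical.

```python
def median(sorted_values):
    """Middle value, or the mean of the two middle values, of sorted data."""
    mid = len(sorted_values) // 2
    if len(sorted_values) % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    s = sorted(values)
    q2 = median(s)
    q1 = median([v for v in s if v < q2])  # values less than Q2
    q3 = median([v for v in s if v > q2])  # values above Q2
    return q1, q2, q3

def outliers(values):
    """Values at least 1.5 * (Q3 - Q1) beyond the box."""
    q1, _, q3 = quartiles(values)
    reach = 1.5 * (q3 - q1)
    return [v for v in values if v <= q1 - reach or v >= q3 + reach]

# Hypothetical data with one unusually large value.
data = [5, 7, 8, 10, 12, 13, 15, 40]
q1, q2, q3 = quartiles(data)
flagged = outliers(data)
```

Statistical software often uses other quartile conventions, so results can differ slightly from this hand method on small data sets.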

31. Use the quartile values found in the solution to exercise 27. Draw a number line ranging from the least to the greatest value and label Q1, Q2, and Q3 on it. Draw in the rectangular box and vertical line.
a) The majority of the years have homicides on the higher end of the range.
b) 66 is an outlier; one year had an exceptionally low homicide total.
32. Use the quartile values found in the solution to exercise 28. Draw a number line ranging from the least to the greatest value and label Q1, Q2, and Q3 on it. Draw in the rectangular box and vertical line.
a) At least half of the top ten are between and 4 billion.
b) To say the least, Google is an outlier. It dwarfs the sizes of the other companies. Groupon is also an outlier.
33. It is possible to score a 90% on a test and have a percentile rank of less than 90 or exactly 90, depending on how the other students did.
34. It is not possible to rank in the 100th percentile because you'd have to do better than everyone, including yourself.
35. Answers may vary.
36. Answers may vary.
37. Find the quartiles in the current order, then switch Q1 and Q2.
38. a) If a data set has 10 values, the value at the 80th percentile is ranked ninth, but 0.8 times 10 is 8.
b) When the data set has a very large number of values, the claim is close to true.

Exercise Set 12-6

1. In a normal distribution the data are usually centered about the middle of the range.
2. Answers may vary.
3. The area under a portion of the normal curve corresponds to the percentage of data that fall in that range.
4. The empirical rule says that about 68% of the data are within one standard deviation of the mean, about 95% are within two standard deviations of the mean, and about 99.7% are within three standard deviations of the mean.
5. Since the normal curve is symmetric about the mean, 50% of the area lies to the left of the mean and 50% of the area lies to the right of the mean.
6. The total area under the normal curve is 1 since it represents the probability of the entire sample space, which should equal 1.
7. It is a normal distribution with mean zero and standard deviation 1.
8. Obviously, not all sets of data that are normally distributed have a mean of zero and a standard deviation of 1, which means that we can't apply the standard normal distribution directly. But finding z scores converts your data set into a data set that does have a mean of zero and a standard deviation of 1.
9. To find the z score of a data value, subtract the mean from it and then divide by the standard deviation.
10. Because the area between z = 0 and a positive z score is the same as the area between z = 0 and the negative of that z score.
11. and so the values are one standard deviation below and above the mean respectively. Since about 68% of the data are within one standard deviation of the mean and , approximately 34 data values fall between 10.5 and
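The z score rule and the empirical rule described above can be sketched as follows. The function names are hypothetical, and the mean, standard deviation, and sample size are made-up values used only to show the arithmetic.

```python
# Approximate share of data within k standard deviations of the mean.
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

def z_score(value, mean, std_dev):
    """Subtract the mean, then divide by the standard deviation."""
    return (value - mean) / std_dev

def expected_count_within(k, n):
    """Approximate number of the n data values within k standard deviations."""
    return EMPIRICAL_RULE[k] * n

# Hypothetical distribution: mean 70, standard deviation 15, 200 data values.
z_low = z_score(40, 70, 15)    # two standard deviations below the mean
z_high = z_score(100, 70, 15)  # two standard deviations above the mean
count = expected_count_within(2, 200)
```

With 200 values, about 95% of them, roughly 190, are expected to fall between the two values whose z scores are -2 and 2.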

34 and , so the values are two below and above the mean respectively. Since about 95% of the data are within two s of the mean and , approximately 190 data values fall between 84 and and , so the values are three below and above the mean respectively. Since about 99.7% of the data are within two standard deviations of the mean and , approximately 499 data values fall between 35 and and , so the values are two s below and above the mean respectively. Since about 95% of the data are within two s of the mean and , approximately 114 data values fall between 17.4 and and , so the values are one below and above the mean respectively. Since about 68.% of the data are within two s of the mean and , approximately 54 data values fall between 3. and and , so the values are two s below and above the mean respectively. Since about 95% of the data are within two s of the mean and , approximately 713 data values fall between 175 and The area between z 0 and z is The area between z 0 and z 0.5 is The area between z 0 and z.05 is The area to the right of z 1.0 is The area to the right of z 0.5 is The area between z 0 and z 1.95 is The area to the left of z 0.40 is

35 The area to the left of z 1.45 is The area between z 0.5 and z 1.10 is The area between z 1.5 and z 1.90 is The area between z.45 and z 1.05 is The area between z 0.8 and z 1.3 is The area to the left of z 1.0 is The area between z 0.85 and z 0.0 is The area to the left of z.15 is The area between z 1.55 and z 1.85 is The area to the right of z 1.90 is

36 34. The area to the right of z 0.0 is The area to the left of z 0.60 is The area to the right of z 1.10 is The area between z 1.90 and z 1.95 is The area between z 0.1 and z 0. is a) , so 606 is two s the mean. Since.% of the data falls more than two standard deviations below the mean, about 97.8%. b) and , so the values are one below and above the mean respectively. Since about 68.% of the data are within one s of the mean and , approximately 341 data values fall between 3. and a) , so the value is two s above the mean. Since about.5% of the data is two standard deviations above the mean, the probability that you were born more than 98 days after you were conceived is b) There are 83 days between April 19 and January 7, so we are taking about 83 days after conception , so 83 is one above the mean. Since 50% % 84.1% of the data is less than one above the mean, the probability that the baby will be born before January 7 is about If the mean is 8 ounces, then only 50% of the bags would have at least 8 ounces and 97% of the bags would have ounces. So, if the mean were set to 8.8, then the manager would have his wish. 4. a) and , so the values are one above and one below the mean. Since 68.% of the data is less than one standard deviation away from the mean, the probability that a student will get a score between 63 and 85 is 0.6. b) , so the values are greater than two s below the mean. Since 97.5 percent of the data is greater than two s above the man, the probability of a score over 53 is Find a z value for which the area between it and 0 is From z-table, z Find a z value for which the area between it and 0 is From z-table, z (a) From z-table, z ±.05. (b) From z-table, z ±1.75. (c) From z-table, z ± Answers may vary. 61

Exercise Set
1. Many real-life situations involving a large, random population closely resemble the theoretical normal distribution. Since the statistical properties of this distribution are well known, conclusions and probabilities can be drawn for an appropriate real-life situation.
2. Plot the data, and see whether the graph has properties similar to those of the normal distribution. (More advanced texts give formal tests for deciding.)
3. The area under a normal distribution between two data values is the probability that a randomly selected value is between those two data values.
4. To calculate the probability that a variable falls in some range, you need to find the z scores for the lowest and highest data values of the range.
5. (a) (b) z The area between z 0 and z 0.5 is Since the desired area is in the right tail, subtract from The probability that a randomly selected production worker earns more than $15 is z The area between z 0 and z 0.9 is Since the desired area is in the left tail, subtract 0.31 from The probability that a randomly selected production worker earns less than $14.00 is value mean 6 z The area between z 0 and z 1.33 is Since the desired area is in the right tail, subtract from The probability that it will cost two people more than $6 to go to a movie is (a) z 55, , 547 9, The area between z 0 and z 0.49 is Since the desired area is in the right tail, subtract from The probability that a teacher earns more than $55,000 is (b) z 45, , 547 9, The area between z 0 and z 0.6 is 0.3. Since the desired area is in the left tail, subtract from The probability that the teacher earns less than $45,000 is (a) z The area between z 0 and z 1.3 is Since the area under the curve to the right of z 1.3 is desired, add to The probability that he or she spent more than $60 per purchase is

38 (b) z The area between z 0 and z 1.05 is Since the area under the curve to the left of z 1.05 is desired, add to The probability that he or she spent less than $80 per purchase is (a) value mean z The area between z 0 and z.58 is Since the desired area is in the left tail, subtract from The probability that he or she owned the set less than.5 years is value mean (b) z value mean z The area between z 0 and z.0 is The area between z 0 and (c) z 0.90 is Since the desired area is between z.0 and z 0.90, subtract from The probability that he or she owned the set between 3 and 4 years is value mean z The area between z 0 and z 0.67 is Since the area under the normal curve to the right of z 0.67 is desired, add to The probability that he or she owned the set more than 4. years is value mean (a) z value mean z The area between z 0 and z 0.75 is The area between z 0 and z 0.75 is also The total area is The probability that the CEO is between 53 and 59 years old is value mean (b) z value mean z The area between z 0 and z 0.5 is The area between z 0 and z 1.75 is Since the desired area is between z 0.5 and z 1.75, subtract 0.19 from The probability that the CEO is between 58 and 63 years old is value mean (c) z value mean z The area between z 0 and z 1.5 is The area between z 0 and z 0.5 is Since the desired area is between z 0.5 and z 1.5, subtract from The probability that the CEO is between 50 and 55 years old is

39 11. (a) z1 5, , 000,000.5 z 8, , 000,000 1 The area between z 0 and z.5 is The area between z 0 and z 1 is Since the desired area is between z 1 and z.5, subtract from The probability that a tire s lifetime is between 5,000 and 8,000 miles is (b) z1 7, , 000, z 3, , 000,000 1 The area between z 0 and z 1.5 is The area between z 0 and z 1 is The total area is The probability that a tire s lifetime is between 7,000 and 3,000 miles is (a) 13. (a) (c) z1 31, , 000, z 33, , 000, The area between z 0 and z 0.75 is The area between z 0 and z 1.75 is Since the desired area is between z 0.75 and z 1.75, subtract 0.73 from The probability that a tire s lifetime is between 31,500 and 33,500 miles is value mean z The area between z 0 and z 9.83 is approximately Since the desired area is to the right of z 9.83, subtract from The probability that a visitor spends at least 180 minutes per visit is 0. value mean 50 6 (b) z 1 1 The area between z 0 and z 1 is Since the desired area is to the right of z 1, add to The probability that a visitor spends at least 50 minutes per visit is value mean z 1 6 The area between z 0 and z 1 is Since the desired area is to the left of z 1, add to The probability that at most 50 inches of snow will be received is

40 value mean (b) z The area between z 0 and z 1.5 is Since the desired area is in the right tail, subtract from The probability that at least 53 inches of snow will be received is value mean (a) z value mean z The area between z 0 and z 1.6 is The area between z 0 and z 0.31 is 0.1. The total area is The probability that the customer will have to wait between 5 and 10 minutes is value mean 6 9. (b) z value mean 9 9. z The area between z 0 and z 1.3 is Since the desired area is to the left of z 1.3, subtract from The area between z 0 and z 0.08 is Since the desired area is to the right of z 0.08, add to The total desired area is The probability that a customer will have to wait less than 6 minutes or more than 9 minutes is value mean (a) z value mean z The area between z 0 and z 1.66 is The area between z 0 and z 0.93 is The total area is The probability that it will take a student between 15 and 30 minutes to complete the test is (a) value mean (b) z value mean z The area between z 0 and z 1.14 is Since the desired area is to the left of z 1.14, subtract from The area between z 0 and z 0.59 is 0.. Since the desired area is to the right of z 0.59, subtract 0. from The total desired area is The probability that it will take a student less than 18 minutes or more than 8 minutes to complete the test is value mean z.5 8 The area between z 0 and z.5 is Since the desired area is to the right of z.5, add to The probability that a person burns more than 80 calories is value mean (b) z The area between z 0 and z 0.88 is Since the desired area is in the left tail, subtract from The probability that a person burns less than 93 calories is (c) z z The area between z 0 and z 1.88 is The area between z 0 and z.5 is The total area is The probability that a person burns between 85 and 30 calories is

41 value mean (a) z The area between z 0 and z 0.69 is Since the desired area is to the right of z 0.69, add to The probability that the temperature will be above 6 is (a) value mean (b) z The area between z 0 and z 0.88 is Since the desired area is to the left of z 0.88, add to The probability that the temperature will be below 67 is value mean (c) z The area between z 0 and z 0.5 is The area between z 0 and z 1.19 is Since the desired area is between z 0.5 and z 1.19, subtract from The probability that the temperature will be between 65 and 68 is value mean z The area between z 0 and z 0.5 is Since the desired area is to the right of z 0.5, add to The probability that the person s blood pressure is above 130 is value mean (b) z 1 8 The area between z 0 and z 1 is Since the desired area is to the left of z 1, add to The probability that the person s blood pressure is below 140 is (a) value mean (c) z value mean z The area between z 0 and z 0.13 is The area between z 0 and z 0.5 is The total area is The probability that the person s blood pressure is between 131 and 136 is value mean z The area between z 0 and z 0.47 is Since the desired area is in the left tail, subtract from , Therefore 638 people will score below 93. value mean (b) z The area between z 0 and z 1.33 is Since the desired area is in the right tail, subtract from , Therefore 184 people will score above 10. value mean (c) z The area between z 0 and z 1.33 is The area between z 0 and z 0.33 is The total area is ,000 1,074 Therefore 1,074 people will score between 80 and

42 value mean (d) z The area between z 0 and z 1.67 is The area between z 0 and z 1.0 is Since the desired area is between 1.67 and 1.0, subtract from , So 136 people will score between 75 and (a) z1 1,900, z,000, The area between z 0 and z.91 is The area between z 0 and z.6 is The desired area is So 5 homes will have between 1,900 and,000 square feet. (b) z 3,000, The area between z 0 and z 4.3 is 0.5. Since the desired area is to the right of z 4.3, subtract from Therefore no homes will have more than 3,000 square feet. (c) (d) 1. (a) z,000, The area between z 0 and z.6 is Since the desired area is to the left of z.6, subtract from Therefore 6 homes will have less than,000 square feet. z 1,500, The area between z 0 and z 5.55 is 0.5. Since the desired area is to the right of z 5.55, add to All the homes will have more than 1,500 square feet. z 300, , , The area between z 0 and z.09 is Since the desired area is in the right tail, subtract 0.48 from Therefore 14 homes cost more than $300,

43 (b) z1 00,000 14, , z 300, , , The area between z 0 and z 0.35 is The area between z 0 and (c). (a) z.09 is The desired area is Therefore, 495 homes cost between $00,000 and $300,000. z 150, , , The area between z 0 and z 1.57 is Since the desired area is to the left of z 1.57, subtract 0.44 from Approximately 46 homes cost less than $150,000. value mean z The area between z 0 and z 1.49 is Since the desired area is in the left tail, subtract 0.43 from , Therefore, 68 books were sold for less than $8.00. value mean (b) z The area between z 0 and z 0.47 is Since the desired area is in the right tail, subtract from , So, 319 books were sold for more than $ (c) z z The area between z 0 and z 0.0 is The area between z 0 and z 0.96 is The desired area is , So, 340 books were sold for between $9.50 and $ value mean (d) z The area between z 0 and z 0.7 is The area between z 0 and z 0.5 is The desired area is , Therefore, 93 books were sold for between $9.80 and $ We need to find the percentage of workers who make less than Earl. z The area between z 0 and z 0.59 is 0.. Since the desired area is the left half and the portion in the right half below 0.59, add So, 77.% (or approximately 7%) of the workers make less than Earl. Earl s wage is in the 7nd percentile. 68

44 4. We need to find the percentage of people who keep their TVs less than 4 years. value mean z The area between z 0 and z 0.90 is Since the desired area is in the left tail, subtract from So 18.4% (or approximately 18%) of people keep their TVs less than 4 years. That puts the amount of time you kept your TV in about the 18th percentile. 5. We need to find the percentage of people who spend fewer than 0 minutes at a time on a social networking site. value mean 0 6 z The area between z 0 and z 3.50 is approximately Since the desired area is in the left tail, subtract from You are in the 0th percentile; that is, no one spends less time than you. 6. We need to find the percentile rank for,000 and 5,000. z,000, The area between z 0 and z.6 is Since the desired area is to the left of z.6, subtract from So,000 square feet is in about the 1st percentile. z 5,000, The area between z 0 and z is Since the desired area is to the left of z 17.49, add to So the 5,000 square foot home is in the 99th percentile (remember there is no 100th percentile). The change in percentile rank is a) b) Mean.34; s 0.58 c) z z z The area between z 0 and z 1.45 is Since the desired area is in the left tail, subtract 0.47 from So the probability of a reaction time less than 1.5 seconds is The area between z 0 and z 0.59 is The area between z 0 and z.86 is Since the desired area is between these two values, the probability of a reaction time between and 4 seconds is So the probability of a reaction time between and 4 seconds is

45 8. a) 9. b) Mean 1.67; s 0.44 c) z z z The area between z 0 and z is Since the desired area is in the left tail, subtract from So the probability of a reaction time less than 1.5 seconds is The area between z 0 and z 0.75 is The area between z 0 and z.86 is Since the desired area is between these two values, the probability of a reaction time between and 4 seconds is So the probability of a reaction time between and 4 seconds is 0.5. b) Mean ; s 5.95 c) z z z The area between z 0 and z 1.75 is Since the desired area is in the right tail, subtract from So the probability of a salary over $60,000 is The area between z 0 and z is The area between z 0 and z is Since the desired area is between these two values, the probability of salary between $40,000 and $50,000 is

30. a) b) Mean 55.83; s 5.91 c) z z z The area between z 0 and z is Since the desired area is in the right tail, subtract 0.61 from So the probability of a salary over $60,000 is The area between z 0 and z.679 is The area between z 0 and z is Since the desired area is between these two values, the probability of a salary between $40,000 and $50,000 is
31. a) Very unlikely. Gas prices tend to fluctuate pretty wildly.
b) This would most likely be normally distributed with mean something a bit more than pounds.
c) Possibly, but not necessarily. Since basketball favors tall players, but there are still some shorter players, the heights with the largest number of players would probably be somewhat above the mean.
d) Probably, although the number of hits may fluctuate depending on the day of the week, which could affect the distribution.
e) Probably not, for pretty much the same reason as in part c. The ages probably go from 18 up to the 60s, but the distribution would be very strongly skewed toward the younger side.
32. The regional campus with a z score of 0.90, compared to the main campus with a z score of
33. No, it would have the same shape.
34. The As and Fs will have areas under the standard normal curve of and 0.450, respectively. Let X A score that divides the As and X F score that divides the Fs. The z values corresponding to an area of on either side of z 0 are 1.64 and X A ; A 10(1.64) X + X F ; X F 10( 1.64) The Cs will have an area under the standard normal curve of on either side of z 0. Let X C1 score that divides the Cs from Ds and let X C score that divides Cs from Bs. The z values corresponding to an area of on either side of z 0 are 0.84 and XC ; X C1 10(0.84) XC ; 10 X C 10( 0.84)
The As will have scores above 76. The Bs will have scores between 69 and 76. The Cs will have scores between 52 and 68. The Ds will have scores between 44 and 51. The Fs will have scores of 43 and below.
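The grade cutoffs in Exercise 34 come from running the z-score formula backwards: X = mean + z(s). A minimal sketch of that step follows. The standard deviation of 10 appears in the solution, but the mean of 60 is an assumption inferred from the stated cutoffs (it was lost in this transcription), and `cutoff_score` is a hypothetical helper name. The z values come from Python's `statistics.NormalDist` rather than the textbook's table, so they differ slightly from 1.64 and 0.84.

```python
from statistics import NormalDist


def cutoff_score(area_below: float, mean: float, sd: float) -> float:
    """Score with the given fraction of the distribution below it.

    Finds the z value whose left-hand area is area_below,
    then converts it to a raw score with X = mean + z * sd.
    """
    z = NormalDist().inv_cdf(area_below)
    return mean + z * sd


if __name__ == "__main__":
    MEAN = 60.0  # assumption: inferred from the cutoffs quoted in the solution
    SD = 10.0    # from the "10(1.64)" terms in the solution

    # An area of 0.450 between z = 0 and the cutoff means 95% lies below it.
    print("A cutoff:", round(cutoff_score(0.95, MEAN, SD), 1))
    print("F cutoff:", round(cutoff_score(0.05, MEAN, SD), 1))
    # z = 0.84 corresponds to roughly 80% of the area lying below the cutoff.
    print("C upper cutoff:", round(cutoff_score(0.80, MEAN, SD), 1))
    print("C lower cutoff:", round(cutoff_score(0.20, MEAN, SD), 1))
```

Under these assumptions the cutoffs land near 76, 44, 68, and 52, which agrees with the score ranges listed in the answer.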

35. The areas under the normal curve to the left and right of z 0 are 0.5 and 0.5. The z values corresponding to these areas are z 0.68 and z lower limit lower limit 15( 0.68) upper limit upper limit 15(0.68) The approximate limits are 90 to
36. The area under the normal curve between z 0 and the cutoff time is The z value corresponding to an area of to the left of z 0 is cutoff time cutoff time 4.3( 0.84) The cutoff time is approximately 55 minutes.
37. P(of taking less than 1.5 seconds to react) In Exercise 7, the probability was 5 found to be P(of taking between and 4 seconds to react) In Exercise 7, the probability 5 was found to be 0.7. The probabilities are reasonably close, so our assumption that the data were approximately normally distributed seems reasonable.
38. P(of taking less than 1.5 seconds to react) In Exercise 8, the probability was 5 found to be P(of taking between and 4 seconds to react) In Exercise 8, the probability 5 was found to be 0.5. The probabilities are reasonably close, so our assumption that the data were approximately normally distributed seems reasonable.

Exercise Set
1. A scatter plot is a plot of ordered pairs of data from two data sets and is used to see whether there may be a correlation between the data sets.
2. Generally, as x increases, so does y. The points would form a straight, or roughly straight, stream from lower left to upper right.
3. Generally, as x increases, y decreases. The points would form a straight, or roughly straight, stream from upper left to lower right.
4. The correlation coefficient tells you the strength of the correlation between two data sets.
5. The regression line for two data sets is the line that best fits the scatter plot of the data.
6. If there is a linear correlation between two data sets, the regression line can be used to predict the value of a dependent variable by plugging the value of the corresponding independent variable into the regression equation.
7. Correlation shows that two variables are related. A correlation does not explain WHY the variables are related. Causation can explain why.
8. A 5% significance level means that there is a 5% chance of being wrong about the conclusion and a 95% chance of being right about the conclusion.
9. Answers may vary.
10. Answers may vary.

For this exercise set, use the following formulas.

$$r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{\left[n(\sum x^2) - (\sum x)^2\right]\left[n(\sum y^2) - (\sum y)^2\right]}}$$

$$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} \qquad a = \frac{\sum y - b(\sum x)}{n}$$
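The formulas for r, b, and a can be applied mechanically once the five sums are known. The sketch below is one way to do that, assuming the regression line is written as y = a + bx; the function name `correlation_and_line` and the sample data are hypothetical and are not taken from any exercise in this set.

```python
import math


def correlation_and_line(xs, ys):
    """Return (r, a, b) using the sum-based formulas for this exercise set.

    r is the correlation coefficient; a and b are the intercept and slope
    of the regression line y = a + b*x.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    x_term = n * sum_x2 - sum_x ** 2
    y_term = n * sum_y2 - sum_y ** 2

    r = numerator / math.sqrt(x_term * y_term)
    b = numerator / x_term
    a = (sum_y - b * sum_x) / n
    return r, a, b


if __name__ == "__main__":
    # Hypothetical data lying exactly on the line y = 1 + 2x.
    xs = [1, 2, 3, 4, 5]
    ys = [3, 5, 7, 9, 11]
    r, a, b = correlation_and_line(xs, ys)
    print(f"r = {r:.3f}, y = {a:.3f} + {b:.3f}x")
```

Whether a computed r is significant still has to be judged against the critical values in the textbook's table for the given n; this sketch does not attempt that step.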

48 11. (a) 1. (a) (b) x y xy x y Σ ,791 7(496) 8(105) r [7(140) (8) ][7(1,791) (105) ] (c) n 7, 5% level, Table 1-4 value n 7, 1% level, Table 1-4 value Since r 9.77 is greater than each value, r is significant at the 5% and the 1% levels. (d) Since r is significant, draw the line. See graph in part (a). 7(496) (8)(105) b (140) (8) (8) a The equation of the regression line is y x. (e) There is a positive linear relationship. (b) x y xy x y , ,600 4 Σ ,006 5, (1, 006) (176)(39) r [6(5,438) (176) ][6(37) (39) ] (c) n 6, 5% level, Table 1-4 value n 6, 1% level, Table 1-4 value Since r is greater than each value, r is significant at the 5% and the 1% levels. (d) Since r is significant, draw the line. See graph in part (a). 6(1, 006) (176)(39) b (5, 438) (176) 39 ( 0.501)(176) a 1. 6 The equation of the regression line is y x. (e) There is a negative linear relationship. 633

49 13. (a) 14. (a) (b) x y xy x y , , , , Σ ,445 7, (, 445) 330(30) r 4(7,350) (330) ][4(6) (30) ] (c) n 4, 5% level, Table 1-4 value n 4, 1% level, Table 1-4 value Since r is not greater than either value, r is not significant at 5% nor at 1% level. (d) Since r is not significant, the computing and drawing of a regression line would be meaningless. (e) No relationship exists. (b) x y xy x y , , , , , , ,600 Σ , ,575 6(4,150) (70)(349) r [6(840) (70) ][6(0,575) (349) ] (c) n 6, 5% level, Table 1-4 value n 6, 1% level, Table 1-4 value Since r is greater than each value, r is significant at the 5% and 1% levels. (d) Since r is significant, draw the line. See graph in part (a). 6(4,150) (70)(349) b (840) (70) 349 (3.36)(70) a The equation of the regression line is y x. (e) There is a positive linear relationship. 634

50 15. (a) 16. (a) (b) x y xy x y , , , ,04 5 Σ ,6 7, (, 6) (185)(65) r [5(7,131) (185) ][5(919) (65) ] (c) n 5, 5% level, Table 1-4 value n 5, 1% level, Table 1-4 value Since r is greater than each value, r is significant at the 5% and 1% levels. (d) Since r is significant, draw the line. See graph in part (a). 5(, 6) (185)(65) b 0.5 5(7,131) (185) 65 ( 0.5)(185) a The equation of the regression line is y x. (e) There is a negative linear relationship. (b) x y xy x y , , , ,116 5 Σ ,47 7,0 43 5(1, 47) (188)(38) r [5(7,0) (188) ][5(43) (38) ] (c) n 5, 5% level, Table 1-4 value n 5, 1% level, Table 1-4 value Since r is less than each value, r is not significant at the 5% nor the 1% level. (d) Since r is not significant, the computing and drawing of a regression line would be meaningless. (e) No relationship exists. 635

51 17. (a) (b) x y xy x y , Σ , (734.6) 6.3(13.6) r [5(150.17) (6.3) ][5(3, ) (13.6) ] (c) n 5, 5% level, Table 1-4 value n 5, 1% level, Table 1-4 value Since r is greater than the 5% value but less than the 1% value, r is significant at the 5% level but not significant at the 1% level. (d) Since r is significant, calculate and draw the line. See graph in part (a). 5(734.6) (6.3)(13.6) b 3.1 5(150.17) (6.3) 1306 (3.1)(6.3) a The equation of the regression line is y x. (e) There is a positive linear relationship. 636

52 18. (a) (b) x y xy x y Σ (51.34) 6.5(48.) r [5(11.95) (6.5) ][5(530.14) (48.) ] (c) n 5, 5% level, Table 1-4 value n 5, 1% level, Table 1-4 value Since r is not greater than either value, r is not significant at either level. (d) Since r is not significant, we can save ourselves some time and not bother calculating or graphing the regression line. (e) No relationship exists. 637

53 19. (a) (b) x y xy x y , ,809 3, , ,64, ,68 43,964, ,160 79,841 1, ,950 0, , , ,30 184, ,90 176, , ,561 1,04 Σ 5, ,309 3,049,155 13,690 9(03, 309) (5, 007)(336) r 0.94 [9(3, 049,155) (5, 007) ][9(13, 690) (336) ] (c) n 9, 5% level, Table 1-4 value n 9, 1% level, Table 1-4 value Since r 0.94 is larger than both values, r is significant at the 5% and 1% levels. (d) Since r is significant, draw the line. See graph in part (a). 9(03, 309) (5, 007)(336) b (3, 049,155) (5, 007) 336 ( )(5, 007) a.76 9 The equation of the regression line is y x. (e) There is a positive linear relationship. (f) y (500) 33; a 500-foot-tall building would have about 33 stories. 638

54 0. (a) Hours per week Regression Line: y x Age (in years) (b) x y xy x y , , , ,844 1 Σ , (671) ()(18) r [5(10,94) () ][5(83.5) (18) ] (c) n 5, 5% level, Table 1-4 value n 5, 1% level, Table 1-4 value Since r is larger than 0.878, but not larger than 0.959, r is significant at the 5% level, but not at the 1% level. (d) Since r is significant at the 5% level, draw the line. See graph in part (a). 5(671) ()(18) b 0.1 5(10, 94) () 18 ( 0.1)() a The equation of the regression line is y x. (e) There is a negative linear relationship. (f) y (35) 4.73; the number of hours for a 35-year-old would be about

55 1. (a) Amount per month (in dollars) Regression Line: y x ,000 1,100 1,00 1,300 x Income (in dollars) (b) x y xy x y , ,000 5,600 1, ,000 1,440,000 90,000 1, ,000 1,000,000 67, , ,000 55, ,50 7,500 1, ,330 8,649 36,100 1, ,000 1,10,000 6,500 Σ 6,757 1,540 1,530,080 6,645, ,050 7(1, 530, 080) (6, 757)(1, 540) r [7(6, 645,149) (6, 757) ][7(358, 050) (1, 540) ] (c) n 7, 5% level, Table 1-4 value n 7, 1% level, Table 1-4 value Since r is larger than both values, r is significant at the 5% and 1% levels. (d) Since r is significant, draw the line. See graph in part (a). 7(1, 530, 080) (6, 757)(1, 540) b (6, 645,149) (6, 757) 1, 540 (0.3548)(6, 757) a The equation of the regression line is y x. (e) There is a positive linear relationship. (f) y (95) ; a student earning $95 per month would spend about $05.88 on recreation. 640

56 . (a) (b) x y xy x y , , , , , , Σ ,464 18, (, 464) (407)(49) r [10(18,65) (407) ][10(391) (49) ] 0.84 (c) n 10, 5% level; Table 1-4 value 0.63 n 10, 1% level; Table 1-4 value Since r 0.84 is larger than both values, r is significant at the 5% and 1% levels. (d) Since r is significant, draw the line. See graph in part (a). 10(, 464) (407)(49) b (18, 65) (407) 49 (0.8)(407) a The equation of the regression line is y x. (e) There exists a positive linear relationship, except for two points. (f) y (56) 8.388; a 56-year-old employee would miss about 8 days. 641

57 3. (a) (b) x y xy x y ,1 7,569 6, ,096 8,464 7, ,760 4,64 4, ,38 5,184 5, ,550 9,05 8, ,77 6,084 5, ,889 6,889 6, ,70 9,604 9,801 Σ ,318 57,443 55,75 8(56, 318) (673)(661) r [8(57, 443) (673) ][8(55, 75) (661) ] (c) n 8, 5% level, Table 1-4 value n 8, 1% level, Table 1-4 value Since r is larger than both values, r is significant at the 5% and 1% levels. (d) Since r is significant, draw the line. See graph in part (a). 8(56, 318) (673)(661) b (57, 443) (673) 661 (0.8603)(673) a The equation of the regression line is y x. (e) There is a positive linear relationship. (f) y (90) 87.65; a student who got a 90 on the Stat 101 one final would be expected to get an 88 on the Stat 10 final. 64

58 4. (a) (b) x y xy x y Σ , ,46 15(,16) (110)(74) r [15(894) (110) ][15(5, 46) (74) ] (c) n 15, 5% level, Table 1-4 value n 15, 1% level, Table 1-4 value Since r is larger than both values, r is significant at the 5% and 1% levels. 643

(d) Since r is significant, draw the line. See graph in part (a). 15(,16) (110)(74) b (894) (110) 74 (1.748)(110) a The equation of the regression line is y x.
(e) There is a positive linear relationship.
(f) y (8) 19.45; a team with eight wins would be expected to score about 19 goals.
5. (a) (b) x y xy x y , , ,37,610, , , ,998,090, , , ,916,000, ,800 91, ,84,840, , , ,806,50, , , ,339,560, ,300 68, ,664,890, , , ,061,160,000 Σ ,00 3,111, ,308,400,000 8(3,111,50) (6.5)(396, 00) r 0.1 [8(496.83) (6.5) ][8(0, 308, 400, 000) (396, 00) ]
(c) n 8, 5% level, Table 1-4 value n 8, 1% level, Table 1-4 value Since r 0.1 is less than both values, r is not significant at either the 5% or 1% level.
(d) Since r is not significant, move on to part (e).

(e) There is no relationship.
(f) Since there is no relationship, we can't make a prediction.
6. (a) (b) x y xy x y ,63, , ,660 3, , , , , , , , ,09 39,10 1,444 1,058, ,345,05 9, , ,16 Σ 3.4 4,706 09, ,89.4 3,4,838 7(09, 091.) (3.4)(4, 706) r [7(15,89.4) (3.4) ][7(3, 4,838) (4, 706) ]
(c) n 7, 5% level, Table 1-4 value n 7, 1% level, Table 1-4 value Since r is less than both values, r is not significant at either the 5% or 1% level.
(d) Since r is not significant, move on to part (e).
(e) There is no relationship.
(f) Since there is no relationship, we can't make a prediction.

61 7. x y xy x y Σ (15) (15)(35) r 1 [5(55) (15) ][5(85) (35) ] Interchange the values for x and y. x y xy x y b) Answers may vary. c) x y xy x y Σ (0) (0)(8) r 0 [7(8) (0) ][7(196) (8) ] r 0 8.a) Σ (15) (35)(15) r 1 [5(85) (35) ][5(55) (15) ] The value of r is the same. a) The variables seem to have a positive linear relationship. b) Answers may vary. x y xy x y c) 3(36) (6)(14) r [3(14) (6) ][3(98) (14) ] The variables show a definite pattern, so that should mean that there is a relationship. But it s definitely not a linear relationship. If you only use the x values that are greater than zero, then there is a positive linear relationship and the r value is going to back that up. 646

30. Answers may vary.
31. Answers may vary.
32. Answers may vary.
Some of the answers for Exercises 33-40 are open to interpretation, and the explanation of reasoning will vary.
33. Positive
34. Negative
35. None; some people might argue that students in a small-town school district with fewer primary schools get a better education than students in big urban districts with a lot of primary schools.
36. Negative
37. Positive
38. None, though surprisingly studies show that there is a positive correlation.
39. Negative
40. None

Review Exercises
1. Item Tally Frequency 3. highest value lowest value B 4 F 5 G 5 S 5 T 6. Arrange the data in order. Separate the data according to the first digit. Use the first digit as the stems and the second digit as the leaves. Stems Leaves The number of minutes spent on the computers ranged from 1 to 39. The biggest group was the middle group (i.e., in the 20s). The rest were evenly divided between the 10s and the 30s. Start with the lowest value and add 15 to get the lower class limits: 102, 117, 132, 147, 162, 177. Set up the classes by subtracting one from each lower class limit except the first lower class limit. Rank Tally Frequency

63 4. Draw the bars with heights corresponding to the number. 7. Represent the years on the x axis and the amount earned on the y axis, and then draw lines connecting the points For the histogram, draw vertical bars corresponding to the frequencies for each class. For the frequency polygon, find the midpoints for each class: 109, 14, 139, 154, 169, and 184. Label the horizontal axis with the midpoints. Connect adjacent midpoints with straight lines. Finish the graph by drawing lines back to the horizontal at the beginning and end of graph. 8. Janine s earnings increased at an increasing rate throughout the five years. X X n ,08 9. The data is already in order. The median is the middle value Median 45 Since each value occurs only once, there is no mode Midrange 187 The four measures are completely different. This is probably due to the fact that the handful of very large numbers at the beginning skews the mean and the midrange a lot. 648
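The four measures compared in the answer above (mean, median, mode, and midrange) can be computed side by side to see how a few very large values pull the mean and midrange away from the median. This is a minimal sketch with hypothetical data; the helper names `midrange` and `modes` are assumptions, and the numbers are not those of the exercise.

```python
from collections import Counter
from statistics import mean, median


def midrange(values):
    """Average of the lowest and highest values."""
    return (min(values) + max(values)) / 2


def modes(values):
    """Values that occur most often; empty if every value occurs once."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # each value occurs only once, so there is no mode
    return sorted(v for v, c in counts.items() if c == top)


if __name__ == "__main__":
    # Hypothetical data set with one very large value.
    data = [3, 5, 8, 13, 96]
    print("mean:", mean(data))
    print("median:", median(data))
    print("mode:", modes(data) or "none")
    print("midrange:", midrange(data))
```

With this sample the single large value lifts the mean and the midrange well above the median, the same effect the answer attributes to the handful of very large numbers in the exercise data.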

64 9. Hours Frequency Midpoint Frequency Midpoint X Answers may vary X X n , R highest value lowest value X X X ( X X) , , , , , , , ( X X) variance n 1 77, , s s 10, ,597.5 The range of 66 tells us that the numbers vary from lowest to highest by a good amount compared to their sizes. The standard deviation tells us that overall the numbers are fairly spread out. 1. Answers may vary. 13. The data values are already in order. (a) Value Number of data values below Percentile 193, % or 8 nd percentile 33, % or 41 st percentile (b) 75% of or 17. The value 171,000 corresponds to the 75th percentile because there are 17 values below it. 14. The data is already in order The median is the middle value Q 45 Find the median of the values less than Q. Q 1 7 Find the median of the values above Q. Q (a) Most of the states have Native American populations less than 100,000; the three biggest states are all outliers, which is what skewed the measures of average so much in question 8. The area between z 0 and z 1.95 is

65 (b) (g) (c) The area between z 0 and z 0.40 is (h) The area to the right of z.00 is (d) The area between z 1.30 and z 1.80 is (i) The area to the right of z 1.35 is The area between z 1.05 and z.05 is The area to the left of z.10 is (e) (j) (f) The area between z 0.05 and z 0.55 is The area between z 1.10 and z 1.80 is The area to the left of z 1.70 is ; so the values 135 and 35 represent standard deviations below and above the mean respectively. Since 95% of the data falls within s of the mean there would be 45(0.95) 43 patients that weigh between 135 and 35 pounds. 650

66 value mean (a) z value mean z The area between z 0 and z 0.40 is and the area between z 0 and z 0.40 is The desired area is (90) 7.9 Approximately 8 values will be between 190 and 10. value mean (b) z The area between z 0 and z 1.6 is Since the desired area is to the right of 40 subtract from (a) (90) 4.95 Approximately 5 values are greater than 40. value mean 4 3 z the area between z 0 and z 3 is Since the desired area is in the right tail, subtract from Hence, the probability that it will take more than 4 years is value mean 3 3 (b) z The area less than z 0 is Hence, the probability that it will take less than 3 years is 0.5. value mean (c) z value mean z The area between z 0 and z.4 is The area between z 0 and z 4.5 is about The area between z.4 and z 4.5 is The probability that it will take between 3.8 and 4.5 years is value mean.5 3 (d) z value mean z The area between z 0 and z 1.5 is The area between z 0 and z 0.3 is The area between z 1.5 and z 0.3 is Hence, the probability that it will take between.5 and 3.1 years is value mean (a) z1 4 3 The area between z 0 and z 4 is about The area between z 0 and z.67 is The area between z 4 and z.67 is The probability that the bus will have between 36 and 40 passengers is value mean 4 48 (b) z 3 The area between z 0 and z is Since the desired area is in the left tail, subtract from Hence, the probability that the bus will have fewer than 4 passengers is (c) value mean z 0 3 The area above z 0 is Hence, the probability that the bus will have more than 48 passengers is

67 (d) 3. The area between z 0 and z 1.67 is The area between z 0 and 0. z 0.33 is Since the desired area is between z 1.67 and z 0.33, subtract 0.19 from to get value mean z 0.75 The area between z 0 and z 0.75 is Since the desired area is in the left tail subtract 0.73 from , Hence, 454 will weigh less than 43.5 pounds. value mean z value mean z The area between z 0 and z 0.38 is The area between z 0 and z 0.5 is The desired area is Hence, will cost between $80.00 and $ The weights are approximately normally distributed. The heights aren t even close. There appears to be a negative correlation x y xy x y Σ (174.3) (6)(1.6) r [7(630) (6) ][7(70.94) (1.6) ] For n 7 and 5% level, the value in Table 1-4 is Since r is larger than this value, r is significant at the 5% level. 7(174.3) (6)(1.6) b 0.1 7(630) (6) 1.6 ( 0.1)(6) a The equation of the regression line is y x. y (9) 3.06, so 3.06 is the best predicted GPA for someone who watches 9 hours of TV a week. 4. (a) Answers may vary. (b) r 0.16 is not significant; Answers may vary. 65

Chapter Test

1.
Source    Frequency
W         6
L         7
K         7
E         5

2. Draw the bars with heights corresponding to the frequency.

3. For each source, degrees $= \dfrac{f}{n} \times 360^\circ$ and percent $= \dfrac{f}{n} \times 100\%$, with $n = 25$.

Source    Frequency f    Degrees    Percent
W         6              86.4       24%
L         7              100.8      28%
K         7              100.8      28%
E         5              72         20%

4. a) Class width $= \dfrac{\text{highest value} - \text{lowest value}}{\text{number of classes}}$ with 8 classes, which rounds up to 200. Start with 300 and add 200 to get the lower class limits: 500, 700, 900, 1,100, 1,300, 1,500, 1,700, and 1,900. Set up the classes by subtracting one from each lower class limit except the first lower class limit; this gives the upper class limits. The classes are 300-499, 500-699, 700-899, 900-1,099, 1,100-1,299, 1,300-1,499, 1,500-1,699, and 1,700-1,899. Tally the data in each class to find the frequencies.
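The class-building procedure in item 4 can be sketched in a few lines. The starting limit of 300, the width of 200, and the 8 classes come from the solution; the sample data are hypothetical, since the original data values did not survive, and the function names are illustrative.

```python
def class_limits(first_lower, width, num_classes):
    """Build (lower, upper) class limits; each upper limit is one less
    than the next lower limit."""
    lowers = [first_lower + i * width for i in range(num_classes)]
    return [(lower, lower + width - 1) for lower in lowers]

def frequency_distribution(data, classes):
    """Count how many data values fall in each class.
    Values outside every class are ignored."""
    counts = {c: 0 for c in classes}
    for x in data:
        for lower, upper in classes:
            if lower <= x <= upper:
                counts[(lower, upper)] += 1
                break
    return counts

classes = class_limits(first_lower=300, width=200, num_classes=8)

# Hypothetical data, used only to show the tallying step.
sample = [310, 499, 500, 875, 1250, 1899]
freq = frequency_distribution(sample, classes)

for (lower, upper), count in freq.items():
    print(f"{lower}-{upper}: {count}")
```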

c) There are 30 data values, and 643 is the 22nd data value, so there are 21 data values below it. 21/30 × 100% = 70%. It is the 70th percentile.

d) 20% of 30 is 6. The value $44 million corresponds to the 20th percentile because there are 6 values below it.

e) The values 1,000, 1,400 and 1,850 are outliers.

5. Arrange the data in order. Separate the data according to the first two digits. Use the first two digits as stems and the last digits as leaves, in columns headed Stems and Leaves. The scores range from 200 to 260. Except for a small portion at the lower end and at the higher end, the scores are very much evenly spread out.

6. Represent the years on the x-axis and the wages on the y-axis, and then draw lines connecting the points. The graph shows an increase in the wage for all periods. The two steepest increases were during the jumps from 1975 to 1980 and from 2005 to 2010. Analysis may vary.

7. (a) The mean is the sum of the values divided by the number of values.
(b) Arrange the data in order. The median is the middle value: median = 85.
(c) Since each value occurs only once, there is no mode.
(d) Midrange = (L + H)/2 = 84.
(e) R = highest value − lowest value.
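The percentile reasoning in parts c) and d) runs in two directions: from a position in the ordered data to a percentile, and from a percentile back to a count of values below. A minimal sketch, with hypothetical function names, using the counts from the solution (30 data values, 21 below, and the 20th percentile):

```python
def percentile_rank(values_below, total):
    """Percentile of a data value, given how many values lie below it."""
    return 100 * values_below / total

def count_below_for_percentile(percent, total):
    """Number of values that lie below the given percentile."""
    return percent * total / 100

rank_of_22nd_value = percentile_rank(values_below=21, total=30)
count_for_20th = count_below_for_percentile(percent=20, total=30)

print(rank_of_22nd_value)  # percentile of the 22nd of 30 ordered values
print(count_for_20th)      # values below the 20th percentile
```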

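(f) Set up a table with columns $X$, $X - \bar{X}$, and $(X - \bar{X})^2$.

(g) variance $= \dfrac{\sum (X - \bar{X})^2}{n - 1}$, where $n - 1 = 6$, which gives 17.1. The standard deviation is $s = \sqrt{\text{variance}}$.

For the grouped data, set up a table with columns Errors, Frequency, Midpoint, and Frequency × Midpoint, and use it to find $\bar{X}$.

The areas under the normal curve are:
(a) The area between $z = 0$ and $z = 1.50$.
(b) The area between $z = 1.56$ and $z = 1.96$.
(c) The area between $z = 0.06$ and $z = 0.73$.
(d) The area to the right of $z = 1.8$.
(e) The area to the left of $z = 1.36$.

(a) Using $z = \dfrac{\text{value} - \text{mean}}{\text{standard deviation}}$, the z values for 34 and 35 minutes are $z_1 = 1$ and $z_2 = 1.5$. The desired area is found from the area between $z = 0$ and $z = 1$ and the area between $z = 0$ and $z = 1.5$; it is the probability that it will take between 34 and 35 minutes.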
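The deviation table in (f) and the variance formula in (g) translate directly into code. The sketch below uses a hypothetical data set of seven values (so that $n - 1 = 6$, as in the solution), because the original data did not survive in the transcription.

```python
import math

def sample_variance(data):
    """Sample variance: sum of squared deviations from the mean, over n - 1."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = [(x - mean) ** 2 for x in data]
    return sum(squared_deviations) / (n - 1)

def sample_std_dev(data):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(data))

# Hypothetical data, chosen only so the arithmetic is easy to follow by hand.
data = [2, 4, 4, 5, 7, 9, 11]

variance = sample_variance(data)
std_dev = sample_std_dev(data)
print(variance, round(std_dev, 2))
```

For this sample the mean is 6 and the squared deviations are 16, 4, 4, 1, 1, 9, and 25, which sum to 60; dividing by 6 gives a variance of 10.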


More information

Date Lesson TOPIC HOMEWORK. Displaying Data WS 6.1. Measures of Central Tendency WS 6.2. Common Distributions WS 6.6. Outliers WS 6.

Date Lesson TOPIC HOMEWORK. Displaying Data WS 6.1. Measures of Central Tendency WS 6.2. Common Distributions WS 6.6. Outliers WS 6. UNIT 6 ONE VARIABLE STATISTICS Date Lesson TOPIC HOMEWORK 6.1 3.3 6.2 3.4 Displaying Data WS 6.1 Measures of Central Tendency WS 6.2 6.3 6.4 3.5 6.5 3.5 Grouped Data Central Tendency Measures of Spread

More information

SCHOOL OF BUSINESS, ECONOMICS AND MANAGEMENT BBA240 STATISTICS/ QUANTITATIVE METHODS FOR BUSINESS AND ECONOMICS

SCHOOL OF BUSINESS, ECONOMICS AND MANAGEMENT BBA240 STATISTICS/ QUANTITATIVE METHODS FOR BUSINESS AND ECONOMICS SCHOOL OF BUSINESS, ECONOMICS AND MANAGEMENT BBA240 STATISTICS/ QUANTITATIVE METHODS FOR BUSINESS AND ECONOMICS Unit Two Moses Mwale e-mail: moses.mwale@ictar.ac.zm ii Contents Contents UNIT 2: Numerical

More information

Chapter 2 Organizing Data

Chapter 2 Organizing Data Chapter Organizing Data Section.1 1. Highest Level of Education and Average Annual Household Income (in thousands of dollars) 0 80 60 48.6 6.1 71.0 84.1 40 34.3 0 16.1 0 Ninth grade High school Associate

More information

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures STA 2023 Module 3 Descriptive Measures Learning Objectives Upon completing this module, you should be able to: 1. Explain the purpose of a measure of center. 2. Obtain and interpret the mean, median, and

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Library, Teaching & Learning 014 Summary of Basic data Analysis DATA Qualitative Quantitative Counted Measured Discrete Continuous 3 Main Measures of Interest Central Tendency Dispersion

More information

DAY 52 BOX-AND-WHISKER

DAY 52 BOX-AND-WHISKER DAY 52 BOX-AND-WHISKER VOCABULARY The Median is the middle number of a set of data when the numbers are arranged in numerical order. The Range of a set of data is the difference between the highest and

More information

Lecture 1: Exploratory data analysis

Lecture 1: Exploratory data analysis Lecture 1: Exploratory data analysis Statistics 101 Mine Çetinkaya-Rundel January 17, 2012 Announcements Announcements Any questions about the syllabus? If you sent me your gmail address your RStudio account

More information

Ch6: The Normal Distribution

Ch6: The Normal Distribution Ch6: The Normal Distribution Introduction Review: A continuous random variable can assume any value between two endpoints. Many continuous random variables have an approximately normal distribution, which

More information

Probability and Statistics. Copyright Cengage Learning. All rights reserved.

Probability and Statistics. Copyright Cengage Learning. All rights reserved. Probability and Statistics Copyright Cengage Learning. All rights reserved. 14.6 Descriptive Statistics (Graphical) Copyright Cengage Learning. All rights reserved. Objectives Data in Categories Histograms

More information

Chapter 2. Frequency distribution. Summarizing and Graphing Data

Chapter 2. Frequency distribution. Summarizing and Graphing Data Frequency distribution Chapter 2 Summarizing and Graphing Data Shows how data are partitioned among several categories (or classes) by listing the categories along with the number (frequency) of data values

More information