Describing data with graphics and numbers

Types of data
Categorical variables: also known as class variables or nominal variables.
Quantitative variables: also known as numerical variables; either continuous or discrete.

Graphing categorical variables

Bar graphs
[Figure: bar graph of the ten most common causes of death in Americans between 15 and 19 years old in 1999.]

Graphing numerical variables

Stem-and-leaf plot
[Figure: stem-and-leaf plot of heights of BIOL 3 students (cm).]

Frequency table
[Table: number of students in each height group (cm).]
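A frequency table like the one above groups each measurement into a fixed-width interval and counts the members of each group. The sketch below shows one way to do that. The heights are hypothetical values invented for illustration (not the class data), and the function name `frequency_table` and the 10 cm group width are assumptions.

```python
from collections import Counter

# Hypothetical heights in cm, invented for illustration (not the class data).
heights = [165, 168, 163, 173, 170, 155, 152, 190, 142, 181]


def frequency_table(values, width=10):
    """Count values in left-closed groups [lower, lower + width)."""
    counts = Counter((v // width) * width for v in values)
    lowest, highest = min(counts), max(counts)
    # Include empty groups so the table has no gaps between groups.
    return [
        (lower, lower + width, counts.get(lower, 0))
        for lower in range(lowest, highest + width, width)
    ]


for lower, upper, count in frequency_table(heights):
    print(f"{lower}-{upper}: {count}")
```

The same counts are what a histogram draws as bar heights, so changing `width` shows how the choice of group size changes the picture.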
Histogram
[Figure: histogram.]

Histogram with more data
[Figure: histogram.]

Cumulative distribution
[Figure: cumulative relative frequency against height (in cm) of Bio3 students.]

Cumulative distribution
[Figure: cumulative relative frequency against height (in cm) of Bio3 students, with the 50th percentile (median) and the 90th percentile marked.]

Associations between two categorical variables

Association between reproductive effort and avian malaria

Table 2.3A. Contingency table showing incidence of malaria in female great tits subjected to experimental egg removal.

                Control group   Egg removal group   Row total
Malaria                     7                  15          22
No malaria                 28                  15          43
Column total               35                  30          65

Mosaic plot
[Figure: relative frequency of malaria in the control and egg removal treatments.]
Figure 2.3B. Mosaic plot for reproductive effort and avian malaria in great tits (Table 2.3A). Blue fill indicates diseased birds whereas the white fill indicates birds free of malaria. n = 65 birds.

Grouped bar graph
[Figure: grouped bar graph of the number of birds with malaria and with no malaria in the control and egg removal groups.]
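The mosaic plot shows, for each treatment, the fraction of birds with and without malaria. As a minimal sketch, those relative frequencies can be computed from the counts in Table 2.3A, read here as 7 and 28 birds in the control group and 15 and 15 in the egg removal group. The dictionary layout and the function name `relative_frequencies` are assumptions made for this example.

```python
# Counts read from Table 2.3A: treatment -> (malaria, no malaria).
table = {
    "Control": (7, 28),
    "Egg removal": (15, 15),
}


def relative_frequencies(counts):
    """Divide each count by the group total so the fractions sum to 1."""
    total = sum(counts)
    return tuple(count / total for count in counts)


for group, counts in table.items():
    malaria, no_malaria = relative_frequencies(counts)
    print(f"{group}: malaria {malaria:.2f}, no malaria {no_malaria:.2f}")
```

Dividing within each treatment, rather than by the grand total of 65, is what makes the two columns of the mosaic plot directly comparable.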
Associations between categorical and numerical variables

Multiple histograms
[Figure: two histograms, panels "Non-conserved" and "Conserved"; x-axis: protein length.]

Associations between two numerical variables

Scatterplots

Evaluating graphics
Don't mislead with graphics
  Lie factor
  Chartjunk
Better representation of truth

Lie factor
Lie factor = (size of effect shown in graphic) / (size of effect in data)

Lie factor example
Effect in graphic: 2.33 / 0.08 = 29.1
Effect in data: 6748 / 5844 = 1.15
Lie factor = 29.1 / 1.15 = 25.3
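The lie factor arithmetic can be checked with a short sketch. It reads the slide's figures as 2.33 / 0.08 for the graphic and 6748 / 5844 for the data, and rounds each intermediate effect the way the slide does. The function name `lie_factor` is an assumption.

```python
def lie_factor(effect_in_graphic, effect_in_data):
    """Ratio of the effect drawn in the graphic to the effect in the data."""
    return effect_in_graphic / effect_in_data


# Round each effect to the precision used on the slide before dividing.
effect_in_graphic = round(2.33 / 0.08, 1)
effect_in_data = round(6748 / 5844, 2)

print(effect_in_graphic, effect_in_data)
print(round(lie_factor(effect_in_graphic, effect_in_data), 1))
```

A lie factor of 1 means the graphic shows the effect at its true size. Values far above 1, as here, mean the drawing exaggerates the effect.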
Chartjunk

Needless 3D graphics

Great book on graphics

Summary: Graphical methods for frequency distributions

Type of data       Method
Categorical data   Bar graph
Numerical data     Histogram
                   Cumulative frequency distribution

Summary: Associations between variables

                                 Explanatory variable
Response variable   Categorical                          Numerical
Categorical         Contingency table
                    Grouped bar graph
                    Mosaic plot
Numerical           Multiple histograms                  Scatter plot
                    Cumulative frequency distributions

Describing data

Two common descriptions of data
Location (or central tendency)
Width (or spread)

Measures of location
Mean
Median
Mode
Mean

$$\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n}$$

n is the size of the sample.

Example: Y_1 = 56, Y_2 = 72, Y_3 = 18, Y_4 = 42
$$\bar{Y} = (56 + 72 + 18 + 42) / 4 = 47$$

Median
The median is the middle measurement in a set of ordered data.
The data: 18 28 24 25 36 14 34
can be put in order: 14 18 24 25 28 34 36
Median is 25.

[Figure: histogram with the mean, median, and mode marked. Mouse weight at 5 days old, in a line selected for small size.]

Mean vs. median in politics
2004 U.S. economy
Republicans: times are good. Mean income increasing ~4% per year.
Democrats: times are bad. Median family income fell.
Why?

[Figure: cumulative distribution of height (in cm) of Bio3 students. Mean 169.3 cm, median 170 cm, mode 165-170 cm.]

Measures of width
Range
Standard deviation
Variance
Coefficient of variation
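The mean and median examples earlier in this section can be reproduced with Python's standard `statistics` module. This sketch reads the mean example as 56, 72, 18, 42 and the median example as 18, 28, 24, 25, 36, 14, 34. The income figures at the end are hypothetical, invented only to suggest why a mean and a median can move in different directions.

```python
from statistics import mean, median

# Worked mean example, reading the third value as 18.
sample = [56, 72, 18, 42]
print(mean(sample))

# Median example: the middle value once the data are put in order.
data = [18, 28, 24, 25, 36, 14, 34]
print(sorted(data))
print(median(data))

# Hypothetical incomes in arbitrary units, invented for illustration only.
incomes = [30, 35, 40, 45, 50]
incomes_with_outlier = [30, 35, 40, 45, 500]
print(mean(incomes), median(incomes))
print(mean(incomes_with_outlier), median(incomes_with_outlier))
```

In the hypothetical incomes, one very large value raises the mean from 40 to 130 while the median stays at 40. That sensitivity of the mean to extreme values is one plausible answer to the "Why?" in the politics example.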
Range
For the 15 ordered measurements shown, which run from 14 to 36:
The range is 36 - 14 = 22

Variance

Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (Y_i - \mu)^2}{N}$$

Sample variance:
$$s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}$$

n is the sample size.

Shortcut for calculating sample variance:
$$s^2 = \frac{n}{n - 1} \left( \frac{\sum_{i=1}^{n} Y_i^2}{n} - \bar{Y}^2 \right)$$

Standard deviation (SD)
Positive square root of the variance.
$\sigma$ is the true standard deviation.
$s$ is the sample standard deviation.

In-class exercise
Calculate the variance and standard deviation of a sample with the following data: 6, 1, 2

Answer
Variance = 7
Standard deviation = $\sqrt{7}$

Coefficient of variation (CV)
$$\mathrm{CV} = \frac{s}{\bar{Y}}$$
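The in-class exercise can be checked numerically. The sketch below reads the exercise data as 6, 1, 2. It computes the sample variance from the definition and from the shortcut formula, then the standard deviation and the coefficient of variation taken as s divided by the sample mean. The variable names are assumptions.

```python
from math import sqrt
from statistics import mean, variance

# Exercise data, read as 6, 1, 2.
data = [6, 1, 2]
n = len(data)
y_bar = mean(data)

# Definition: squared deviations from the sample mean, divided by n - 1.
s2 = sum((y - y_bar) ** 2 for y in data) / (n - 1)

# Shortcut: n / (n - 1) times (mean of the squares minus the squared mean).
s2_shortcut = (n / (n - 1)) * (sum(y * y for y in data) / n - y_bar ** 2)

s = sqrt(s2)     # sample standard deviation
cv = s / y_bar   # coefficient of variation as a proportion of the mean

print(s2, s2_shortcut, variance(data))
print(s, cv)
```

The definition and the shortcut agree up to floating-point rounding, and both match `statistics.variance`, which also divides by n - 1.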
Equal means, different variances
[Figure: distributions with equal means and different variances (V); x-axis: value.]

Manipulating means
The mean of the sum of two variables:
E[X + Y] = E[X] + E[Y]
The mean of the sum of a variable and a constant:
E[X + c] = E[X] + c
The mean of a product of a variable and a constant:
E[c X] = c E[X]
The mean of a product of two variables:
E[X Y] = E[X] E[Y] if X and Y are independent.

Manipulating variance
The variance of the sum of two variables:
Var[X + Y] = Var[X] + Var[Y] if X and Y are independent.
The variance of the sum of a variable and a constant:
Var[X + c] = Var[X]
The variance of a product of a variable and a constant:
Var[c X] = c^2 Var[X]

Parents' heights

                                Mean    Variance
Father height                   174.3   7.7
Mother height                   160.4   58.3
Father height + Mother height   334.7   84.9
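The rules for manipulating means and variances can be verified exactly on a small example. In this sketch, X and Y are hypothetical discrete variables with equally likely values. Independence is built in by treating every (x, y) pair as equally likely. The values and the constant c are arbitrary choices for illustration, and population formulas (`pvariance`) are used because the rules concern true means and variances.

```python
from itertools import product
from math import isclose
from statistics import fmean, pvariance

# Hypothetical equally likely values for two independent variables.
x = [1, 2, 6]
y = [0, 4]
c = 3

# Independence: every (x, y) combination is equally likely.
pairs = list(product(x, y))
sums = [a + b for a, b in pairs]
products = [a * b for a, b in pairs]

# Rules for means.
assert isclose(fmean(sums), fmean(x) + fmean(y))
assert isclose(fmean([a + c for a in x]), fmean(x) + c)
assert isclose(fmean([c * a for a in x]), c * fmean(x))
assert isclose(fmean(products), fmean(x) * fmean(y))

# Rules for variances.
assert isclose(pvariance(sums), pvariance(x) + pvariance(y))
assert isclose(pvariance([a + c for a in x]), pvariance(x))
assert isclose(pvariance([c * a for a in x]), c ** 2 * pvariance(x))

print("All rules hold for this example.")
```

If the pairs were not all equally likely, for example if large x values tended to occur with large y values, the product rule for means and the sum rule for variances would generally fail, while the other rules would still hold.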