Topic 5 - Joint distributions and the CLT

Clifford Oliver
5 years ago

1 Topic 5 - Joint distributions and the CLT
- Joint distributions
  - Calculation of probabilities, mean and variance
  - Expectations of functions based on joint distributions
- Central Limit Theorem
- Sampling distributions
  - Of the mean
  - Of totals

2 Often we are interested in more than one random variable at a time. For example, what is the probability that a car will have at least one engine problem and at least one blowout during the same week?
X = number of engine problems in a week
Y = number of blowouts in a week
$P(X \ge 1, Y \ge 1)$ is what we are looking for.
To understand these sorts of probabilities, we need to develop joint distributions.

3 Discrete distributions
A discrete joint probability mass function is given by $f(x,y) = P(X = x, Y = y)$, where
1. $f(x,y) \ge 0$ for all $x, y$
2. $\sum_{\text{all } (x,y)} f(x,y) = 1$
3. $P((X,Y) \in A) = \sum_{(x,y) \in A} f(x,y)$
4. $E(h(X,Y)) = \sum_{\text{all } (x,y)} h(x,y)\, f(x,y)$

4 Return to the car example
Consider the following joint pmf for X and Y:

X\Y    0      1      2      3      4
0      1/2    1/16   1/32   1/32   1/32
1      1/16   1/32   1/32   1/32   1/32
2      1/32   1/32   1/32   1/32   1/32

$P(X \ge 1, Y \ge 1) =$
$P(X \ge 1) =$
$E(X + Y) =$

5 Joint to marginals
The probability mass functions for X and Y individually (called marginals) are given by
$f_X(x) = \sum_{\text{all } y} f(x,y), \quad f_Y(y) = \sum_{\text{all } x} f(x,y)$
Returning to the car example:
$f_X(x) =$
$f_Y(y) =$
$E(X) =$
$E(Y) =$
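The blanks in the car example can be checked with a short script. This is only a sketch: it assumes the joint pmf table has Y running over 0 to 4 and f(0,0) = 1/2 (the reading under which the entries sum to 1), and the helper names `prob` and `expect` are made up for illustration.

```python
from fractions import Fraction as F

# Joint pmf for the car example, one row per value of X.
# ASSUMPTION: Y takes the values 0..4 and f(0, 0) = 1/2.
rows = {
    0: [F(1, 2), F(1, 16), F(1, 32), F(1, 32), F(1, 32)],
    1: [F(1, 16), F(1, 32), F(1, 32), F(1, 32), F(1, 32)],
    2: [F(1, 32), F(1, 32), F(1, 32), F(1, 32), F(1, 32)],
}
joint = {(x, y): p for x, row in rows.items() for y, p in enumerate(row)}

# Property 2: the probabilities must sum to 1.
total_probability = sum(joint.values())


def prob(event):
    """P((X, Y) in A), where A is described by a true/false function."""
    return sum(p for (x, y), p in joint.items() if event(x, y))


def expect(h):
    """E(h(X, Y)) as the pmf-weighted sum of h over all (x, y)."""
    return sum(h(x, y) * p for (x, y), p in joint.items())


p_both = prob(lambda x, y: x >= 1 and y >= 1)
p_engine = prob(lambda x, y: x >= 1)
e_sum = expect(lambda x, y: x + y)

# Marginals: sum the joint pmf over the other variable.
f_X = {x: prob(lambda a, b, x=x: a == x) for x in range(3)}
f_Y = {y: prob(lambda a, b, y=y: b == y) for y in range(5)}
e_X = expect(lambda x, y: x)
e_Y = expect(lambda x, y: y)

print("P(X>=1, Y>=1) =", p_both)
print("P(X>=1) =", p_engine)
print("E(X+Y) =", e_sum)
print("f_X =", f_X)
print("f_Y =", f_Y)
print("E(X) =", e_X, " E(Y) =", e_Y)
```

Exact fractions are used instead of floats so the results can be compared directly with hand calculations.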

6 Continuous distributions
A joint probability density function for two continuous random variables, (X,Y), has the following four properties:
1. $f(x,y) \ge 0$ for all $x, y$
2. $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y)\, dx\, dy = 1$
3. $P((X,Y) \in A) = \iint_A f(x,y)\, dx\, dy$
4. $E(h(X,Y)) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x,y)\, f(x,y)\, dx\, dy$

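7 Continuous example
Consider the following joint pdf:
$f(x,y) = \frac{x(1+3y^2)}{4}, \quad 0 < x < 2,\ 0 < y < 1$
Show condition 2 (total volume is 1) holds on your own.
Show $P(0 < X < 1,\ 1/4 < Y < 1/2) = 23/512$:
$P(0 < X < 1,\ 1/4 < Y < 1/2) = \int_0^1 \int_{1/4}^{1/2} \frac{x(1+3y^2)}{4}\, dy\, dx$
$= \int_0^1 \frac{x}{4} \left[ y + y^3 \right]_{y=1/4}^{y=1/2} dx = \int_0^1 \frac{x}{4} \left[ \frac{5}{8} - \frac{17}{64} \right] dx$
$= \frac{23}{256} \int_0^1 x\, dx = \frac{23}{256} \left[ \frac{x^2}{2} \right]_{x=0}^{x=1} = \frac{23}{256} \left[ \frac{1}{2} - 0 \right] = \frac{23}{512}$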

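8 Joint to marginals
The marginal pdfs for X and Y can be found by
$f_X(x) = \int_{-\infty}^{\infty} f(x,y)\, dy, \quad f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\, dx$
For the previous example, find $f_X(x)$ and $f_Y(y)$.
$f_X(x) = \int_0^1 \frac{x(1+3y^2)}{4}\, dy = \frac{x}{4} \left[ y + y^3 \right]_{y=0}^{y=1} = \frac{x}{4} [2 - 0] = \frac{x}{2}$
$f_Y(y) = \int_0^2 \frac{x(1+3y^2)}{4}\, dx = \frac{1+3y^2}{4} \int_0^2 x\, dx = \frac{1+3y^2}{4} \left[ \frac{x^2}{2} \right]_{x=0}^{x=2} = \frac{1+3y^2}{2}$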
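The double integrals for the continuous example can also be checked numerically. The sketch below uses a plain midpoint rule, which is an illustrative choice and not part of the slides; the helper names `integrate1d` and `integrate2d` and the grid size `n` are assumptions.

```python
def f(x, y):
    """Joint pdf from the continuous example, valid on 0 < x < 2, 0 < y < 1."""
    return x * (1 + 3 * y**2) / 4


def integrate1d(g, a, b, n=400):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def integrate2d(g, x0, x1, y0, y1, n=400):
    """Iterated midpoint rule: integrate over y first, then over x."""
    return integrate1d(lambda x: integrate1d(lambda y: g(x, y), y0, y1, n), x0, x1, n)


# Condition 2: the total volume under the pdf should be 1.
total_volume = integrate2d(f, 0, 2, 0, 1)

# P(0 < X < 1, 1/4 < Y < 1/2), which the slide works out as 23/512.
p_box = integrate2d(f, 0, 1, 0.25, 0.5)

# Marginals at single points, compared with x/2 and (1 + 3y^2)/2.
fx_at_1 = integrate1d(lambda y: f(1.0, y), 0, 1)
fy_at_half = integrate1d(lambda x: f(x, 0.5), 0, 2)

print("total volume:", round(total_volume, 6))
print("P(box):", round(p_box, 6), "vs 23/512 =", round(23 / 512, 6))
print("f_X(1):", round(fx_at_1, 6), " f_Y(0.5):", round(fy_at_half, 6))
```

A numerical check like this does not replace the hand integration, but it catches arithmetic slips in the limits or the antiderivative.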

9 Independence of X and Y
The random variables X and Y are independent if $f(x,y) = f_X(x)\, f_Y(y)$ for all pairs (x,y).
For the discrete clunker car example, are X and Y independent?
For the continuous example, are X and Y independent?
$f(x,y) = \frac{x(1+3y^2)}{4} = \frac{x}{2} \cdot \frac{1+3y^2}{2} = f_X(x)\, f_Y(y)$

10 Sampling distributions
We assume that each data value we collect represents a random selection from a common population distribution. The collection of these independent random variables is called a random sample from the distribution.
A statistic is a function of these random variables that is used to estimate some characteristic of the population distribution. The distribution of a statistic is called a sampling distribution.
The sampling distribution is a key component in making inferences about the population.

11 Statistics used to infer parameters
We take samples and calculate statistics to make inferences about the population parameters.

             Sample       Population
Mean         $\bar{x}$    $\mu$
Std. Dev.    $s$          $\sigma$
Variance     $s^2$        $\sigma^2$
Proportion   $\hat{p}$    $p$

12 StatCrunch example
StatCrunch subscriptions are sold for 6 months ($5) or 12 months ($8). From past data, I can tell you that roughly 80% of subscriptions are $5 and 20% are $8. Let X represent the amount in $ of a purchase.
E(X) =
Var(X) =

13 StatCrunch example continued
Now consider the amounts of a random sample of two purchases, X_1, X_2. A natural statistic of interest is X_1 + X_2, the total amount of the purchases.

Outcomes    X_1 + X_2    Probability
5,5
5,8
8,5
8,8

X_1 + X_2    Probability

14 StatCrunch example continued
E(X_1 + X_2) =
E([X_1 + X_2]^2) =
Var(X_1 + X_2) =
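One way to fill in the StatCrunch tables is to enumerate the four outcomes directly. The sketch below assumes the two purchases are independent (they form a random sample) and uses the 80%/20% split from the example; the variable names are illustrative.

```python
from fractions import Fraction as F
from itertools import product

# pmf of a single purchase amount X: $5 with probability 0.8, $8 with 0.2.
price_pmf = {5: F(4, 5), 8: F(1, 5)}

e_x = sum(x * p for x, p in price_pmf.items())
var_x = sum(x**2 * p for x, p in price_pmf.items()) - e_x**2

# Enumerate the outcomes (5,5), (5,8), (8,5), (8,8). Independence of the
# two purchases means each outcome's probability is the product p1 * p2.
total_pmf = {}
for (x1, p1), (x2, p2) in product(price_pmf.items(), repeat=2):
    total_pmf[x1 + x2] = total_pmf.get(x1 + x2, 0) + p1 * p2

e_total = sum(t * p for t, p in total_pmf.items())
e_total_sq = sum(t**2 * p for t, p in total_pmf.items())
var_total = e_total_sq - e_total**2

print("E(X) =", e_x, " Var(X) =", var_x)
print("pmf of X1 + X2:", total_pmf)
print("E(X1+X2) =", e_total, " E([X1+X2]^2) =", e_total_sq, " Var(X1+X2) =", var_total)
```

The mean and variance of the total come out as exactly twice those of a single purchase, which is the pattern the later slides generalize to n purchases.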

15 StatCrunch example continued
If I have n purchases in a day,
- what are my expected earnings?
- what is the variance of my earnings?
- what is the shape of my earnings distribution for large n?
Let's experiment by simulating 10,000 days with 100 purchases per day using StatCrunch.

16 Simulation instructions
- Data > Simulate data > Binomial
- Specify Rows to be 10000, Columns to be 1, n to be 100 and p to be 0.2. This will give you a new column called Binomial1.
- To compute the total for each day, go to Data > Transform data and enter the expression 8*Binomial1+5*(100-Binomial1). This will add a new column to the data table.
- Make a histogram and set the bin width to 1 for best results.
- For the new sum column, do a histogram and a QQ plot. Both should verify normality!
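For readers without StatCrunch, the same experiment can be sketched in Python with only the standard library. The seed and variable names are arbitrary assumptions; the transform 8*B + 5*(100 - B) is the one given in the instructions.

```python
import random
import statistics

rng = random.Random(5)  # fixed seed (arbitrary) so the run is repeatable

DAYS = 10_000      # rows in the StatCrunch simulation
PURCHASES = 100    # binomial n
P_EIGHT = 0.2      # probability that a purchase is the $8 subscription

totals = []
for _ in range(DAYS):
    # Number of $8 purchases that day: a Binomial(100, 0.2) draw.
    n_eight = sum(rng.random() < P_EIGHT for _ in range(PURCHASES))
    # Same transform as in StatCrunch: 8*Binomial1 + 5*(100 - Binomial1).
    totals.append(8 * n_eight + 5 * (PURCHASES - n_eight))

sim_mean = statistics.fmean(totals)
sim_var = statistics.pvariance(totals)

# Theory for a sum of 100 purchases with E(X) = 5.6 and Var(X) = 1.44.
print("simulated mean:", round(sim_mean, 2), " theory:", 100 * 5.6)
print("simulated variance:", round(sim_var, 2), " theory:", 100 * 1.44)
```

Plotting a histogram of `totals` (with any plotting tool) should show the same bell shape as the StatCrunch histogram.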

17 Should result in a dataset like this: [figure: StatCrunch data table]

18 Central Limit Theorem
We have just illustrated one of the most important theorems in statistics.
As the sample size, n, becomes large, the distribution of the sum of a random sample from a distribution with mean $\mu$ and variance $\sigma^2$ is approximately Normal with mean $n\mu$ and variance $n\sigma^2$.
A sample size of at least 30 is typically required to use the CLT (a rule of thumb that is debated in the general statistics community).
The amazing part of this theorem is that it holds regardless of the form of the underlying distribution, as long as its variance is finite.

19 Airplane example
Suppose the weight of an airline passenger has a mean of 150 lbs. and a standard deviation of 25 lbs.
What is the probability the combined weight of 100 passengers will exceed the maximum allowable weight of 15,500 lbs?
How many passengers should be allowed on the plane if we want this probability to be at most 0.01?

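20 What are the probabilities at n = 99?
The mean is $99 \times 150 = 14850$.
The variance is $99 \times 25^2 = 61875$.
The standard deviation is $\sqrt{61875} \approx 248.75$.
$P(X_{TOT} > 15500) = P\left(Z > \frac{15500 - 14850}{248.75}\right) \approx P(Z > 2.61) \approx 0.0045$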
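The airplane calculation can be reproduced with the normal approximation from the CLT. This is a sketch under that approximation, using `statistics.NormalDist` from the Python standard library; the function name `p_overweight` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

MU = 150        # mean passenger weight (lbs)
SIGMA = 25      # standard deviation of passenger weight (lbs)
LIMIT = 15_500  # maximum allowable combined weight (lbs)


def p_overweight(n):
    """CLT approximation of P(total weight of n passengers > LIMIT)."""
    total = NormalDist(mu=n * MU, sigma=SIGMA * sqrt(n))
    return 1 - total.cdf(LIMIT)


p_100 = p_overweight(100)
p_99 = p_overweight(99)

# The probability grows with n, so step n up until the 0.01 cap is exceeded.
max_n = 1
while p_overweight(max_n + 1) <= 0.01:
    max_n += 1

print("P(overweight) at n = 100:", round(p_100, 4))
print("P(overweight) at n = 99:", round(p_99, 4))
print("largest n with probability at most 0.01:", max_n)
```

With 100 passengers the probability is above 0.01, and dropping a single passenger brings it below, which is why the slide looks at n = 99.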

21 The distribution of the sample means
For constant c, $E(cY) = cE(Y)$ and $Var(cY) = c^2 Var(Y)$.
$Var(\bar{X}) = Var\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) = \frac{1}{n^2} \sum_{i=1}^{n} Var(X_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$
The CLT says that for large samples, $\bar{X}$ is approximately normal with a mean of $\mu$ and a variance of $\sigma^2/n$. So, the variance of the sample mean decreases with n.

22 What is the probability that we get a sample average at some level?
If the parent population is assumed to have a mean of 150 lbs. and a standard deviation of 25 lbs., what's the probability we get a sample average below 141 with a sample size of 30?
For the sampling distribution, the mean is 150 and the standard deviation is $25/\sqrt{30} \approx 4.56$.
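The sample-average question can be answered the same way, using the CLT result that the sample mean is approximately normal with standard deviation sigma divided by the square root of n. A minimal sketch, assuming the normal approximation is adequate at n = 30:

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA, N = 150, 25, 30

# Standard deviation of the sample mean (the standard error): sigma / sqrt(n).
std_error = SIGMA / sqrt(N)

# Sampling distribution of the mean under the CLT approximation.
sample_mean_dist = NormalDist(mu=MU, sigma=std_error)
p_below_141 = sample_mean_dist.cdf(141)

print("standard error:", round(std_error, 3))
print("P(sample mean < 141):", round(p_below_141, 4))
```

The result is roughly 0.024, so a sample average that low would be unusual if the population mean really is 150 lbs.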

23 Sampling distribution applet
In StatCrunch, go to the Applets tab and click on sampling distributions. It demonstrates how, for any parent distribution, the sampling distribution of the mean converges to normal as the sample size grows. The closer the parent is to symmetric, the quicker the sampling distribution will converge.
The additional file for Topic 5 has discussion and examples on both sampling distributions and joint probability distributions. There are also additional examples of double integration.
