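Math 464, Fall 2012
Notes on Marginal and Conditional Densities
klin@math.arizona.edu
October 18, 2012

Marginal densities. Suppose you have 3 continuous random variables $X$, $Y$, and $Z$, with joint density $f(x,y,z)$. The marginal density for $X$ is the PDF of $X$. In class we gave a formula for this. It is just

$$f_X(x) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y,z)\,dy\,dz \tag{1}$$

Similarly, the PDFs for $Y$ and $Z$ are

$$f_Y(y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y,z)\,dx\,dz$$

$$f_Z(z) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y,z)\,dx\,dy$$

Notice that to get the density for one of the variables, what we do is integrate over all the other variables.

This generalizes naturally to $n$ continuous random variables $X_1, X_2, \ldots, X_n$ with joint density $f(x_1, x_2, \ldots, x_n)$: we define the $i$th marginal density of $f$ to be the PDF of $X_i$. The formula is

$$f_{X_i}(x) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \ldots, x_{i-1}, x, x_{i+1}, \ldots, x_n)\,dx_1 \cdots dx_{i-1}\,dx_{i+1} \cdots dx_n \tag{2}$$

Notice that we integrate exactly $n-1$ times.

We gave a derivation of this formula in class. Here it is again, for $n = 3$. To get the marginal density of $X$, the basic idea is to first find the CDF $F_X$, then differentiate it. Here goes: let $X$, $Y$, $Z$, and $f$ be as above, and let $x$ be any real number. We start with the observation that

$$F_X(x) = P(X \le x)$$
$$\phantom{F_X(x)} = P\big((X,Y,Z) \in (-\infty, x] \times \mathbb{R} \times \mathbb{R}\big)$$

The 2nd line just says that we want the probability of the event that $X \le x$ and that we don't care about the values of $Y$ and $Z$. From this and the definition of the joint density, we get (writing $t$ for the dummy variable of integration, to keep it distinct from the limit $x$)

$$F_X(x) = \int_{-\infty}^{x} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(t,y,z)\,dy\,dz\,dt$$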
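Differentiating, we get[1]

$$\frac{d}{dx} F_X(x) = \frac{d}{dx} \left( \int_{-\infty}^{x} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(t,y,z)\,dy\,dz\,dt \right)$$
$$\phantom{\frac{d}{dx} F_X(x)} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{d}{dx} \left( \int_{-\infty}^{x} f(t,y,z)\,dt \right) dy\,dz$$

where $t$ is the dummy variable of integration. By the fundamental theorem of calculus, we have

$$\frac{d}{dx} \left( \int_{-\infty}^{x} f(t,y,z)\,dt \right) = f(x,y,z)$$

So

$$\frac{d}{dx} F_X(x) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y,z)\,dy\,dz$$

as desired.

[1] This derivation is not meant to be a proof. To prove the formula along these lines, we would have to justify exchanging the order of differentiation and integration. That belongs in a course on real analysis.

Example. In the darts example with radius 1, the PDF is

$$f(x,y) = \begin{cases} 1/\pi, & x^2 + y^2 \le 1 \\ 0, & \text{otherwise} \end{cases}$$

If we want the PDF for the $X$ variable alone, this is given by

$$f_X(x) = \int_{-\infty}^{\infty} f(x,y)\,dy$$
$$\phantom{f_X(x)} = \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} \frac{1}{\pi}\,dy = \frac{2}{\pi}\sqrt{1-x^2} \quad \text{if } -1 \le x \le 1; \quad 0 \text{ otherwise.}$$

The first line is just the formula for the marginal of $X$. To go from the first to the second line, it is important to realize that $f(x,y)$ is nonzero if and only if $(x,y)$ lies on or inside the unit circle, so for a fixed $x$ in $[-1,1]$ only the values $-\sqrt{1-x^2} \le y \le \sqrt{1-x^2}$ contribute.

Conditional density, book's version. Grinstead and Snell gave the following definition of conditional density: Let $X$ be a random variable with PDF $f$, and let $E$ be an event with $P(E) > 0$. Then we define the conditional density of $X$ given that $E$ has occurred to be

$$f(x \mid E) = f(x)/P(E)$$

for $x$ in $E$, and $f(x \mid E) = 0$ otherwise.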
This extends naturally to 2 or more random variables. For example, if $f(x,y)$ is the joint density for $X$ and $Y$, and $E$ is an event, then the conditional density of $(X,Y)$ given $E$ is

$$f(x,y \mid E) = f(x,y)/P(E)$$

for $(x,y)$ in $E$, and 0 otherwise. The key thing about these formulas is that they make sense only when $P(E) > 0$. If $P(E) = 0$, then we would be dividing by 0.

As an example, in the darts example, if we let $E$ be the event "the dart lands in the first quadrant," then $P(E) = 1/4$ and we have

$$f(x,y \mid E) = f(x,y)/P(E) = \begin{cases} 4/\pi, & x \ge 0 \text{ and } y \ge 0 \text{ and } x^2 + y^2 \le 1 \\ 0, & \text{otherwise} \end{cases}$$

Important! A key property of $f(x \mid E)$ is that it is still a probability density. So if we want the conditional probability $P(A \mid E)$ for some event $A$, we would evaluate $\int_A f(x \mid E)\,dx$. A consequence is that

$$\int_{-\infty}^{\infty} f(x \mid E)\,dx = 1.$$

Conditional density of Y given X=x. A closely related concept is the conditional density of one random variable $Y$ given that a second random variable $X$ has a certain value $x$. This is different from, but related to, the concept of conditional density given in the book. Both notions are useful, but in different ways.

If $X$ and $Y$ are two continuous random variables and $x$ is a real number with $f_X(x) > 0$, the conditional density of $Y$ given $X = x$ is defined to be

$$f_{Y|X}(y \mid x) = \frac{f(x,y)}{f_X(x)}. \tag{3}$$

The key property of this is that it is a PDF: if we want the probability of an event $A$ given $X = x$, then

$$P(A \mid X = x) = \int_A f_{Y|X}(y \mid x)\,dy$$

(pay attention to what variable is being integrated, and what's being left alone). In particular, we have

$$\int_{-\infty}^{\infty} f_{Y|X}(y \mid x)\,dy = 1. \tag{4}$$

The reason we need a separate definition when conditioning on $X = x$ is that the event $(X = x)$ has probability 0. So there is
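no way we can use the book's definition (or something like it) to define the conditional density.

In class I gave a motivation for Eq. (3). Here it is, in a more general form: let $\delta > 0$ be any number, and $[a,b]$ be any interval. Then

$$P(\underbrace{a \le Y \le b}_{\text{call this } A} \mid \underbrace{x - \delta \le X \le x + \delta}_{\text{call this } B}) = P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

We have (writing $t$ for the dummy variable that runs over the values of $X$)

$$P(A \cap B) = \int_a^b \left[ \int_{x-\delta}^{x+\delta} f(t,y)\,dt \right] dy$$

We can approximate the term in the square brackets by the rectangle rule, to get

$$\int_{x-\delta}^{x+\delta} f(t,y)\,dt \approx 2\delta\, f(x,y)$$

so that

$$P(A \cap B) \approx 2\delta \int_a^b f(x,y)\,dy$$

We can treat $P(B)$ similarly:

$$P(B) = \int_{x-\delta}^{x+\delta} \left[ \int_{-\infty}^{\infty} f(t,y)\,dy \right] dt = \int_{x-\delta}^{x+\delta} f_X(t)\,dt \approx 2\delta\, f_X(x)$$

So we get

$$P(a \le Y \le b \mid x - \delta \le X \le x + \delta) = \frac{P(A \cap B)}{P(B)} \approx \frac{2\delta \int_a^b f(x,y)\,dy}{2\delta\, f_X(x)}$$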
One can show that in the limit as $\delta \to 0$, the $\approx$ becomes exact, so we get

$$P(a \le Y \le b \mid X = x) = \frac{\int_a^b f(x,y)\,dy}{f_X(x)}$$

In particular, for any real number $y$, we have

$$P(Y \le y \mid X = x) = \frac{\int_{-\infty}^{y} f(x,s)\,ds}{f_X(x)}$$

Differentiating with respect to $y$ gives us Eq. (3). (Notice that the variable $x$ should be treated as a constant in the above.)

Finally, there is a second way to see why the definition in Eq. (3) is reasonable: suppose we want to define $f_{Y|X}(y \mid x)$. What properties should it have? At the minimum, it should be a PDF, i.e., Eq. (4) should hold. A second property we might insist on is that for each given value of $x$, $f_{Y|X}(y \mid x)$ should be proportional to $f(x,y)$ as a function of $y$, so that events have the right relative frequency.[2] With these conditions, we have

$$f_{Y|X}(y \mid x) = C_x\, f(x,y)$$

where $C_x$ is a constant (it really depends on $x$) to be chosen so that Eq. (4) holds. It is easy to check (you should do it!) that $C_x = 1/f_X(x)$.

[2] This statement will take some digesting...
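The formulas above can be sanity-checked numerically on the darts example. The sketch below is not part of the original notes: the function names (`joint_density`, `midpoint_integral`, `marginal_x`, `make_conditional`) and the test point `x0 = 0.5` are illustrative choices, and the midpoint rule is just one simple way to approximate the integrals. It checks three things: that integrating the joint density over $y$ reproduces $\frac{2}{\pi}\sqrt{1-x^2}$, that the conditional density from Eq. (3) integrates to 1 as in Eq. (4), and that $P(Y \ge 0 \mid X = x_0)$ comes out close to $1/2$, as symmetry suggests.

```python
import math

def joint_density(x, y):
    """Uniform density on the unit disk (darts example, radius 1)."""
    return 1.0 / math.pi if x * x + y * y <= 1.0 else 0.0

def midpoint_integral(g, lo, hi, n=20000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def marginal_x(x):
    """f_X(x): integrate the joint density over y.
    The density vanishes for |y| > 1, so [-1, 1] covers everything."""
    return midpoint_integral(lambda y: joint_density(x, y), -1.0, 1.0)

def make_conditional(x):
    """Return y -> f_{Y|X}(y | x) = f(x, y) / f_X(x), as in Eq. (3).
    Assumes f_X(x) > 0; the marginal is computed once and reused."""
    fx = marginal_x(x)
    return lambda y: joint_density(x, y) / fx

x0 = 0.5  # arbitrary test point with f_X(x0) > 0 (an assumption of this sketch)

numeric_marginal = marginal_x(x0)
closed_form = 2.0 * math.sqrt(1.0 - x0 * x0) / math.pi

conditional = make_conditional(x0)
# Eq. (4): the conditional density should integrate to 1 over y.
total_mass = midpoint_integral(conditional, -1.0, 1.0, n=5000)
# P(Y >= 0 | X = x0): integrate the conditional density over [0, 1].
prob_upper = midpoint_integral(conditional, 0.0, 1.0)

print("numeric f_X(x0):  ", numeric_marginal)
print("closed-form f_X:  ", closed_form)
print("total mass:       ", total_mass)
print("P(Y >= 0 | X=x0): ", prob_upper)
```

Because the integrand jumps at the edge of the disk, the midpoint rule only converges at first order here, so the printed values should agree with the exact ones to a few decimal places rather than to machine precision. Dividing by `fx` inside `make_conditional` is exactly the choice $C_x = 1/f_X(x)$.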