Joint probability distributions

Let X and Y be two discrete rv's defined on the sample space S. The joint probability mass function of X and Y is $p(x, y) = P(X = x, Y = y)$. Note that $p(x, y) \ge 0$ and $\sum_x \sum_y p(x, y) = 1$. For any set $A$,
$$P[(X, Y) \in A] = \sum_{(x,y) \in A} p(x, y).$$
The marginal probability mass functions of X and Y, denoted by $p_X(x)$ and $p_Y(y)$, are
$$p_X(x) = \sum_y p(x, y), \qquad p_Y(y) = \sum_x p(x, y).$$

Example

Let X = the number of meals in restaurants and Y = the number of movies a randomly selected JMU student has on a typical weekend. The joint pmf of X and Y is given by

p(x, y)   y = 0   y = 1   y = 2   y = 3
x = 0      .05     .12     .10     .08
x = 1      .06     .12     .11     .09
x = 2      .03     .06     .05     .05
x = 3      .03     .02     .02     .01

$$P(Y \ge 2) = p(0,2) + p(1,2) + p(2,2) + p(3,2) + p(0,3) + p(1,3) + p(2,3) + p(3,3) = .51.$$

The marginal distributions are
$$p_X(0) = .35, \quad p_X(1) = .38, \quad p_X(2) = .19, \quad p_X(3) = .08,$$
$$p_Y(0) = .17, \quad p_Y(1) = .32, \quad p_Y(2) = .28, \quad p_Y(3) = .23.$$
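The table invites a quick mechanical check. Below is a minimal Python sketch (not part of the original notes; names like `pmf` are my own) that stores the joint pmf as a dictionary, verifies it sums to 1, and recomputes $P(Y \ge 2)$ and the marginals.

```python
# Joint pmf of (X, Y) from the table above, stored as {(x, y): p(x, y)}.
pmf = {
    (0, 0): .05, (0, 1): .12, (0, 2): .10, (0, 3): .08,
    (1, 0): .06, (1, 1): .12, (1, 2): .11, (1, 3): .09,
    (2, 0): .03, (2, 1): .06, (2, 2): .05, (2, 3): .05,
    (3, 0): .03, (3, 1): .02, (3, 2): .02, (3, 3): .01,
}
assert abs(sum(pmf.values()) - 1) < 1e-9        # total probability is 1

# P(Y >= 2): add p(x, y) over every pair with y >= 2.
print(sum(p for (x, y), p in pmf.items() if y >= 2))   # 0.51

# Marginals: p_X sums over y, p_Y sums over x.
p_X = {x: sum(p for (a, y), p in pmf.items() if a == x) for x in range(4)}
p_Y = {y: sum(p for (x, b), p in pmf.items() if b == y) for y in range(4)}
print(p_X)   # {0: .35, 1: .38, 2: .19, 3: .08}, up to float rounding
print(p_Y)   # {0: .17, 1: .32, 2: .28, 3: .23}, up to float rounding
```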

Exercise

Let X = the number of heads and Y = the number of heads minus the number of tails obtained in 3 flips of a balanced coin. Find the joint probability distribution of X and Y.
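If you want to check your table after working it by hand, the eight equally likely outcomes can be enumerated directly. A sketch (not the intended pencil-and-paper route):

```python
from collections import Counter
from itertools import product

# Enumerate the 2^3 = 8 equally likely flip sequences.
joint = Counter()
for flips in product("HT", repeat=3):
    heads = flips.count("H")
    tails = 3 - heads
    joint[(heads, heads - tails)] += 1 / 8   # (X, Y) for this outcome

for (x, y), p in sorted(joint.items()):
    print(f"p({x}, {y}) = {p}")
# Since Y = 2X - 3, the support is the diagonal (0,-3), (1,-1), (2,1), (3,3),
# with probabilities 1/8, 3/8, 3/8, 1/8.
```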

Joint pdf

Let X and Y be continuous rv's. Then $f(x, y)$ is the joint probability density function for X and Y if for any two-dimensional set $A$,
$$P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy.$$
In particular,
$$P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f(x, y)\,dy\,dx.$$
The marginal probability density functions of X and Y, denoted by $f_X(x)$ and $f_Y(y)$, are given by
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \quad \text{and} \quad f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx.$$

Example*

Given the joint pdf
$$f(x, y) = \begin{cases} \frac{2}{3}(x + 2y), & 0 < x < 1,\ 0 < y < 1 \\ 0, & \text{otherwise} \end{cases}$$

$$P(0 < X < .3,\ 0 < Y < .5) = \int_0^{.3}\!\!\int_0^{.5} \tfrac{2}{3}(x + 2y)\,dy\,dx = \int_0^{.3} \left(\tfrac{2}{3}xy + \tfrac{2}{3}y^2\right)\Big|_{y=0}^{.5}\,dx = \int_0^{.3} \left(\tfrac{1}{3}x + \tfrac{1}{6}\right)dx = \left(\tfrac{x^2}{6} + \tfrac{x}{6}\right)\Big|_0^{.3} = .065$$

$$f_X(x) = \int_0^1 \tfrac{2}{3}(x + 2y)\,dy = \left(\tfrac{2}{3}xy + \tfrac{2}{3}y^2\right)\Big|_{y=0}^{1} = \tfrac{2}{3}(x + 1), \quad 0 < x < 1$$

$$f_Y(y) = \int_0^1 \tfrac{2}{3}(x + 2y)\,dx = \tfrac{1}{3}(1 + 4y), \quad 0 < y < 1.$$
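The integrals above are easy to confirm numerically. A sketch using scipy (my own addition, not a tool the notes use; note that `dblquad` integrates its first argument as $f(y, x)$ with $y$ as the inner variable):

```python
from scipy.integrate import dblquad

f = lambda y, x: (2/3) * (x + 2*y)   # the joint pdf on (0,1) x (0,1)

# x over (0, .3), y over (0, .5)
p, _ = dblquad(f, 0, 0.3, lambda x: 0, lambda x: 0.5)
print(p)       # 0.065

# A joint pdf must integrate to 1 over its support.
total, _ = dblquad(f, 0, 1, lambda x: 0, lambda x: 1)
print(total)   # 1.0
```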

Exercise

Given the joint pdf
$$f(x, y) = \begin{cases} \frac{3}{5}x(y + x), & 0 < x < 1,\ 0 < y < 2 \\ 0, & \text{otherwise} \end{cases}$$
Find $P(0 < X < \tfrac{1}{2},\ 0 < Y < 1)$. Find $f_Y(y)$.

$$P(0 < X < \tfrac{1}{2},\ 0 < Y < 1) = \int_0^{.5}\!\!\int_0^1 \tfrac{3}{5}x(y + x)\,dy\,dx = \int_0^{.5} \tfrac{3}{5}x\left(\tfrac{y^2}{2} + xy\right)\Big|_{y=0}^{1}\,dx = \int_0^{.5} \left(\tfrac{3}{10}x + \tfrac{3}{5}x^2\right)dx = \left(\tfrac{3}{20}x^2 + \tfrac{1}{5}x^3\right)\Big|_0^{.5} = \tfrac{5}{80} = .0625.$$

$$f_Y(y) = \int_0^1 \tfrac{3}{5}x(y + x)\,dx = \left(\tfrac{3}{5}y\,\tfrac{x^2}{2} + \tfrac{1}{5}x^3\right)\Big|_{x=0}^{1} = \tfrac{3}{10}y + \tfrac{1}{5}, \quad 0 < y < 2.$$

Check: $\int_0^2 \left(\tfrac{3}{10}y + \tfrac{1}{5}\right)dy = \left(\tfrac{3}{20}y^2 + \tfrac{1}{5}y\right)\Big|_0^2 = 1.$
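The same numerical check applies here (again a scipy sketch of mine, with `dblquad` taking its integrand as $f(y, x)$):

```python
from scipy.integrate import dblquad, quad

f = lambda y, x: (3/5) * x * (y + x)   # the joint pdf on (0,1) x (0,2)

# x over (0, .5), y over (0, 1)
p, _ = dblquad(f, 0, 0.5, lambda x: 0, lambda x: 1)
print(p)       # 0.0625 = 1/16

# The marginal f_Y(y) = 3y/10 + 1/5 should integrate to 1 over (0, 2).
total, _ = quad(lambda y: 3*y/10 + 1/5, 0, 2)
print(total)   # 1.0
```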

Example

Given the joint pdf
$$f(x, y) = \begin{cases} 24xy, & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\ 0, & \text{otherwise} \end{cases}$$
Let $A = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le .5\}$. Then
$$P[(X, Y) \in A] = \int_0^{.5}\!\!\int_0^{.5 - x} 24xy\,dy\,dx = .0625.$$

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^{1-x} 24xy\,dy = 12x(1 - x)^2, \quad 0 \le x \le 1.$$
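The triangular support only changes the inner limit: for fixed $x$, $y$ runs from 0 to $.5 - x$. In the same scipy sketch style:

```python
from scipy.integrate import dblquad

f = lambda y, x: 24 * x * y   # the joint pdf, nonzero only where x + y <= 1

# x over (0, .5); for each x, y over (0, .5 - x).
p, _ = dblquad(f, 0, 0.5, lambda x: 0, lambda x: 0.5 - x)
print(p)   # 0.0625
```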

Independent rv's

Two rv's X and Y are said to be independent if for every pair $(x, y)$, $p(x, y) = p_X(x)\,p_Y(y)$ when X and Y are discrete, or $f(x, y) = f_X(x)\,f_Y(y)$ when X and Y are continuous.

Meals and movies example: $p(0, 0) = .05 \ne p_X(0)\,p_Y(0) = .35 \times .17 = .0595$, so X and Y are dependent.

Example*: $f_X(x)\,f_Y(y) = \tfrac{2}{3}(x + 1) \cdot \tfrac{1}{3}(1 + 4y) \ne f(x, y)$, so X and Y are dependent.
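For discrete rv's, the definition can be checked exhaustively: compare $p(x, y)$ with $p_X(x)\,p_Y(y)$ at every support point. A sketch reusing `pmf`, `p_X`, and `p_Y` from the earlier meals-and-movies snippet:

```python
# Independence requires p(x, y) == p_X(x) * p_Y(y) at every point.
independent = all(abs(p - p_X[x] * p_Y[y]) < 1e-9
                  for (x, y), p in pmf.items())
print(independent)   # False: e.g. p(0,0) = .05 but p_X(0) * p_Y(0) = .0595
```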

Expected values, covariance and correlation

Let X and Y be two rv's with joint pmf $p(x, y)$ or joint pdf $f(x, y)$. Then
$$E[h(X, Y)] = \begin{cases} \sum_x \sum_y h(x, y)\,p(x, y), & \text{if } X \text{ and } Y \text{ are discrete} \\ \iint h(x, y)\,f(x, y)\,dx\,dy, & \text{if } X \text{ and } Y \text{ are continuous} \end{cases}$$
The covariance between X and Y is
$$\operatorname{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \begin{cases} \sum_x \sum_y (x - \mu_X)(y - \mu_Y)\,p(x, y), & X, Y \text{ discrete} \\ \iint (x - \mu_X)(y - \mu_Y)\,f(x, y)\,dx\,dy, & X, Y \text{ continuous} \end{cases}$$
Proposition: $\operatorname{Cov}(X, Y) = E(XY) - \mu_X \mu_Y$.

Example

Meals and movies example, with the marginals appended to the table:

p(x, y)   y = 0   y = 1   y = 2   y = 3   p_X(x)
x = 0      .05     .12     .10     .08     .35
x = 1      .06     .12     .11     .09     .38
x = 2      .03     .06     .05     .05     .19
x = 3      .03     .02     .02     .01     .08
p_Y(y)     .17     .32     .28     .23

Note $\mu_X = \sum x\,p_X(x) = 1$ and $\mu_Y = \sum y\,p_Y(y) = 1.57$, and
$$E(XY) = \sum_x \sum_y xy\,p(x, y) = 1 \cdot 1 \cdot .12 + 1 \cdot 2 \cdot .11 + 1 \cdot 3 \cdot .09 + 2 \cdot 1 \cdot .06 + 2 \cdot 2 \cdot .05 + 2 \cdot 3 \cdot .05 + 3 \cdot 1 \cdot .02 + 3 \cdot 2 \cdot .02 + 3 \cdot 3 \cdot .01 = 1.5,$$
so $\operatorname{Cov}(X, Y) = E(XY) - \mu_X \mu_Y = 1.5 - 1 \times 1.57 = -.07.$
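Continuing the earlier sketch, the moments fall out of the same dictionary in one pass each:

```python
# Reusing `pmf` from the earlier snippet.
mu_X = sum(x * p for (x, y), p in pmf.items())       # 1.00
mu_Y = sum(y * p for (x, y), p in pmf.items())       # 1.57
e_XY = sum(x * y * p for (x, y), p in pmf.items())   # 1.50
print(e_XY - mu_X * mu_Y)                            # -0.07, up to float error
```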

Example

Given the joint pdf
$$f(x, y) = \begin{cases} 24xy, & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\ 0, & \text{otherwise} \end{cases}$$
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^{1-x} 24xy\,dy = 12x(1 - x)^2, \quad 0 \le x \le 1.$$
Similarly, $f_Y(y) = 12y(1 - y)^2$ for $0 \le y \le 1$, and $\mu_X = \mu_Y = \tfrac{2}{5}$.
$$E(XY) = \int_0^1\!\!\int_0^{1-x} xy \cdot 24xy\,dy\,dx = 8 \int_0^1 x^2(1 - x)^3\,dx = \tfrac{2}{15}.$$
Thus $\operatorname{Cov}(X, Y) = \tfrac{2}{15} - \left(\tfrac{2}{5}\right)\left(\tfrac{2}{5}\right) = -\tfrac{2}{75}.$
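The same cross-check works in the continuous case, integrating over the triangle (a scipy sketch of mine):

```python
from scipy.integrate import dblquad

# Integrands take (y, x); the support is the triangle 0 <= y <= 1 - x.
tri = (0, 1, lambda x: 0, lambda x: 1 - x)

e_XY, _ = dblquad(lambda y, x: x * y * 24 * x * y, *tri)
mu_X, _ = dblquad(lambda y, x: x * 24 * x * y, *tri)
mu_Y, _ = dblquad(lambda y, x: y * 24 * x * y, *tri)
print(e_XY - mu_X * mu_Y)   # about -0.0267 = -2/75
```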

The correlation coefficient of X and Y, denoted by $\operatorname{Corr}(X, Y)$, $\rho_{X,Y}$, or just $\rho$, is
$$\rho_{X,Y} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}.$$
Meals and movies example: $E(X^2) = 1^2 \cdot .38 + 2^2 \cdot .19 + 3^2 \cdot .08 = 1.86$, $\sigma_X^2 = 1.86 - 1^2 = .86$, $\sigma_X = .927$, and $\sigma_Y = 1.022$, so
$$\rho_{X,Y} = \frac{-.07}{.927 \times 1.022} \approx -.07$$
(a numerical check follows the proposition below).

Proposition:
1. If a and c are both positive or both negative, $\operatorname{Corr}(aX + b, cY + d) = \operatorname{Corr}(X, Y)$.
2. $-1 \le \rho \le 1$.
3. If X and Y are independent, then $\rho = 0$, but $\rho = 0$ does not necessarily imply independence.
4. $\rho = \pm 1$ iff $Y = aX + b$ for some $a \ne 0$.
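The promised check, continuing the dictionary sketch from earlier:

```python
from math import sqrt

# Reusing `pmf`, `mu_X`, `mu_Y`, `e_XY` from the earlier snippets.
var_X = sum(x**2 * p for (x, y), p in pmf.items()) - mu_X**2   # 0.86
var_Y = sum(y**2 * p for (x, y), p in pmf.items()) - mu_Y**2   # about 1.045
rho = (e_XY - mu_X * mu_Y) / (sqrt(var_X) * sqrt(var_Y))
print(rho)   # about -0.074, i.e. -.07 at the rounding used above
```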

Exercise

Suppose X and Y are independent with pdf's
$$f_X(x) = 3x^2,\ 0 \le x \le 1, \qquad f_Y(y) = 2y,\ 0 \le y \le 1.$$
Find $E\!\left(\frac{X}{Y}\right)$. Optional: find $P(X \ge Y)$.

By independence, the joint pdf of X and Y is $f(x, y) = f_X(x)\,f_Y(y) = 6x^2 y$ for $0 \le x \le 1$, $0 \le y \le 1$.
$$E\!\left(\frac{X}{Y}\right) = \int_0^1\!\!\int_0^1 \frac{x}{y}\,6x^2 y\,dx\,dy = 1.5.$$
$$P(X \ge Y) = \int_0^1\!\!\int_0^x 6x^2 y\,dy\,dx = .6.$$
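A Monte Carlo spot-check is possible here because both marginal cdf's invert cleanly: $F_X(x) = x^3$ and $F_Y(y) = y^2$, so $X = U^{1/3}$ and $Y = V^{1/2}$ for independent uniforms U, V. A sketch (sample size and seed are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
x = rng.random(n) ** (1/3)   # inverse-cdf sampling from f_X(x) = 3x^2
y = rng.random(n) ** 0.5     # inverse-cdf sampling from f_Y(y) = 2y

# X/Y is heavy-tailed (infinite variance), so the first estimate converges
# slowly; expect roughly 1.5, and almost exactly 0.6 for the second.
print((x / y).mean())    # about 1.5
print((x >= y).mean())   # about 0.6
```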

Exercise

Suppose the joint pmf of discrete rv's X and Y is given by $p(x, y) = kxy$ for $x = 1, 2$ and $y = 1, 2, 3$, where k is an unknown constant.
1) Find the proper value of k.
2) Find the marginal probability distributions of X and Y.
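Once you have worked part 1 by hand, a two-line sketch confirms the normalizing constant (k must make the probabilities sum to 1):

```python
from fractions import Fraction

# Sum xy over the support {1,2} x {1,2,3}; k is the reciprocal of the total.
total = sum(Fraction(x * y) for x in (1, 2) for y in (1, 2, 3))
print(1 / total)   # the required k, printed as an exact fraction
```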