Random Variables and Probability Distributions
IE231 - Lecture Notes 5
Mar 14, 2017

Some Preliminary Information

Scales of Measurement

Nominal scale: These are categorical values that have no relationship of order or rank among them (e.g. colors, species).

Ordinal scale: These are categorical values that have a relationship of order or rank among them (e.g. military ranks, competition results). However, the relative order has no defined magnitude (e.g. the champion can get 40 points, the runner-up 39 and third place 30).

Interval scale: There is a numerical order, but differences can only be defined over intervals, since there is no absolute minimum. We cannot compare values in relative terms. For instance, we cannot say 10 degrees Celsius is twice as hot as 5 degrees Celsius; what about -5 vs +5?

Ratio scale: A scale with an absolute minimum (e.g. if I have 50TL and my friend has 100TL, I can say that she has twice the money that I have). Height, weight and age are similar examples.

See more at https://en.wikipedia.org/wiki/level_of_measurement.

Infinity

The concept of infinity is very broad. For now, you just need to keep the distinction between countable and uncountable infinities in mind.

Countably infinite: 1, 2, 3, 4, ... (e.g. natural numbers, integers, rational numbers)

Uncountably infinite: the values cannot be listed one by one; between 1 and 1.01 there are 1.001, 1.0001, 1.00001, ... and infinitely many more (e.g. real numbers). How many real numbers are there between 0 and 1?

Descriptive Statistics

Here are brief descriptions of mean (expectation), median, mode, variance, standard deviation and quantile.

Mean: $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

Median: Let's say the $X_k$ are ordered from smallest to largest and there are $n$ values in the sample. $\mathrm{Median}(X) = X_{(n+1)/2}$ if $n$ is odd and (usually) $\mathrm{Median}(X) = \frac{X_{n/2} + X_{n/2+1}}{2}$ if $n$ is even.

Quantile: On an ordered list of values, the quantile ($\alpha$) provides the $(\alpha n)$-th smallest value of the list. For instance, if $\alpha = 70\% = 0.7$, the quantile value is the 7th smallest value in a list of 10 values. $\alpha = 1$ means the maximum. The quantile is an important parameter, especially in statistics.

Mode: The $X_k$ with the highest frequency in the sample. In a sample of (1, 2, 2, 3, 4, 5), 2 is the mode.

Variance: $V(X) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$

Standard Deviation: $\sigma(X) = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
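
The quantile definition above (the $(\alpha n)$-th smallest value) can be sketched directly in R. This is only an illustration: the helper name `quantile_by_rank`, the sample values, and the use of `ceiling()` when $\alpha n$ is not an integer are assumptions made here, not part of the notes. R's built-in `quantile()` interpolates between values by default, so its results may differ from this rank-based version.

```r
# Hypothetical helper: returns the (alpha * n)-th smallest value of a vector.
quantile_by_rank <- function(values, alpha) {
  sorted <- sort(values)
  n <- length(sorted)
  # Subtract a tiny tolerance to absorb floating-point noise in alpha * n,
  # round up for non-integer ranks, and use rank 1 when alpha = 0.
  rank <- max(1, ceiling(alpha * n - 1e-9))
  sorted[rank]
}

# Made-up sample of 10 values (assumption, for illustration only)
example_values <- c(12, 3, 7, 25, 18, 9, 30, 1, 15, 21)

quantile_by_rank(example_values, 0.7)  # 7th smallest of the 10 values
quantile_by_rank(example_values, 1)    # the maximum
```

The tolerance inside `ceiling()` is a design choice to keep products such as `0.7 * 10` from being pushed to the next rank by rounding error.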

    set.seed(231)
    #Let's pick 10 values from the numbers between 1 and 50.
    numbers <- sample(1:50,10,replace=TRUE)
    #The sorted version of the numbers
    sort(numbers)
    ## [1] 1 9 15 16 16 18 26 31 32 35
    #The mean value of the numbers
    sum(numbers)/10
    ## [1] 19.9
    #or in R
    mean(numbers)
    ## [1] 19.9
    #Median of the numbers
    median(numbers)
    ## [1] 17
    #Quantile 7/9 of the numbers
    quantile(numbers,7/9)
    ## 77.77778%
    ## 31
    #Quantile 0 of the numbers (also the min)
    quantile(numbers,0)
    ## 0%
    ## 1
    #Quantile 1 of the numbers (also the max)
    quantile(numbers,1)
    ## 100%
    ## 35
    #No simple solution for mode in R
    freq_table<-table(numbers)
    freq_table
    ## numbers
    ##  1  9 15 16 18 26 31 32 35
    ##  1  1  1  2  1  1  1  1  1
    names(freq_table[which.max(freq_table)])
    ## [1] "16"
    #Sample variance of numbers
    sum((numbers - mean(numbers))^2)/(10-1)
    ## [1] 118.7667
    #For large n you can take n - 1 ~ n
    #in R
    var(numbers)
    ## [1] 118.7667
    #Sample standard deviation of numbers
    sqrt(sum((numbers - mean(numbers))^2)/(10-1))
    ## [1] 10.89801
    #in R
    sd(numbers)
    ## [1] 10.89801

Random Variables

Random variables are abstractions of uncertain events, so that we can generalize events in formal functions instead of explicitly enumerating the outcomes. For instance, assume X is the number of tails in 2 coin tosses. X can take the values 0, 1 and 2. X is a discrete random variable.

$P(X = 0) = P(\{H, H\}) = 1/4$
$P(X = 1) = P(\{H, T\}, \{T, H\}) = 2/4$
$P(X = 2) = P(\{T, T\}) = 1/4$

There are also continuous random variables. Continuous random variables are usually defined over intervals instead of individual values. For instance, define Y as any real number between 0 and 1, where all values within the interval are equally probable (i.e. uniform distribution).

$P(Y \le 0.25) = 1/4$
$P(Y \le 0.5) = 2/4$
$P(Y \le 0.75) = 3/4$

Fundamental Concepts

There are several fundamental concepts to keep in mind.

Probability Mass Function (pmf): The pmf is the point probability for discrete distributions (i.e. $f(x) = P(X = x)$). For instance, for a single coin toss $P(X = H) = 1/2$ and $P(X = T) = 1/2$. The point probabilities sum to one: $\sum_{i=1}^{n} f(x_i) = 1$.

Probability Density Function (pdf): The pdf gives interval probabilities for continuous distributions (i.e. $P(a < X < b) = \int_a^b f(x)\,dx$). Since almost all point probabilities in continuous distributions are 0 (due to infinity), probabilities are defined over intervals. The density integrates to one: $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

[Figure 1]

[Figure 2]

Cumulative Distribution Function (cdf): The cdf is the cumulative probability of all values smaller than or equal to x (i.e. $F(x) = P(X \le x)$). For the coin toss example, the cdf at 2 is the probability of two or fewer tails, $P(X \le 2)$. The main relationship between the cdf and the pdf is $F(a) = P(X \le a) = \int_{-\infty}^{a} f(x)\,dx$.

Expected Value (E[X]): The expected value of a probability distribution is calculated as follows.

$\mu = E[X] = \sum_{i=1}^{n} x_i f(x_i)$ for discrete distributions.

$\mu = E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$ for continuous distributions.

Example: Calculate the expected value of the number of tails in two coin tosses.

$E[X] = \sum_{i=1}^{n} x_i f(x_i) = 0 \cdot P(X = 0) + 1 \cdot P(X = 1) + 2 \cdot P(X = 2)$
$= 0 \cdot 1/4 + 1 \cdot 1/2 + 2 \cdot 1/4$
$= 1$

Variance (V(X)): Variance is calculated as follows.

$V(X) = E[(X - \mu)^2] = \sum_{i=1}^{n} (x_i - \mu)^2 f(x_i)$ for discrete distributions.

$V(X) = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$ for continuous distributions.

Variance can also be calculated as $V(X) = E[X^2] - (E[X])^2$.

Some Discrete Distributions

Bernoulli Distribution

It can also be called the single coin toss distribution. For a single event with probability of success $p$ and failure $q = 1 - p$, the distribution is called Bernoulli.

pmf: $f(x = 0; p) = q$, $f(x = 1; p) = p$
$E[X] = 0 \cdot (1 - p) + 1 \cdot p = p$
$V[X] = pq$

Example: Coin Toss

$p = 0.5$, $q = 1 - p = 0.5$
pmf: $f(x = 0) = 0.5$, $f(x = 1) = 0.5$
$E[X] = 0 \cdot (1 - 0.5) + 1 \cdot 0.5 = 0.5$
$V(X) = 0.5 \cdot 0.5 = 0.25$

Binomial Distribution

Think of multiple Bernoulli trials (e.g. several coin tosses).

pmf: $f(x; p, n) = \binom{n}{x} p^x q^{n - x}$
$E[X] = np$
$V(X) = npq$
cdf: $F(X \le x) = \sum_{i=0}^{x} f(i)$

Example: Multiple Coin Tosses (5 coins, p = 0.5)

pmf: $f(x = 3; n = 5) = \binom{5}{3} (0.5)^3 (1 - 0.5)^{5 - 3} = 0.3125$

    #R way
    #(d)ensity (binom)ial
    dbinom(x=3,size=5,prob=0.5)
    ## [1] 0.3125

$E[X] = 5 \cdot 0.5 = 2.5$
$V(X) = 5 \cdot 0.5 \cdot 0.5 = 1.25$
cdf: $F(X \le 3; n = 5) = \sum_{i=0}^{3} f(i) = 0.8125$

    #R way
    pbinom(q=3,size=5,prob=0.5)
    ## [1] 0.8125

Multinomial Distribution

Now suppose there is not one probability ($p$) but there are many probabilities ($p_1, p_2, \dots, p_k$).

pmf: $f(x_1, \dots, x_k; p_1, \dots, p_k; n) = \binom{n}{x_1, \dots, x_k} p_1^{x_1} \cdots p_k^{x_k}$

where $\binom{n}{x_1, \dots, x_k} = \frac{n!}{x_1! \cdots x_k!}$, $\sum_{i=1}^{k} x_i = n$ and $\sum_{i=1}^{k} p_i = 1$.

Example: Customers of a coffee shop prefer Turkish coffee with probability 0.4, espresso with probability 0.25 and filter coffee with probability 0.35. What is the probability that out of the first 10 customers, 3 will prefer Turkish coffee, 5 will prefer espresso and 2 will prefer filter coffee?

$f(3, 5, 2; 0.4, 0.25, 0.35; 10) = \binom{10}{3, 5, 2} \cdot 0.4^3 \cdot 0.25^5 \cdot 0.35^2 = 2520 \cdot 7.65625 \times 10^{-6} \approx 0.0193$

    #Explicit form
    factorial(10)/(factorial(3)*factorial(5)*factorial(2))*0.4^3 * 0.25^5 * 0.35^2
    ## [1] 0.01929375
    #Density multinomial
    dmultinom(x=c(3,5,2),prob=c(0.4,0.25,0.35))
    ## [1] 0.01929375

The binomial distribution is a special case of the multinomial distribution.
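
The closing remark, that the binomial is a special case of the multinomial, can be checked numerically. The sketch below reuses the 5-coin example and treats "success" and "failure" as the two categories of a multinomial; the variable names are chosen here for illustration.

```r
# Binomial probability of 3 successes in 5 trials with p = 0.5
p_binom <- dbinom(x = 3, size = 5, prob = 0.5)

# The same event as a multinomial with two categories:
# 3 successes and 5 - 3 = 2 failures, with probabilities p and 1 - p
p_multinom <- dmultinom(x = c(3, 2), prob = c(0.5, 0.5))

# TRUE if both approaches agree (up to numerical tolerance)
all.equal(p_binom, p_multinom)
```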

Hypergeometric Distribution

The hypergeometric distribution can be used when the population is divided in two, such as defective/nondefective, white/black, Ankara/Istanbul. Suppose there are a total of $N$ items, $k$ of them from group 1 and $N - k$ of them from group 2. We want to know the probability of getting $x$ items from group 1 and $n - x$ items from group 2 in a sample of $n$ items.

pmf: $f(x, n; k, N) = \frac{\binom{k}{x} \binom{N - k}{n - x}}{\binom{N}{n}}$
$E[X] = \frac{nk}{N}$
$V[X] = \frac{N - n}{N - 1} \cdot n \cdot \frac{k}{N} \left(1 - \frac{k}{N}\right)$

Example: Suppose we have a group of 20 people, 12 from Istanbul and 8 from Ankara. If we randomly select 5 people from it, what is the probability that 1 of them is from Ankara and 4 of them are from Istanbul?

$f(1, 5; 8, 20) = \frac{\binom{8}{1} \binom{20 - 8}{5 - 1}}{\binom{20}{5}} = 0.256$

    #Explicit form
    x=1
    n=5
    k=8
    N=20
    (choose(k,x)*choose(N-k,n-x))/choose(N,n)
    ## [1] 0.255418
    #Density hypergeometric, see ?dhyper for explanations
    dhyper(x=1,m=8,n=12,k=5)
    ## [1] 0.255418

Negative Binomial Distribution

The negative binomial distribution answers the question "What is the probability that the k-th success occurs on the n-th trial?". Unlike the binomial case, we fix the last attempt as a success.

pmf: $f(x; p, n) = \binom{n - 1}{x - 1} p^x q^{n - x}$, where $x$ is the number of successes (the $k$ in the question above) and $n$ is the trial on which the last success occurs.

Example: Suppose I'm repeatedly tossing coins. What is the probability that the 3rd Heads comes on the 5th toss?

$f(3; 0.5, 5) = \binom{5 - 1}{3 - 1} \cdot 0.5^3 \cdot 0.5^{5 - 3} = 0.1875$

    #Explicit form
    choose(5-1,3-1)*0.5^3*0.5^(5-3)
    ## [1] 0.1875
    #Binomial way
    dbinom(3-1,5-1,0.5)*0.5
    ## [1] 0.1875
    #Negative binomial way
    dnbinom(x=5-3,size=3,prob=0.5)
    ## [1] 0.1875
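
As a rough sanity check of the negative binomial pmf, the sketch below evaluates it for every possible trial number and confirms that the probabilities add up to (numerically) one. The helper name `nb_pmf` and the cut-off at 200 trials are assumptions made here for illustration; the probability mass beyond that point is negligible for p = 0.5.

```r
# Hypothetical helper implementing the pmf above:
# probability that the x-th success occurs on trial n
nb_pmf <- function(n, x, p) {
  choose(n - 1, x - 1) * p^x * (1 - p)^(n - x)
}

trials <- 3:200  # the 3rd success cannot occur before trial 3
probs <- nb_pmf(trials, x = 3, p = 0.5)

# Should be (numerically) 1: the 3rd Heads eventually arrives
sum(probs)

# dnbinom() counts the failures before the x-th success, i.e. n - x
all.equal(probs, dnbinom(x = trials - 3, size = 3, prob = 0.5))
```

Note the parameterization: `dnbinom()` takes the number of failures rather than the trial number, which is why `trials - 3` is passed as `x`.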

Geometric Distribution

The geometric distribution answers "What is the probability that the first success comes on the n-th trial?"

pmf: $f(n; p) = q^{n - 1} p$
$E[X] = 1/p$
$V[X] = \frac{1 - p}{p^2}$
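
The geometric formulas can be checked numerically in the same spirit as the earlier examples. The sketch below assumes a fair coin (p = 0.5) and truncates the infinite sums at 200 trials; both choices, as well as the variable names, are made here for illustration only.

```r
p <- 0.5
n <- 1:200  # trial on which the first success occurs

# pmf from the formula above: q^(n - 1) * p
geom_probs <- (1 - p)^(n - 1) * p

# dgeom() counts the failures before the first success, i.e. n - 1
all.equal(geom_probs, dgeom(x = n - 1, prob = p))

# Numerical mean and variance, to compare with 1/p and (1 - p)/p^2
geom_mean <- sum(n * geom_probs)
geom_var <- sum((n - geom_mean)^2 * geom_probs)
c(mean = geom_mean, variance = geom_var)
```

For p = 0.5 both 1/p and (1 - p)/p^2 equal 2, so the two numbers printed at the end should both be very close to 2.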