Expectations for Discrete Distributions
The definition of the expected value for a finite sample space with outcomes x1, x2, x3, ..., xn, with probabilities p1, p2, p3, ..., pn, respectively, is:
x1*p1 + x2*p2 + x3*p3 + ... + xn*pn
For example, a Bernoulli trial with probability p of generating a 1, and probability (1-p) of generating a 0, has expected value (or mean)
0*(1-p) + 1*p = p
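The definition can be sketched as a small R helper. The name `expected_value` and its argument names are illustrative assumptions, not part of base R; the function simply checks that the probabilities are valid and returns the weighted sum.

```r
# Hypothetical helper (not part of base R): expected value of a finite
# discrete distribution with outcomes x and probabilities p.
expected_value <- function(x, p) {
  stopifnot(length(x) == length(p),        # one probability per outcome
            all(p >= 0),                   # probabilities are non-negative
            isTRUE(all.equal(sum(p), 1)))  # and sum to 1 (up to rounding)
  sum(x * p)
}

# Bernoulli trial with success probability 0.3: outcomes 0 and 1
prob <- 0.3
bernoulli_mean <- expected_value(c(0, 1), c(1 - prob, prob))
bernoulli_mean
```

The result should equal the chosen success probability, matching the formula above.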
Example 4.23 of the Moore & McCabe text "Introduction to the Practice of Statistics" reports that the distribution of sizes of U.S. families is:
size:  2     3     4     5     6     7
prob:  .413  .236  .211  .090  .032  .018
To use R to compute the expected (i.e. average) size of a family from this table, first create two data vectors, then multiply and sum:
size <- 2:7
p <- c(.413, .236, .211, .090, .032, .018)
sum(p*size)
R multiplies two vectors element by element, so it computes 2*.413, 3*.236, etc. The sum function then adds the products together and returns 3.146. Note that the Moore & McCabe text reports that it has ignored families of 8 or more members, so this is not the true mean family size (which is about 3.17).
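To see the element-by-element products before they are summed, it may help to print them. The sketch below also checks that the tabulated probabilities sum to 1 and, as one possible cross-check, uses base R's `weighted.mean`. The variable names `products` and `check` are illustrative.

```r
size <- 2:7
p <- c(.413, .236, .211, .090, .032, .018)

sum(p)                 # the tabulated probabilities should total 1

products <- p * size   # element-by-element: 2*.413, 3*.236, ...
products
sum(products)          # same value as sum(p*size)

# Cross-check: mean of the sizes weighted by their probabilities
check <- weighted.mean(size, p)
check
```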
We can also compute the expected value of a probability distribution for which we have a density function in R, such as the Binomial. Suppose we would like to numerically compute the expected value of a random variable that has the Binomial(10,.2) distribution generated by counting the number of successes in 10 independent Bernoulli trials each with probability .2 of success:
k <- 0:10
p <- dbinom(k,10,.2)
sum(k*p)
We could also verify this result by making use of the fact that a Binomial(10,.2) random variable is just the sum of 10 Bernoulli random variables each with probability .2 of returning a 1, and the fact that the expected value of a sum of random variables is the sum of their expected values. Thus the mean of the Binomial(10,.2) distribution must be 10*(.2) = 2.
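Both routes to the mean can be compared in R, and a simulation gives a rough third check. This is only a sketch: the seed and the number of simulated draws are arbitrary choices, and the simulated mean will be close to 2 rather than exactly equal to it.

```r
# Exact mean from the probability table
k <- 0:10
exact <- sum(k * dbinom(k, 10, .2))

# Linearity of expectation: ten Bernoulli(.2) means added together
by_linearity <- sum(rep(.2, 10))

# Simulation: average of many Binomial(10,.2) draws (seed chosen arbitrarily)
set.seed(1)
simulated <- mean(rbinom(100000, 10, .2))

c(exact = exact, by_linearity = by_linearity, simulated = simulated)
```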
Suppose that we are tossing a fair coin repeatedly and counting the number of tails we observe before we get the first heads. Let X be the number of tails we observe; X has a geometric distribution with p = 1/2. Here is a small table showing the first few terms:
k:         0    1    2    3     4     5     6      7      8 ...
P(X=k):    1/2  1/4  1/8  1/16  1/32  1/64  1/128  1/256  1/512 ...
products:  0    1/4  1/4  3/16  1/8   5/64  3/64   7/256  1/64 ...
The expected value is the sum of the terms in the products row, which has infinitely many terms. One way to proceed is simply to pick some large number of terms, such as 100 or 1000, compute the geometric probabilities, and sum the products:
k <- 0:100
p <- dgeom(k,.5)
sum(k*p)
This works well here because the terms decrease rapidly in magnitude, so we get an essentially exact sum from a fairly small number of terms. What is the smallest number of terms for which R returns the correct answer 1.0? Try a couple of other geometric distributions, i.e. different values of p, such as 1/3, 1/4, etc., and see if you can guess the pattern: what is the expected value of a geometric random variable with probability p? It should be clear after a few examples.
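One way to organize the suggested experiment is to wrap the truncated sum in a function and apply it to several values of p. The function name `truncated_mean` and the cutoff of 1000 terms are assumptions made for this sketch. The cutoff is generous for the probabilities tried here, but would need to grow for p very close to 0.

```r
# Hypothetical helper: approximate E(X) for a geometric random variable by
# summing k * P(X = k) over k = 0, 1, ..., n_terms
truncated_mean <- function(p, n_terms = 1000) {
  k <- 0:n_terms
  sum(k * dgeom(k, p))
}

probs <- c(1/2, 1/3, 1/4, 1/5)
means <- sapply(probs, truncated_mean)

# Side-by-side table to help spot the pattern
data.frame(p = probs, expected_value = means)
```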
Here is a function that computes the expected value in a different way,
summing terms until the total converges to a fixed value. It reports
the number of terms it used. See if you can understand how it works!
Egeom <- function(p){
  #
  # usage: Egeom(p) computes the expected value of a geometric
  # random variable with probability p of success on each trial
  #
  e <- 0
  e0 <- -1
  i <- 0
  while(e != e0) {
    e0 <- e
    i <- i+1
    e <- e + i*dgeom(i,p)
  }
  cat(i,"terms \n")
  cat("expected value =",e,"\n")
  e
}
Egeom(1/3)
For further amusement, try computing the expected values of Poisson or negative binomial random variables.
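As a starting point for that exercise, the same truncated-sum idea applies to `dpois` and `dnbinom`. The parameter values (rate 3 for the Poisson; 3 successes with success probability .5 for the negative binomial) and the cutoffs are arbitrary choices for this sketch. Note that R's `dnbinom` counts the number of failures before the target number of successes.

```r
# Poisson with rate 3: sum k * P(X = k) over a generous range of k
k <- 0:200
pois_mean <- sum(k * dpois(k, 3))
pois_mean

# Negative binomial: failures before the 3rd success, success probability .5
k <- 0:1000
nbinom_mean <- sum(k * dnbinom(k, size = 3, prob = .5))
nbinom_mean
```

Trying a few other parameter values should suggest a formula for each mean, just as with the geometric case.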
The variance of a random variable is just the expected value of the squared deviation from the mean: E((X-E(X))^2). Returning to Example 4.23 of the Moore & McCabe text:
size:  2     3     4     5     6     7
prob:  .413  .236  .211  .090  .032  .018
To use R to compute the variance of the size of a family from this table, repeat the mean calculation, then compute the weighted mean of the squared deviations from the mean:
size <- 2:7
p <- c(.413, .236, .211, .090, .032, .018)
M <- sum(p*size)
V <- sum(p*(size-M)^2)
If the family sizes were all equally likely (i.e. each of the values had probability 1/6), would the variance be larger or smaller?
Try computing the variance of a Binomial(10,.2) random variable. You should get 1.6.
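A sketch of one way to do this, reusing the pattern from the family-size example. It also computes the equivalent shortcut E(X^2) - E(X)^2 as a cross-check; the variable names are illustrative.

```r
k <- 0:10
p <- dbinom(k, 10, .2)

M <- sum(k * p)            # mean of the distribution
V <- sum(p * (k - M)^2)    # expected squared deviation from the mean
V

# Equivalent shortcut: E(X^2) - (E(X))^2
V_shortcut <- sum(k^2 * p) - M^2
V_shortcut
```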