Often we wish to study multiple random variables that may depend on one another. The following introduces a multivariate version of the probability mass function, which describes probabilities such as $$P(X = x, Y = y)$$.

### Joint probability mass function

Suppose a box contains

• 4 red balls
• 3 white balls
• 2 blue balls

and we draw 2 balls, one at a time without replacement. Let

• $$R$$ be the number of red balls in our draw
• $$B$$ be the number of blue balls in our draw

Then using counting, we can find for example that $P(R = 0, B = 1) = \frac{{4 \choose 0}{2 \choose 1}{3 \choose 1}}{9 \choose 2}$ or more generally that $P(R = r, B = b) = \frac{{4 \choose r}{2 \choose b}{3 \choose 2-(r+b)}}{9 \choose 2}$ when $$r$$ and $$b$$ satisfy $$r \geq 0$$, $$b \geq 0$$, and $$r + b \leq 2$$. The function $m(r,b) = P(R = r, B = b)$ is called the joint probability mass function (or joint PMF) of $$R$$ and $$B$$. We can visualize it as a table or matrix whose $$(r,b)$$ entry is $$P(R = r, B = b)$$.

|           | $$B = 0$$ | $$B = 1$$ | $$B = 2$$ |
|-----------|-----------|-----------|-----------|
| $$R = 0$$ | 0.083     | 0.167     | 0.028     |
| $$R = 1$$ | 0.333     | 0.222     | 0.000     |
| $$R = 2$$ | 0.167     | 0.000     | 0.000     |

The R command that creates such a matrix is

```r
probs = matrix(c(3/36,  6/36, 1/36,
                 12/36, 8/36, 0,
                 6/36,  0,    0), nrow = 3, ncol = 3, byrow = TRUE)
print(probs)
##            [,1]      [,2]       [,3]
## [1,] 0.08333333 0.1666667 0.02777778
## [2,] 0.33333333 0.2222222 0.00000000
## [3,] 0.16666667 0.0000000 0.00000000
```
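As a check on the table, the same matrix can be generated directly from the counting formula $P(R = r, B = b) = \frac{{4 \choose r}{2 \choose b}{3 \choose 2-(r+b)}}{9 \choose 2}$ instead of being typed by hand. A sketch (the name `probs2` is ours, used only for comparison):

```r
# Build the joint PMF from the counting formula; choose(3, k) is 0 when
# k < 0, so impossible cases like (r, b) = (2, 2) automatically get 0.
r = 0:2
b = 0:2
probs2 = outer(r, b, function(r, b)
  choose(4, r) * choose(2, b) * choose(3, 2 - r - b) / choose(9, 2))
print(probs2)
```

The entries agree with the hand-typed `probs` matrix, and they sum to 1 as any PMF must.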

### Visualization

We can also make a 3D bar plot to visualize the joint PMF. We show this for introductory purposes, though 3D bar plots are not especially common in practice.

```r
library(latticeExtra)

r = 0:2
b = 0:2
probs = matrix(c(3/36,  6/36, 1/36,
                 12/36, 8/36, 0,
                 6/36,  0,    0), nrow = 3, ncol = 3, byrow = TRUE)

cloud(probs, data = NULL, type = "h",
      zlab = "prob",
      row.values = r, column.values = b,
      xlim = 0:4, ylim = 0:4,
      xlab = "r", ylab = "b",
      panel.3d.cloud = panel.3dbars, col.facet = 'grey',
      xbase = 0.4, ybase = 0.4, scales = list(draw = FALSE, col = 1),
      par.settings = list(axis.line = list(col = "transparent")))
```

### Questions

1. Can you use the joint PMF of $$R$$ and $$B$$ to find
    1. $$P(R = r)$$ for each $$r = 0, 1, 2$$?
    2. What about $$P(B = b)$$ for each $$b = 0, 1, 2$$?

Note. The PMFs $$P(R = r)$$ and $$P(B = b)$$ are called the marginals or marginal PMFs.

Hint. Think about summing values along rows or columns of our joint PMF matrix. For example, the sum $\sum_{r = 0}^2 P(R = r, B = b)$ gives one of the probabilities above. Which one?

2. To sum along the 2nd column, for example, of the joint PMF matrix, you would run the following command. How would you sum along the 2nd row?

```r
sum(probs[,2])
## [1] 0.3888889
```
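The same idea gives both marginal PMFs at once: `rowSums` collapses the matrix over $$b$$ and `colSums` collapses it over $$r$$. A sketch (the variable names `marginal_R` and `marginal_B` are ours):

```r
probs = matrix(c(3/36,  6/36, 1/36,
                 12/36, 8/36, 0,
                 6/36,  0,    0), nrow = 3, ncol = 3, byrow = TRUE)

marginal_R = rowSums(probs)  # P(R = r) for r = 0, 1, 2
marginal_B = colSums(probs)  # P(B = b) for b = 0, 1, 2
```

Each marginal is itself a PMF, so each vector sums to 1.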
3. Can you find $$E[R]$$ using the marginal PMF of $$R$$?

4. Later we’ll see that $$E[RB]$$ will act like a measure of whether $$R$$ and $$B$$ are independent random variables. Considering how we compute expectations of single random variables, how do you think we’ll compute things like $$E[RB]$$ or more generally $$E[f(R,B)]$$ where $$f(x,y)$$ is a given function?
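One reasonable guess, by analogy with the single-variable case: weight each value $$f(r, b)$$ by $$P(R = r, B = b)$$ and sum over the whole grid. A sketch for $$f(r, b) = rb$$ (the names `f_vals` and `ERB` are ours):

```r
probs = matrix(c(3/36,  6/36, 1/36,
                 12/36, 8/36, 0,
                 6/36,  0,    0), nrow = 3, ncol = 3, byrow = TRUE)
r = 0:2
b = 0:2

f_vals = outer(r, b)       # matrix of r * b values, laid out like probs
ERB = sum(f_vals * probs)  # sum of f(r, b) * P(R = r, B = b) over the grid
ERB
## [1] 0.2222222
```

Here only the $$(r, b) = (1, 1)$$ entry contributes, giving $$E[RB] = 8/36$$.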