Often we wish to study multiple random variables that may be interdependent. The following introduces a multivariate version of the probability mass function, which describes probabilities like \(P(X = x, Y = y)\).

Suppose a box contains

- 4 red balls
- 3 white balls
- 2 blue balls

and we draw 2 balls, one at a time without replacement. Let

- \(R\) be the number of red balls in our draw
- \(B\) be the number of blue balls in our draw

Then using counting, we can find for example that \[P(R = 0, B = 1) = \frac{{4 \choose 0}{2 \choose 1}{3 \choose 1}}{9 \choose 2}\] or more generally that \[P(R = r, B = b) = \frac{{4 \choose r}{2 \choose b}{3 \choose 2-(r+b)}}{9 \choose 2}\] when \(r \geq 0\), \(b \geq 0\), and \(r + b \leq 2\). The function \[m(r,b) = P(R = r, B = b)\] is called the **joint probability mass function** (or joint PMF) of \(R\) and \(B\). We can visualize it as a table or matrix whose \((r,b)\) entry is \(P(R = r, B = b)\).

| | \(B\) = 0 | \(B\) = 1 | \(B\) = 2 |
|---|---|---|---|
| \(R\) = 0 | 0.083 | 0.167 | 0.028 |
| \(R\) = 1 | 0.333 | 0.222 | 0.000 |
| \(R\) = 2 | 0.167 | 0.000 | 0.000 |

The R command that creates such a matrix is

```
probs = matrix(c(3/36, 6/36, 1/36,
                 12/36, 8/36, 0,
                 6/36, 0, 0),
               nrow = 3, ncol = 3, byrow = TRUE)
print(probs)
```

```
## [,1] [,2] [,3]
## [1,] 0.08333333 0.1666667 0.02777778
## [2,] 0.33333333 0.2222222 0.00000000
## [3,] 0.16666667 0.0000000 0.00000000
```
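Rather than typing the fractions by hand, we can also build the same matrix directly from the general formula above using `choose()`. This is a sketch for checking our work; the name `probs2` is just an illustrative choice. Note that `choose(3, k)` returns 0 in R whenever `k` is negative, so the impossible cases with \(r + b > 2\) get probability 0 automatically.

```
r = 0:2
b = 0:2
# (r, b) entry is choose(4, r) * choose(2, b) * choose(3, 2 - (r + b)) / choose(9, 2)
probs2 = outer(r, b, function(r, b) {
  choose(4, r) * choose(2, b) * choose(3, 2 - (r + b)) / choose(9, 2)
})
print(probs2)  # matches the matrix above
sum(probs2)    # the joint PMF sums to 1
</imports>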

We can also make a 3d bar plot to visualize the joint PMF. We do this for introductory purposes, but it’s not so common to come across 3d bar plots.

```
library(latticeExtra)
r = 0:2
b = 0:2
probs = matrix(c(3/36, 6/36, 1/36,
                 12/36, 8/36, 0,
                 6/36, 0, 0),
               nrow = 3, ncol = 3, byrow = TRUE)
cloud(probs, data = NULL, type = "h",
      zlab = "prob",
      row.values = r, column.values = b,
      xlim = 0:4, ylim = 0:4,
      xlab = "r", ylab = "b",
      panel.3d.cloud = panel.3dbars, col.facet = 'grey',
      xbase = 0.4, ybase = 0.4, scales = list(draw = FALSE, col = 1),
      par.settings = list(axis.line = list(col = "transparent")))
```

- Can you use the joint PMF of \(R\) and \(B\) to find
  - \(P(R = r)\) for each \(r = 0, 1, 2\)?
  - \(P(B = b)\) for each \(b = 0, 1, 2\)?

*Note*. The PMFs \(P(R = r)\) and \(P(B = b)\) are called the **marginals** or **marginal PMFs**. *Hint*. Think about summing values along rows or columns of our joint PMF matrix. For example, the sum \[\sum_{r = 0}^2 P(R = r, B = b)\] gives one of the probabilities above. Which one?
To sum along the 2nd column of the joint PMF matrix, for example, you would run the following command. How would you sum along the 2nd row?

`sum(probs[,2])`

`## [1] 0.3888889`
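R can sum every row and every column at once with `rowSums()` and `colSums()`, which gives both marginal PMFs in two calls. This sketch rebuilds the `probs` matrix so the block is self-contained.

```
probs = matrix(c(3/36, 6/36, 1/36,
                 12/36, 8/36, 0,
                 6/36, 0, 0),
               nrow = 3, ncol = 3, byrow = TRUE)
rowSums(probs)  # marginal PMF of R: P(R = 0), P(R = 1), P(R = 2)
colSums(probs)  # marginal PMF of B: P(B = 0), P(B = 1), P(B = 2)
```

As a sanity check, each marginal should itself sum to 1.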

Can you find \(E[R]\) using the marginal PMF of \(R\)?
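If you want to check your answer in R: the marginal PMF of \(R\) comes from the row sums, and \(E[R] = \sum_r r \, P(R = r)\), so one sketch is:

```
probs = matrix(c(3/36, 6/36, 1/36,
                 12/36, 8/36, 0,
                 6/36, 0, 0),
               nrow = 3, ncol = 3, byrow = TRUE)
r = 0:2
pR = rowSums(probs)  # marginal PMF of R
ER = sum(r * pR)     # E[R] = sum over r of r * P(R = r)
ER                   # 8/9
```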

Later we’ll see that \(E[RB]\) will act like a measure of whether \(R\) and \(B\) are independent random variables. Considering how we compute expectations of single random variables, how do you think we’ll compute things like \(E[RB]\) or more generally \(E[f(R,B)]\) where \(f(x,y)\) is a given function?
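Following the pattern for single random variables, a natural guess (and, as it turns out, the correct one) is to weight \(f(r,b)\) by the joint PMF: \(E[f(R,B)] = \sum_{r,b} f(r,b)\, P(R = r, B = b)\). A sketch for \(f(r,b) = rb\), where `outer(r, b)` builds the matrix of products \(r \cdot b\):

```
probs = matrix(c(3/36, 6/36, 1/36,
                 12/36, 8/36, 0,
                 6/36, 0, 0),
               nrow = 3, ncol = 3, byrow = TRUE)
r = 0:2
b = 0:2
# E[RB] = sum over (r, b) of r * b * P(R = r, B = b);
# only the (r = 1, b = 1) term contributes here
ERB = sum(outer(r, b) * probs)
ERB  # 8/36
```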