I'm trying to calculate the covariance for an example that I've created, using the covariance formula Cov(X,Y) = E(XY)-E(X)E(Y) as in this question, but I'm running into trouble.
In my example, I roll a 3-sided die 150 times and count how many times each side appears. In R I can repeat this experiment 1,000 times (150 rolls each) and show the first 3 results like so:
library(tidyverse)
set.seed(0)
m <- replicate(n = 1000, table(sample(c('X','Y','Z'), size = 150, prob = c(1/3,1/3,1/3), replace = TRUE))) %>% t()
m %>% head(3)
Which yields:
## X Y Z
## 48 42 60
## 42 59 49
## 54 45 51
I can compute the sample covariance matrix of the three counts like so:
cov(m)
Which yields:
## X Y Z
## X 31.89802 -16.08600 -15.81202
## Y -16.08600 31.47373 -15.38773
## Z -15.81202 -15.38773 31.19976
Now, I think:
- E(XY) = 2500
- E(X) = 50
- E(Y) = 50
... this gives me:
- Cov(X,Y) = E(XY)-E(X)E(Y) = 2500-(50*50) = 0
What am I doing wrong, and how do I calculate the covariance correctly? In the simulated output, each covariance looks to be about -1/2 times the corresponding variance (for example, -16.09 versus 31.90).
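For reference, each term of the formula can also be estimated directly from the simulated counts instead of being assumed. The following is a minimal sketch only: it rebuilds the simulation in base R (no tidyverse), and the names `sim`, `e_x`, `e_y`, `e_xy`, `cov_formula`, `cov_builtin` and `ratio` are illustrative and not part of the code above. Note that `cov()` divides by N - 1 while the plain averages divide by N, so the two results should agree only after rescaling by N / (N - 1).

```r
# Sketch: estimate each term of Cov(X,Y) = E(XY) - E(X)E(Y) from the simulation.
set.seed(0)
n_rep <- 1000  # number of repetitions of the 150-roll experiment

# Same simulation as above, written without the pipe: one row per repetition.
sim <- t(replicate(
  n = n_rep,
  table(sample(c("X", "Y", "Z"), size = 150, prob = c(1/3, 1/3, 1/3), replace = TRUE))
))

# Empirical estimates of the three expectations in the formula.
e_x  <- mean(sim[, "X"])
e_y  <- mean(sim[, "Y"])
e_xy <- mean(sim[, "X"] * sim[, "Y"])  # average of the per-repetition products

# Covariance from the formula (divides by N) and from cov() (divides by N - 1).
cov_formula <- e_xy - e_x * e_y
cov_builtin <- cov(sim[, "X"], sim[, "Y"])

# Ratio of the covariance to the variance of X, to check the "-1/2" observation.
ratio <- cov_builtin / var(sim[, "X"])

print(c(e_x = e_x, e_y = e_y, e_xy = e_xy))
print(c(cov_formula = cov_formula, cov_builtin = cov_builtin, ratio = ratio))
```

Comparing the printed `e_xy` with the 2500 I assumed above should show whether that assumption is where the calculation goes wrong.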