Suppose I have a large set made up of smaller subsets, where each small subset contains some number of values. How would I find the mean and variance of the large set from the means and variances of the smaller subsets?


For example: Let subset A = [[1,3], [5, 7], [10, 19]]

Let's split subset A into two smaller subsets, B and C:

B = [[1, 3]] and C = [[5, 7], [10, 19]]

Suppose we know that the mean and variance of B are 2 and 1 respectively, and that the mean and variance of C are 10.25 and 28.6875 respectively.

How would we find the mean and variance of A from the mean and variances of B and C?

Thanks!

Omrii

1 Answer

You seem to be using the $\frac1n$ version of variance (the population variance, which divides by $n$ rather than $n-1$).

If $n_B$ and $n_C$ are the numbers of elements of $B$ and $C$, and $\mu_B, \mu_C, \sigma^2_B, \sigma^2_C$ are their means and variances, then:

  • $n_A=n_B+n_C$
  • $\mu_A=\frac1{n_A}(n_B \mu_B+n_C\mu_C)$
  • $\sigma^2_A=\frac1{n_A}(n_B (\sigma^2_B +\mu_B^2) +n_C (\sigma^2_C +\mu_C^2)) - \mu_A^2$
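
The three formulas above can be checked numerically. The following is a minimal Python sketch, not part of the original answer: the function name `combine_stats` and its parameter names are hypothetical, and it assumes the $\frac1n$ (population) variance throughout. It relies on the identity $\sigma^2 + \mu^2 = \frac1n\sum x_i^2$, which is why each subset contributes $n(\sigma^2 + \mu^2)$ to the combined sum of squares.

```python
from math import isclose
from statistics import fmean, pvariance


def combine_stats(n_b, mean_b, var_b, n_c, mean_c, var_c):
    """Combine counts, means and population (1/n) variances of two subsets.

    Hypothetical helper for illustration; the names are not from the answer.
    """
    n_a = n_b + n_c
    # Weighted average of the two subset means.
    mean_a = (n_b * mean_b + n_c * mean_c) / n_a
    # n * (variance + mean^2) is each subset's sum of squares,
    # so this is the mean of the squares over the whole set.
    mean_sq_a = (n_b * (var_b + mean_b**2) + n_c * (var_c + mean_c**2)) / n_a
    var_a = mean_sq_a - mean_a**2
    return n_a, mean_a, var_a


# Values from the question: B = [1, 3], C = [5, 7, 10, 19].
n_a, mean_a, var_a = combine_stats(2, 2.0, 1.0, 4, 10.25, 28.6875)

# Compare against a direct computation over all six values of A.
values_a = [1, 3, 5, 7, 10, 19]
assert n_a == len(values_a)
assert isclose(mean_a, fmean(values_a))      # both are 7.5
assert isclose(var_a, pvariance(values_a))   # both are 415/12, about 34.5833
```

For the question's data this gives $n_A = 6$, $\mu_A = 7.5$ and $\sigma^2_A = \frac{415}{12} \approx 34.58$. One caveat on the design: subtracting $\mu_A^2$ from the mean of squares can lose floating-point precision when the mean is very large compared with the spread, so this form is best suited to moderately scaled data.
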
Henry
  • I think you meant $n_C (\sigma^2_C +\mu_C^2)$ instead of $n_C (\sigma^2_B +\mu_B^2)$ for $\sigma^2_A$. That solution works like a charm. Thanks! – Omrii Jan 19 '21 at 23:44
  • @Omrii - You are correct - thank you - now edited – Henry Jan 20 '21 at 00:49