A question gives the mean and standard deviation of each of two groups and asks for the combined mean and standard deviation. I could not understand how the formula for the combined standard deviation connects with the general sample/population standard deviation formula.
Outline: For sample A, use the formula $S^2_A = \frac{1}{n_A-1} \left[\sum_A X_i^2 - n_A\bar X^2_A\right]$ along with $n_A$ and $\bar X_A$ to find $\sum_A X_i^2.$ Similarly for sample B. For the combined sample C: $\sum_C X_i^2 = \sum_A X^2_i +\sum_B X^2_i.$ Finally, use $S^2_C = \frac{1}{n_C-1} \left[\sum_C X_i^2 - n_C\bar X^2_C\right],$ where $n_C = n_A + n_B$ and $\bar X_C = \frac{1}{n_C} (n_A \bar X_A + n_B \bar X_B).$
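To connect this with the usual definition of the sample variance, it may help to expand the sum of squared deviations. The following is a short sketch for a generic sample of size $n$ with mean $\bar X$ (generic symbols, not tied to A, B, or C), using only the fact that $\sum_i X_i = n\bar X$:

$$\sum_{i=1}^n (X_i - \bar X)^2 = \sum_{i=1}^n X_i^2 - 2\bar X \sum_{i=1}^n X_i + n\bar X^2 = \sum_{i=1}^n X_i^2 - n\bar X^2.$$

Hence

$$S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2 = \frac{1}{n-1}\left[\sum_{i=1}^n X_i^2 - n\bar X^2\right], \qquad \text{so} \qquad \sum_{i=1}^n X_i^2 = (n-1)S^2 + n\bar X^2.$$

The same steps work for the population version, which divides by $n$ instead of $n-1$; in that case $\sum_i X_i^2 = n\sigma^2 + n\mu^2.$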
In case it helps, here are two samples generated in R statistical software, along with all of the quantities used above:
set.seed(4618); x.a = round(rnorm(10, 50, 3)); x.b = round(rnorm(20, 52,4))
x.a; length(x.a); mean(x.a); var(x.a); sum(x.a^2)
## 55 47 57 53 49 52 52 51 51 52
## 10 # size of first sample
## 51.9 # mean of first sample
## 7.877778 # variance of first sample
## 27007 # sum of squares of first sample
9*var(x.a) + 10*mean(x.a)^2   # recover sum of squares from n, mean, and variance
## 27007
x.b; length(x.b); mean(x.b); var(x.b); sum(x.b^2)
## 57 46 52 56 53 58 55 50 57 45 48 53 50 61 54 48 52 50 51 52
## 20
## 52.4
## 17.09474
## 55240
x.c = c(x.a, x.b) # combine samples
x.c; length(x.c); mean(x.c); var(x.c); sum(x.c^2)
## 55 47 57 53 49 52 52 51 51 52 57 46 52 56 53 58 55 50 57 45 48 53 50 61 54 48 52 50 51 52
## 30
## 52.23333
## 13.7023
## 82247
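As a check on the outline, here is a minimal sketch that uses only the sizes, means, and variances printed above, not the raw data. The variable names (`n.a`, `m.a`, `v.a`, and so on) are my own choices for illustration, and the variances are typed in as rounded in the printout, so the results should agree with `mean(x.c)` and `var(x.c)` only up to that rounding.

```r
# Summary statistics copied from the output above (rounded as printed)
n.a = 10;  m.a = 51.9;  v.a = 7.877778
n.b = 20;  m.b = 52.4;  v.b = 17.09474

# Recover each sample's sum of squares: sum(X^2) = (n-1)*S^2 + n*mean^2
ss.a = (n.a - 1)*v.a + n.a*m.a^2
ss.b = (n.b - 1)*v.b + n.b*m.b^2

# Combine: sizes add, sums of squares add, mean is the weighted average
n.c  = n.a + n.b
m.c  = (n.a*m.a + n.b*m.b)/n.c
ss.c = ss.a + ss.b

# Combined sample variance and standard deviation
v.c  = (ss.c - n.c*m.c^2)/(n.c - 1)
s.c  = sqrt(v.c)
m.c; v.c; s.c
```

Note that the combined variance is not a simple weighted average of the two variances: the term $n_C \bar X_C^2$ also accounts for the difference between the two group means.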
Boxplots of the three samples, where variable width suggests different sample sizes.
BruceET
