The well-known formula for calculating the sum of squared errors (SSE) of a cluster is:

$$\mathrm{SSE} = \sum_{x \in C} (c - x)^2$$

where $c$ is the mean of the cluster $C$ and $x$ ranges over its observations.
But this formula gives the same result:

$$\mathrm{SSE} = \frac{1}{2m} \sum_{x \in C} \sum_{y \in C} (x - y)^2$$

where $m$ is the number of observations in the cluster $C$ and, for each $x$, $y$ runs over all of the observations.
For example, for the cluster {3, 7, 8}, the mean is c = 6 and:
Using the usual formula: (6-3)² + (6-7)² + (6-8)² = 14
Using the alternative formula: [ 1∕(2*3) ] × [ (3-3)² + (3-7)² + (3-8)² + (7-3)² + (7-7)² + (7-8)² + (8-3)² + (8-7)² + (8-8)²] = 14
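
As a quick numerical sanity check (not a proof), the two expressions can be compared in a few lines of Python. This is only a sketch: the function names `sse_from_mean` and `sse_pairwise` are made up for illustration, and the cluster is assumed to be a plain list of numbers.

```python
def sse_from_mean(values):
    """Usual formula: sum of squared deviations from the cluster mean c."""
    m = len(values)
    c = sum(values) / m
    return sum((c - x) ** 2 for x in values)


def sse_pairwise(values):
    """Alternative formula: 1/(2m) times the sum of (x - y)^2 over all ordered pairs."""
    m = len(values)
    return sum((x - y) ** 2 for x in values for y in values) / (2 * m)


cluster = [3, 7, 8]
print(sse_from_mean(cluster))  # 14.0
print(sse_pairwise(cluster))   # 14.0
```

Agreement on a handful of clusters only suggests that the identity holds; it does not replace the algebraic proof.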
Starting from the first formula, I'm trying to prove the alternative one, but I'm lost. Can someone help me with the proof?