Let D be a (finite) set of data points. E.g., D might contain people’s heights, their exam results, or the temperature of an object at various times. Further, let m be D’s mean, and let s be D’s standard deviation. Then there’s a sense in which s is ‘symmetric’ around m: for one thing, values near m-s are usually considered just as ‘normal’ or ‘standard’ as values near m+s. For another thing, we can ‘rotate’ D around m without affecting s: if each d ∈ D is replaced with d+2(m-d), the resulting set will have the same standard deviation as the original set. (E.g., if m = 10 and 7 ∈ D, then when we ‘rotate’ D around m, 7 would be replaced with 7+2(10-7) = 13.) Thirdly, I’ve noticed the following (slightly more interesting) fact:

For each d ∈ D, we can find a number d' such that, if d is replaced with d' in D (while the rest of D stays the same), the resulting set D' will have the same standard deviation as the original set.

The number d' in question will usually be close to, but distinct from, d+2(m-d). The mean of the new set will usually be different from the mean of the original set. (I say ‘usually’ because if all elements of D are the same, then d = d' = m = d+2(m-d).) As one would expect, D' can be ‘rotated’ again without affecting the standard deviation, and each element of the rotated D' can again be replaced without affecting the standard deviation; and so forth.
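
To make both facts concrete, here is a minimal Python sketch, assuming the population standard deviation (`statistics.pstdev`) and made-up example data. The helper names `reflect` and `replacement` are hypothetical. The closed form used for d', namely d' = 2r - d with r the mean of the other elements of D, is not stated above; it is what one gets by setting the two variances equal and solving for d' ≠ d, so treat it as an assumption to check rather than a given.

```python
import statistics


def reflect(data, center):
    """'Rotate' the data: map every d to d + 2*(center - d)."""
    return [d + 2 * (center - d) for d in data]


def replacement(data, i):
    """Candidate d' for data[i]: reflect data[i] through the mean of the OTHER points.

    Assumes len(data) >= 2 (otherwise there are no other points to average).
    """
    rest = data[:i] + data[i + 1:]
    return 2 * statistics.fmean(rest) - data[i]


data = [7.0, 9.0, 10.0, 14.0]      # made-up values with mean 10
m = statistics.fmean(data)
s = statistics.pstdev(data)        # population standard deviation

rotated = reflect(data, m)         # every point reflected through m
d_new = replacement(data, 0)       # d' for d = 7
replaced = [d_new] + data[1:]      # D' = D with 7 swapped for d'

print("s(D)       =", s)
print("s(rotated) =", statistics.pstdev(rotated))
print("s(D')      =", statistics.pstdev(replaced))
print("d' =", d_new, "vs. d+2(m-d) =", data[0] + 2 * (m - data[0]))
print("mean(D') =", statistics.fmean(replaced))
```

With these values d' comes out as 15, while d+2(m-d) is 13, and the mean moves from 10 to 12. Under the assumed closed form the gap between d' and d+2(m-d) is 2(m-d)/(n-1) for n data points, which would explain why the two are close for larger sets.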

Now, my question is: are there other ways in which the standard deviation could be said to be ‘symmetric’, and is its ‘symmetry’ ever relevant in practice, when doing statistics or data analysis? Also, is there any theoretical interest or value in exploring the ‘symmetry’ of the standard deviation? (I’m very new to statistics, so please forgive me if the question has an obvious or boring answer.)

  • Note that visually $d \mapsto d + 2(m-d)$ corresponds to reflecting $d$ through $m$. – lisyarus Nov 13 '17 at 11:51
  • I'm not sure this comment is relevant to your question but as far as I understand you may find interesting the 'Datasaurus Dozen' dataset, where sets of points have pretty similar statistics but look essentially different. – Denis Korzhenkov Nov 13 '17 at 12:38
  • @Denis Korzhenkov I do find this interesting: thanks! My question arose when I tried to understand what the standard deviation actually measures. Most sources say it measures how the data points are ‘spread out’. Yet then I noticed the results I mentioned, and they challenge this interpretation at least a little bit: when d' is substituted for d, there’s an intuitive sense in which the data is now ‘spread’ differently, as the graph now looks different. So, it goes in the same direction as the Datasaurus. – MarkOxford Nov 13 '17 at 13:31
