I have very limited math skills so please forgive me for the question - if I am way off or otherwise.

I have a database of hundreds of thousands of records. Most fall within 30-60 days, but a few have zero and, in very rare cases, less. At the other end of the spectrum, I have cases where the day count runs into the hundreds and, in some cases, thousands.

I need to calculate (given absolute min and max and average) some kind of median value: basically the average, but without the extremes.

Is this what standard deviation does? Am I best just lopping off extremes on both ends and recalculating averages?

  • Perhaps post this on Cross Validated instead - http://stats.stackexchange.com/ . Whether or not you should ignore outliers depends on the situation; you might want to explain some of that situation in your question. – Ethan Bolker Sep 02 '16 at 00:50
  • OK thanks I wasn't even sure on where to start – Alex.Barylski Sep 02 '16 at 00:58
  • It is unclear what is being measured and averaged: number of days? What is a 'case'? Very few what have $0$ what? Does 'very few cases less' mean you have some negative values? Do you have access to the actual data, or are you given only min, max, and average? If you want a 'typical' value not much influenced by extremes, then consider the median. Another possibility is a 'trimmed mean' (for example, delete the highest 5% and lowest 5% of values, avg. the remaining 90%). I will try to give a more complete explanation if you can give a clearer statement of the problem, perhaps sample data. – BruceET Sep 02 '16 at 01:03
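
For illustration, the median and the 'trimmed mean' mentioned in the last comment could be sketched as follows. This is only a minimal sketch, assuming the raw day counts are available (not just min, max, and average). The helper name `trimmed_mean`, the 5% default, and the sample values in `days` are hypothetical and not taken from the question.

```python
import statistics

def trimmed_mean(values, trim_fraction=0.05):
    """Mean after dropping the lowest and highest trim_fraction of values.

    Hypothetical helper for illustration; trim_fraction=0.05 mirrors the
    'delete the highest 5% and lowest 5%' suggestion in the comment.
    """
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # count dropped from each end
    kept = ordered[k:len(ordered) - k]
    return statistics.mean(kept)

# Made-up day counts: mostly 30-60, with a zero and two large extremes.
days = [0, 31, 35, 38, 40, 42, 44, 45, 47, 50,
        52, 55, 58, 60, 61, 65, 70, 90, 400, 3000]

plain_mean = statistics.mean(days)      # pulled upward by 400 and 3000
typical = statistics.median(days)       # middle value, ignores extremes
trimmed = trimmed_mean(days, 0.05)      # average of the middle 90%
```

On data like this, the median and the trimmed mean both sit much closer to the bulk of the records than the plain mean does. Standard deviation, by contrast, measures spread rather than a typical value, so it would not by itself give an "average without extremes".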
