1

I have a data set that contains the following for 5 years (though I can get more years):

  • Total expenditure on equipment for staff
  • Total number of staff
  • Expenditure per staff member

The expenditure per staff member is a derived statistic arrived at by dividing the yearly expenditure by the total number of staff per year. If I know in advance how many more staff members we will have next year (or an average by how much it increases), is there a way to use the pre-existing data to estimate a new Expenditure per staff member, which I could multiply by the increased figure to derive how we would need to have available to spend on equipment?

tmnsnmt
  • 25
  • 1
    I suggest you figure out the "expenditure per staff member" for each year separately for at least $10$ years. Once you have these numbers. Plot them on a graph by year. How consistent is that graph? Is it consistent if you drop last year (quite frankly, the expenditure for either last year or this year are not likely to be good predictors of the future). If the numbers jump all other the place, you are not going to be able to get a good estimate. If they are consistent, you can get a good estimate. But be prepared for it to be too low anyway. – Paul Sinclair Oct 29 '21 at 19:09
  • Thanks. I plotted it, did the average percentage difference over ten years and got 2.89%. I guess that is consistent enough? How would I use this data to do my estimate? – tmnsnmt Oct 29 '21 at 22:02
  • 2
    Does it have an upward or downward trend? If not, then take the average value to get the expenditure per staff this year. If so, then use that trend to estimate what the expenditure per staff member will be. If you graphed this in a spreadsheet like Excel, then there will be an option in the graphing utility to add a trendline. Add it, and then also select the option to show the equation. Use the equation to calculate the expenditure for this year. Once you have that number, multiply it by your expected staff. – Paul Sinclair Oct 29 '21 at 23:02
  • Thanks for the explanation. I did as you explained and found the equation. It had an upward trend with R squared 0.48 and P 0.01 – what does this say about the trend? – tmnsnmt Oct 31 '21 at 22:01
  • 1
    $R^2$ is a measure of how well your data fits the trendline. It takes values between $0$ and $1$ If the graph shows points closely scattered around the trendline, then the data is really consistent, and you can expect the value calculated from the trendline equation to be a good predictor of future measurements. These will have an $R^2$ value close to $1$. On the other hand, if the data swings back and forth past the trendline, sometimes going much higher, sometimes much lower, then the $R^2$ value will be low. $0.48$ means the data values differ quite a bit from what the trendline predicts. – Paul Sinclair Nov 01 '21 at 00:17
  • You can still use the trendline equation to predict the value for next year, but expect that the actual cost to be significantly higher or lower than this prediction. But this doesn't match what you were saying earlier, so something may be wrong. Did you plot the data using an xy-scatter plot? Trendlines don't work that well with the various other graph styles Excel offers. And I'm not sure what "$P$" value you are referring to. – Paul Sinclair Nov 01 '21 at 00:38

0 Answers0