Suppose we are comparing maintenance cost (M$) per mile (m) for two car models over a 5-year period, where Mtype = M$/m is the annual cost per mile for a given type. There are far fewer type (a) cars than type (b) cars, and the number of type (b) cars keeps growing faster than type (a), so the mix keeps getting heavier toward type (b). Which of the following would be a more reliable comparison of the average maintenance cost of type (a) and type (b) cars, and why?

(1) The average of the annual costs per mile for each type, where Avg. M$a = (Ma1+Ma2+Ma3+Ma4+Ma5)/5 and Avg. M$b = (Mb1+Mb2+Mb3+Mb4+Mb5)/5, or

(2) The weighted 5-year averages (total cost divided by total miles), where WAvg. M$a = Sum(M$a1..M$a5)/Sum(ma1..ma5) and WAvg. M$b = Sum(M$b1..M$b5)/Sum(mb1..mb5).

If the weighted average is considered more reliable, why does it matter that there are more type (b) cars and that their number is increasing faster than type (a) cars, as long as the number of both types does not present small-sample problems?
1 Answer
It all depends on what you are trying to do. If you want to estimate the reputation of each car type, as determined by its maintenance costs, then the answer doesn't really depend on how many cars of each type you have, as long as the number isn't too small (say, around 50-100 or more). In that case you would treat each average equally, without any weights. However, if you want to evaluate something like the average maintenance cost over all cars, with both types included (say, to measure overall satisfaction with all the cars), then you'll probably want a weighted average.

As for the counts increasing or decreasing over time: if the only thing you care about is future 5-year averages, you can use a time series forecasting method to predict future years and then include those predictions in your calculations. The simplest forecasting method is to assume that the difference between last year and the year before will carry forward as the year-to-year difference in future years. Alternatively, you can fit a quadratic polynomial to your annual data if you suspect that costs are increasing or decreasing at a changing rate. If you don't have many years of data, though, you probably don't want to go beyond a quadratic or cubic fit when making predictions.
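To make the two kinds of average and the two forecasting ideas concrete, here is a minimal sketch in Python with NumPy. The yearly cost and mileage totals are made-up numbers for a single car type, chosen only for illustration, and the variable names are my own. It also shows that total cost divided by total miles is the same as the miles-weighted mean of the annual cost-per-mile figures, which is why it can differ from the simple average when mileage grows over time.

```python
import numpy as np

# Hypothetical yearly totals for ONE car type (illustrative numbers, not real data).
cost = np.array([1000.0, 1500.0, 2400.0, 3500.0, 5400.0])          # maintenance $ per year
miles = np.array([10000.0, 12000.0, 16000.0, 20000.0, 27000.0])    # fleet miles per year

# Annual cost per mile for each of the 5 years.
annual_rate = cost / miles

# (1) Simple average: every year counts equally.
simple_avg = annual_rate.mean()

# (2) Weighted (pooled) average: total cost over total miles.
pooled_avg = cost.sum() / miles.sum()

# The pooled figure is the miles-weighted mean of the annual rates,
# so high-mileage years pull it toward their rate.
weighted_avg = np.average(annual_rate, weights=miles)

# Simplest forecast: assume last year's change repeats next year.
naive_next = annual_rate[-1] + (annual_rate[-1] - annual_rate[-2])

# Quadratic fit to the annual rates, evaluated at year 6.
years = np.arange(1, 6)
coeffs = np.polyfit(years, annual_rate, deg=2)
quad_next = np.polyval(coeffs, 6)

print(f"simple average : {simple_avg:.4f} $/mile")
print(f"pooled average : {pooled_avg:.4f} $/mile")
print(f"naive forecast : {naive_next:.4f} $/mile")
print(f"quadratic fit  : {quad_next:.4f} $/mile")
```

In this made-up data the later years have both more miles and a higher cost per mile, so the pooled average comes out above the simple average. With only five points, the quadratic fit should be read as a rough extrapolation at best.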
-
Thanks for the answer. My thought was that either could be good depending on how you want to use the result, but I have been out of school a long time and am an engineer, not a math expert. I want to use the results to support a conclusion that (a) has a higher maintenance cost per mile than (b). If the weighted average (5-year total cost divided by 5-year total miles) is used, (a) has a 2% higher cost than (b), but based on the average of the annual costs for the 5 years, (a) has a 3% higher cost than (b). I have been told that 2% is the right answer. – Wayne Jun 28 '15 at 16:57
-
I am not ready to accept that 2% is more correct than 3% without understanding why. The data are not available by car, so trending or modeling at that level is not possible; only the total costs and miles by type were kept each year. Is there an argument for either answer being a more reliable predictor of the future, assuming the design and construction of the two types remain unchanged? – Wayne Jun 28 '15 at 17:02