This problem is from an engineering management textbook (Morse & Babcock, 5th ed.):

2005     $48k
2006     $64k
2007     $67k
2008     $83k

"What is the sales forecast for 2009, using the simple regression (least squares) method?"

The book works through an example (shown below and not to be confused with the above), but does not give enough information for me to understand what's going on. Hopefully it won't confuse my question, but I'll give the problem they worked through here also, so that I can get help understanding how they got their numbers. They have:

[Image: regression problem in book]

Table 3-2 data was:

2005  $1100
2006  $1300
2007  $1200
2008  $1600

In the formula for b in the book's example, I can't figure out where they got the parenthesized numbers, any of them! It couldn't have been by multiplying the sum of the X's by the sum of the Y's. So I can't possibly do the homework problem until I understand how they are doing the given example.

Any help is appreciated.

Update: here is my work on the homework question. Can anybody give me a bit of confirmation that I've done right?

[Image: my work on the homework question]

  • The quoted problem seems like a pedagogical nightmare on a number of levels, not the least of which is to suggest that fitting a simple linear regression to four data points is a reasonable idea in the first place. :) – cardinal Sep 21 '11 at 23:33
  • The formula for b you need here would look to be formula 14 here, while the formula for a would correspond to formula 28 in that link. But I agree with @cardinal, this seems to me a poor problem, pedagogically speaking. There is not even an indication to evaluate the goodness of fit... – J. M. ain't a mathematician Sep 21 '11 at 23:52
  • Does that mean the question in my book was unanswerable as given? The only difference between my book's formula and the one in the link (in my question) is that the x & y designations are d and i, and that those variables in "a" have a horizontal bar over them with a note saying "where D and I are the mean values of D and I, respectively, and indicate a summation from i=1 to n." The first D and I in the quote had horizontal bars over them too. – Captain Claptrap Sep 21 '11 at 23:55
  • Is it legal to take a picture of the page and post it? – Captain Claptrap Sep 22 '11 at 00:01
  • It's answerable, sure. @cardinal and me were just bemoaning the habit of mindlessly using linear fitting being implicitly pushed. BTW: it's just one page, so I think fair use covers it. Do post it. – J. M. ain't a mathematician Sep 22 '11 at 00:26
  • @ J. M. - just added pic – Captain Claptrap Sep 22 '11 at 00:45
  • Huh. I suppose the data in Table 3-2 were not very forthcoming? – J. M. ain't a mathematician Sep 22 '11 at 00:57
  • Oops, I deleted that when I revised the OP to include the pic. I'll add it and give it here: 2005: $1100, 2006: $1300, 2007: $1200, 2008: $1600 – Captain Claptrap Sep 22 '11 at 01:00
  • I am now quite confused. You gave data at the top of the post. Then you gave data at the bottom which doesn't match the one at the top? Which is which? – J. M. ain't a mathematician Sep 22 '11 at 01:14
  • @ J. M. - The data at the top is for the homework question. The rest is for the sample question given in the book that I cannot see how they did. If you can help me see how they did it, I can do the homework question. Sorry, maybe I shouldn't have included the homework question. – Captain Claptrap Sep 22 '11 at 01:20
  • Alright, I think I see it now. The regression is actually being performed with $D=a+b\cdot(I-2005)$. For instance, $(11\times 0+13\times 1+12\times 2+16\times 3)\times 100=8500$ and $1100+1300+1200+1600=5200$... – J. M. ain't a mathematician Sep 22 '11 at 01:34
  • @ J. M.- Would you consider looking at my answer also like I asked robjohn? I'll probably have it tomorrow night. – Captain Claptrap Sep 22 '11 at 01:52
  • I've updated my OP with my work on the answer. – Captain Claptrap Sep 23 '11 at 00:40
  • Huh... something went wrong in your computation for the intercept $a$. It was a good idea to "change units" such that you were dealing with the more manageable $(48,64,67,83)$, so kudos! Your problem started when you used the "scaled" result 10.8 along with the unscaled average 65.5k. Consistency of units is important! – J. M. ain't a mathematician Sep 23 '11 at 01:54
  • @J.M. - Thanks for pointing that out. I updated my answer in the OP. I've got a good feelin' I got this one down. Thanks! – Captain Claptrap Sep 23 '11 at 22:15
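
As a check on the unit-consistency point raised in the comments, here is the homework data worked by hand. This is only a sketch under two assumptions that are not stated in the book excerpt: years are coded as $I-2005$ (so $I=0,1,2,3$), and sales are kept in thousands of dollars throughout.

$$ \begin{align} b&=\frac{4(0\cdot48+1\cdot64+2\cdot67+3\cdot83)-(0+1+2+3)(48+64+67+83)}{4(0^2+1^2+2^2+3^2)-(0+1+2+3)^2}=\frac{4(447)-(6)(262)}{4(14)-(6)^2}=\frac{216}{20}=10.8\\ a&=\bar D-b\,\bar I=65.5-10.8(1.5)=49.3\\ D_{2009}&=a+b\cdot4=49.3+43.2=92.5 \end{align} $$

With everything in the same units, the fitted line gives roughly \$92.5k for 2009.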

1 Answer

Does this help?

$$ \begin{align} b&=\frac{n\sum(D_iI_i)-\sum I_i\sum D_i}{n\sum I_i^2-(\sum I_i)^2}\\ &=\frac{4(0\cdot1100+1\cdot1300+2\cdot1200+3\cdot1600)-(0+1+2+3)(1100+1300+1200+1600)}{4(0^2+1^2+2^2+3^2)-(0+1+2+3)^2}\\ &=\frac{4(8500)-(6)(5200)}{4(14)-(6)^2}\\ &=\frac{2800}{20}=140 \end{align} $$
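
If it helps to check the arithmetic, the sums above can be reproduced with a few lines of Python. This is only a sketch: the function name `least_squares_line` is made up for illustration, years are assumed to be coded as $I-2005$ as suggested in the comments, and the intercept is taken as the mean of $D$ minus the slope times the mean of $I$, which is how the asker describes the book's formula for a.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # Slope: the same quotient as in the displayed formula.
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    # Intercept: mean of y minus slope times mean of x (assumed form of the book's "a").
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


# Table 3-2 data from the book's worked example, with years coded as year - 2005.
years = [2005, 2006, 2007, 2008]
coded_years = [year - 2005 for year in years]
book_sales = [1100, 1300, 1200, 1600]

a, b = least_squares_line(coded_years, book_sales)
forecast_2009 = a + b * (2009 - 2005)
print(a, b, forecast_2009)
```

The same function can be called on the homework data, as long as the year coding and the sales units are kept consistent between the slope and the intercept.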

robjohn