
I'm trying to run a multivariate regression in which not all variables are independent, and am not sure if this is possible.

The reason is as follows. Let's say we have a large number of contracts, and we want to know how likely they are to default, based on a number of factors (age of the person, etc.).

I have a history of contracts, and I want to run a regression on them. Three of the possible variables are: date of origin, date of default, and age of contract at default.

Now clearly age is a function of the dates of origin and default, but I don't know whether a contract defaulted because "it was 5 years old" or because it was 2008, which was a bad year.

Can I run them as if they were independent? If not, what is the best practice?

Many thanks!

  • If I understood you right, in your 3 variables of: "Date of origin Date of default Age of contract at default", it is possible to write out: Age = Date of Default - Date of Origin? If this is true, then you cannot run that regression with all 3 variables. One is going to get dropped. – Greg Dec 14 '14 at 20:53
  • Thanks Greg, that is exactly the case. How would it be possible to tell which of those 3 is relevant? Any suggestions? – sapo_cosmico Dec 14 '14 at 20:58
  • Is your data only limited to defaults, or does it also include contracts still ongoing? – Greg Dec 14 '14 at 23:47
  • It is all defaults, and data on how much was recovered per year. Did that answer the question? – sapo_cosmico Dec 14 '14 at 23:49
  • Yes. I think you can solve the problem with interactions. Include in the regression: Year of Origin, Year of Origin x Age, Year of Default, Year of Default x Age. (Note: The fact that your data only includes defaults will lead to selection bias in your estimates, but that is a problem for another day). – Greg Dec 14 '14 at 23:51
  • Wonderful, I'll try that, thanks Greg! – sapo_cosmico Dec 14 '14 at 23:53
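
To make the point in the comments concrete, here is a minimal sketch in plain Python. It uses made-up data (a hypothetical grid of origin years 2000 to 2010 and ages 1 to 8, not the asker's contracts) and a small helper, `matrix_rank`, written just for this illustration. It checks two things: that a design matrix with an intercept, date of origin, date of default and age has one redundant column because Age = Date of Default - Date of Origin, and that the columns Greg suggests (the two years plus their interactions with age) are not linearly dependent on this synthetic grid. It only checks linear dependence; it says nothing about whether the resulting estimates are meaningful, and it does not address the selection bias Greg mentions.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Exact rank of a matrix (list of rows) via Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a row at or below `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # column is a combination of earlier columns
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            if factor != 0:
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical contracts: every combination of origin year and age at default.
contracts = [
    (origin, origin + age, age)  # (year of origin, year of default, age)
    for origin in range(2000, 2011)
    for age in range(1, 9)
]

# Naive design: intercept, origin, default, age (4 columns).
naive = [[1, o, d, a] for o, d, a in contracts]

# Design from the comments: intercept, origin, origin x age,
# default, default x age (5 columns).
interacted = [[1, o, o * a, d, d * a] for o, d, a in contracts]

naive_rank = matrix_rank(naive)
interacted_rank = matrix_rank(interacted)

print("naive design: columns = 4, rank =", naive_rank)
print("interacted design: columns = 5, rank =", interacted_rank)
```

A rank below the number of columns means at least one column is an exact linear combination of the others, which is why regression software drops one of the three variables in the naive design.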
