0

I have a set with 10.000 equations like this below:

1*A1 + 1*A2 + 0*A3 + 0*A4 + 1*A5... 1*A800 = 1
0*A1 + 1*A2 + 1*A3 + 0*A4 + 0*A5... 0*A800 = 0
0*A1 + 0*A2 + 0*A3 + 1*A4 + 1*A5... 0*A800 = 1
0*A1 + 1*A2 + 1*A3 + 0*A4 + 1*A5... 1*A800 = 0

I need to find a solution to all A1, A2...A800 constants BUT there will be never an exact solution because of how the data above was colected (they are like probabilities, also the number of equations will always be much larger than the number of constants, so the chance of an exact solution is pretty low), I need an aproximated solution which should be optimal in such a way that if I replace the solution of all constants back to the equation above, the error (expected result - actual result) is the minimum.

So how is the logic behind this solution? After knowing the logic, I will program it in C++ to solve the equations or, if there is a better way of doing this in Excel, that would be nice too. I am a Mechanical Engineer and Programmer, and even having some knowledge on solving liner equations, finding optiminal solutions and such, I cant find a way out of this problem!

NOTE: all A1, A2... A800 must be always positive!

Samul
  • 127

2 Answers2

2

The usual way to solve overdetermined system is by least-squares. If your system is

$$Ax=b,$$ you form the so-called normal equations,

$$A^TAx=A^Tb$$ which form a square system (in your case $800\times800$, dense). You will obtain a solution that is optimal in the least-squares sense (minimum sum of the squared residues).

Anyway, due to the special form of the coefficients, maybe some other approach is better.


Update:

The last minute note (positiveness constraint) makes it a quite different and much harder problem. https://en.wikipedia.org/wiki/Non-negative_least_squares

  • This is a constrained problem. – Chrystomath Jul 14 '20 at 14:21
  • @Chrystomath: this was missing. –  Jul 14 '20 at 14:21
  • I know. I was writing the same answer. – Chrystomath Jul 14 '20 at 14:22
  • I thought I knew math untill you replied me! I only understood the first part Ax=b but didnt understand the remaining. How the square system would help me? Also, is my system of Ax=b? It's linear, but I dont know if it is like Ax=b – Samul Jul 14 '20 at 15:40
  • @Samul: hem, any linear system has the form $Ax=b$. This is the most general. –  Jul 14 '20 at 15:43
  • @YvesDaoust Thank you. I watched 2 videos on Youtube about LOS and read a little bit about it since your answer, but I still cant understand how that helps me solve my problem cause I dont understand how to transform Ax1 + Bx2 + Cx3... = y into y = A + bx (linear form). – Samul Jul 14 '20 at 15:51
  • @Samul: if you can't make the difference between $Ax=b$ and $y=A+bx$ (which is meaningless), I am not surprised that you don't understand my answer. –  Jul 14 '20 at 15:55
  • I am pretty sure what you pointed me, does not help at all. So far, the best I came was this https://www.fq.math.ca/Scanned/36-3/tang.pdf This PDF explains how to do EXACTLY what I want, but it assumes some things that I think are incompatible with my case. – Samul Jul 14 '20 at 16:03
  • @Samul: good luck. –  Jul 14 '20 at 16:05
0

After searching a lot about the subject, I came to the conclusion that regression linear would not solve this specific problem using OLS (at least not computationaly efficient), for example.

The best method I found is using iterative method and choose a method that converges. I found that the The Jacobi Method works pretty well for solving a set of linear equations. And a "plus" side of Jacoby Method is that you can "choose" how long you wanna run the algorithm according to how satisfied you are with the results. If you reach a point where the error of the tried values is getting enoughly small, then you can stop!

source: https://college.cengage.com/mathematics/larson/elementary_linear/5e/students/ch08-10/chap_10_2.pdf

For those who cant program, here is a pretty straighforward video showing how to do that using only excel -> https://www.youtube.com/watch?v=lW0lm4O3Ptw

Samul
  • 127