-1

I have some numerical data tables, with a column of calculated results based on three inputs in each row. How can I find the formula to apply to the inputs and recalculate the same result ?

the data series is formatted as such (excel file attached):

data1, data2, data3 : result

(there are 330 rows in the tables, hence 990 inputs and 330 results)

examples from the actual data serie :

  • 13, 12, 11 : 857
  • 12, 11, 1 : 771
  • 2, 9, 6 : 91
  • 1, 3, 2 : 0

Excel File : DataSerie

Bikay
  • 31
  • In principle, there is no one way to do so; it depends entirely on context. It is trivial to get a polynomial that fits arbitrarily many data points, for instance, even the points $$ (1,e) \qquad (2,e^2) \qquad (3,e^3) \qquad (4,e^4) $$ which has interpolating polynomial $$ \frac 1 6 (e - 1)^3 e x^3 - \frac 1 2 (e - 1)^2 e (2 e - 3) x^2 + \frac 1 6 e (-26 + 57 e - 42 e^2 + 11 e^3) x - e (-4 + 6 e - 4 e^2 + e^3) $$ but clearly $f(x)=e^x$ works too. [cont.] – PrincessEev Dec 30 '23 at 22:01
  • In general, you could make arbitrarily many functions fit the data (especially in multiple dimensions). You have to make some basic assumptions as to the type of pattern the data should be allowed to follow / what it could possibly follow, and those can only be derived from the context in which the data arises. Even then you might only be able to get a best-fit approximation due to any number of reasons, e.g. errors or inaccuracies in measurement. – PrincessEev Dec 30 '23 at 22:01
  • @PrincessEev data points being d1, d2, and d3, I know the calculated results fit into this general formula: result = (d1+a)n + (d2+b)m + (d3+c)*k + e . but this is based on some visual/intuitive inspection of the data points. is there a way to find these coefficients and variables? ( I could write a brute-force code to try many numbers and find some close-enough approximations, but that's not the right way) – Bikay Dec 30 '23 at 22:10

1 Answers1

0

As was mentioned in the comments, without a specific form of function, there is no way to know in general. However, based on your comment, I will write what to do in case it is an affine function.

Let $x_i,y_i,z_i$ be the values from input columns $1,2,3$ on row $i$ in your dataset. Let $v_i$ be the value in the calculated columns. We speculate that there exist numbers $a,b,c,d$ such that $v_i=ax_i+by_i+cz_i+d$ for each row $i$. Choosing the first four rows of data (but any four will do, if our assumption that it is an affine function is correct), we create a matrix equation $$\begin{pmatrix} x_1 & y_1 & z_1 & 1 \\ x_2 & y_2 & z_2 & 1 \\ x_3 & y_3 & z_3 & 1 \\ x_4 & y_4 & z_4 & 1 \end{pmatrix}\begin{pmatrix}a\\b\\c\\d\end{pmatrix}=\begin{pmatrix}v_1\\v_2\\v_3\\v_4\end{pmatrix}.$$

With the four rows you gave, we have

$$\begin{pmatrix} 13 & 12 & 11 & 1 \\ 12 & 11 & 1 & 1 \\ 2 & 9 & 6 & 1 \\ 1 & 3 & 2 & 1\end{pmatrix}\begin{pmatrix}a\\b\\c\\d\end{pmatrix}=\begin{pmatrix}857\\771\\91\\0\end{pmatrix}.$$

Letting a computer algebra system calculate $M^{-1}\vec{v}$ is one way. In fact, you can google "matrix equation solver" and find free sites into which you can easily type this in.

If the system has no solutions, that means it isn't an affine function. If the system has multiple solutions, you can start over with four new rows.

  • If I've understood it right this procedure doesn't consider the +a, +b, +c operations mentioned in my comment above, but solves the n,m,k coefficient values. I'm sure the first columns in data, is reduced by one then multiplied by 66. It fits the whole rows (though it's just a guessing). and if there's a way to also include the +a,b,c adjustments into the matrix equations, let me know. Btw all I found searching for online matrix solvers showed different kinds of matrices, 3x3, or probably for other "forms" of equations. – Bikay Dec 30 '23 at 23:08
  • 1
    What I have is equivalent to what you wrote. If we have $(d_1+a)n+(d_2+b)m+(d_3+c)k+e$, this is the same as $$nd_1+md_2+kd_3+[an+bm+ck+e].$$ The chunk $an+bm+ck+e$ is all one constant, which I have renamed $d$. –  Dec 30 '23 at 23:11