Consider the example equation:
$$ z = Mx + Ny $$
where $M$, $N$ are unknown parameters and $x, y, z$ are features of a dataset.
My initial guess is to use gradient descent on the least-squares error to estimate $M$ and $N$ from the dataset.
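To make this first step concrete, here is a minimal sketch of what I have in mind. It is not a definitive implementation: the data are synthetic, and the names `fit_least_squares`, `true_M`, `true_N`, the learning rate and the step count are all assumptions made up for illustration.

```python
import random


def fit_least_squares(xs, ys, zs, lr=0.1, steps=2000):
    """Estimate M and N in z ≈ M*x + N*y by batch gradient descent
    on the mean squared error (1/n) * sum((M*x + N*y - z)^2)."""
    M, N = 0.0, 0.0
    n = len(zs)
    for _ in range(steps):
        grad_M = 0.0
        grad_N = 0.0
        for x, y, z in zip(xs, ys, zs):
            residual = M * x + N * y - z
            # Partial derivatives of the mean squared error.
            grad_M += 2.0 * residual * x / n
            grad_N += 2.0 * residual * y / n
        M -= lr * grad_M
        N -= lr * grad_N
    return M, N


# Synthetic dataset (assumption: generated only to exercise the function).
rng = random.Random(0)
true_M, true_N = 2.0, -3.0
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [rng.uniform(-1.0, 1.0) for _ in range(200)]
zs = [true_M * x + true_N * y + rng.gauss(0.0, 0.01) for x, y in zip(xs, ys)]

M_hat, N_hat = fit_least_squares(xs, ys, zs)
print(M_hat, N_hat)
```

Since the model is linear in $M$ and $N$, the same estimates could presumably also be obtained in closed form from the normal equations, so gradient descent is just one option here.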
After that, I would write down the inequality constraints on $x$ and $y$ and, with $M$ and $N$ now known, use Lagrange multipliers to minimise $z$ subject to those constraints.
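As a sketch of this second step, suppose the inequality constraints are written as $g_i(x, y) \le 0$ (the functions $g_i$ and the multipliers $\lambda_i$ are placeholders I am introducing here, not something taken from the dataset). As far as I understand, the Lagrangian and the accompanying conditions for inequality constraints would then look like:
$$ \mathcal{L}(x, y, \lambda) = Mx + Ny + \sum_i \lambda_i \, g_i(x, y), \qquad \lambda_i \ge 0, \qquad \lambda_i \, g_i(x, y) = 0 $$
together with $\partial \mathcal{L} / \partial x = 0$ and $\partial \mathcal{L} / \partial y = 0$.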
Is this a correct approach to the problem?