Let $P=\{P_i\}$ be a set of points with $P_i=[x_i,y_i]$, and let $Q=\{Q_i\}$ be a set of points with $Q_i=[X_i, Y_i]$, where the coordinates of $Q_i$ are given by the following non-linear functions:
$$X = f (a_1, a_2, a_3, \ldots, a_7)$$
$$Y = g (a_1, a_2, a_3, \ldots, a_7)$$
The parameters $a_i$ are estimated by a non-linear least squares adjustment (BFGS) so that $\left\lVert P-Q \right\rVert$ is minimal.
The algorithm converges poorly when $P$ is offset from $Q$ by large non-zero shifts $sx$, $sy$.
The values of $a_i$ lie in the interval $(-30P_i , 30P_i)$, but the shifts can be very large compared to $a_i$: $\pm 10^9$.
The residuals, with the shifts included, are given by
$$rx_i = f (a_1, a_2, a_3, \ldots, a_7)_i -sx - x_i$$
$$ry_i = g (a_1, a_2, a_3, \ldots, a_7)_i -sy - y_i$$
The Jacobian matrix $J$ is
$$ J = \begin{pmatrix}\frac{\partial X}{\partial a_1}&\cdots&\frac {\partial X}{\partial a_7}& -1& 0\\ \frac{\partial Y}{\partial a_1}&\cdots &\frac{\partial Y}{\partial a_7}& 0 &-1\end{pmatrix},$$
where the last two columns are the derivatives of the residuals with respect to the shifts:
$$ \frac{\partial rx}{\partial sx} = \frac{\partial ry}{\partial sy}=-1, \qquad \frac{\partial rx}{\partial sy} = \frac{\partial ry}{\partial sx}=0. $$
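To make the bookkeeping concrete, here is a minimal Python sketch of how the residual vector and the Jacobian with the two extra shift columns can be assembled. The functions `model` and `model_jacobian` are hypothetical stand-ins for $f$ and $g$ (the real functions are not given here), and the sample values `t` are likewise an assumption made only so that the example runs.

```python
import numpy as np

def model(a, t):
    # Hypothetical stand-in for f and g; the real functions are not shown in the post.
    X = a[0] * np.cos(a[1] * t) + a[2] * t + a[3] * t**2
    Y = a[4] * np.sin(a[1] * t) + a[5] * t + a[6] * t**2
    return X, Y

def model_jacobian(a, t):
    # Analytic partial derivatives of the hypothetical model w.r.t. a_1..a_7.
    n = t.size
    JX = np.zeros((n, 7))
    JY = np.zeros((n, 7))
    JX[:, 0] = np.cos(a[1] * t)
    JX[:, 1] = -a[0] * t * np.sin(a[1] * t)
    JX[:, 2] = t
    JX[:, 3] = t**2
    JY[:, 1] = a[4] * t * np.cos(a[1] * t)
    JY[:, 4] = np.sin(a[1] * t)
    JY[:, 5] = t
    JY[:, 6] = t**2
    return JX, JY

def residuals(p, t, x, y):
    # p = (a_1, ..., a_7, sx, sy); returns all rx_i followed by all ry_i.
    a, sx, sy = p[:7], p[7], p[8]
    X, Y = model(a, t)
    return np.concatenate([X - sx - x, Y - sy - y])

def jacobian(p, t):
    # Same row order as residuals(); the shift columns are constant -1 / 0.
    JX, JY = model_jacobian(p[:7], t)
    ones = np.ones((t.size, 1))
    zeros = np.zeros((t.size, 1))
    top = np.hstack([JX, -ones, zeros])     # d rx / d(a, sx, sy)
    bottom = np.hstack([JY, zeros, -ones])  # d ry / d(a, sx, sy)
    return np.vstack([top, bottom])
```

Comparing `jacobian` against a finite-difference derivative of `residuals` is a cheap way to confirm the signs of the two shift columns.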
How can I ensure convergence for this problem?
I tried to estimate $sx$ and $sy$ from a 2D Helmert transformation, but this works only for an initial vector very close to a local minimum.
In the general case (a random initial vector) this method does not work: the shifts are estimated incorrectly.
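For reference, the Helmert-based shift estimate can be sketched as a linear least-squares fit. This is an illustrative sketch, not the exact code used: the function name `fit_helmert_2d` and the parametrisation $a = s\cos\theta$, $b = s\sin\theta$ are assumptions, with `src` standing for $P$ and `dst` for the points $Q$ computed from the current parameter vector.

```python
import numpy as np

def fit_helmert_2d(src, dst):
    # Least-squares 2D Helmert (similarity) fit:
    #   X = a*x - b*y + tx
    #   Y = b*x + a*y + ty
    # with a = s*cos(theta), b = s*sin(theta). Linear in (a, b, tx, ty).
    x, y = src[:, 0], src[:, 1]
    n = src.shape[0]
    ones, zeros = np.ones(n), np.zeros(n)
    A = np.vstack([
        np.column_stack([x, -y, ones, zeros]),  # rows for the X equations
        np.column_stack([y, x, zeros, ones]),   # rows for the Y equations
    ])
    rhs = np.concatenate([dst[:, 0], dst[:, 1]])
    a, b, tx, ty = np.linalg.lstsq(A, rhs, rcond=None)[0]
    return a, b, tx, ty
```

The translation terms `tx`, `ty` then serve as starting values for $sx$, $sy$. Since `dst` depends on the current $a_i$, the estimate can only be as good as that initial vector.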
UPDATED QUESTION: I tried to implement the case with zero shifts $sx$, $sy$, where
$$rx_i = f (a_1, a_2, a_3, \ldots, a_7)_i - x_i$$
$$ry_i = g (a_1, a_2, a_3, \ldots, a_7)_i - y_i$$
and
$$ J = \begin{pmatrix}\frac{\partial X}{\partial a_1}&\cdots&\frac {\partial X}{\partial a_7}\\ \frac{\partial Y}{\partial a_1}&\cdots &\frac{\partial Y}{\partial a_7}\end{pmatrix}$$
and the convergence is very fast for such sets $P$, $Q$.
But I am still not able to estimate the shifts between $P$ and $Q$ and solve the original problem.