Regression problems with one dependent variable and one independent variable are generally of the form
$$f(x,y;a,b,c\cdots)\approx0$$where you want to find the "best" values of the unknown parameters $a,b,c\cdots$ for a given data set $(x_i,y_i)$.
The common procedure is to minimize the total squared error, computed as
$$\delta(a,b,c\cdots)=\sum_{i=1}^nf^2(x_i,y_i;a,b,c\cdots).$$
(The square makes every term nonnegative, so errors of opposite signs cannot cancel each other.)
This is called the least-squares formulation. It often has no closed-form solution, and a numerical solution is then sought with an iterative method such as the Levenberg-Marquardt algorithm.
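To give an idea of what such an iterative search looks like, here is a toy sketch of a Levenberg-Marquardt loop in plain Python. The model $y=be^{ax}$, the function name `fit_exponential_lm`, the starting guess and the damping schedule are all assumptions chosen for illustration; in practice you would rely on a tested library routine.

```python
import math

def fit_exponential_lm(xs, ys, a, b, max_iter=100):
    """Toy Levenberg-Marquardt fit of the hypothetical model y = b*exp(a*x)."""

    def delta(a, b):
        # Sum of squared residuals, the quantity being minimized.
        return sum((y - b * math.exp(a * x)) ** 2 for x, y in zip(xs, ys))

    lam = 1e-3                      # damping factor (assumed starting value)
    current = delta(a, b)
    for _ in range(max_iter):
        # Accumulate J^T J and J^T r for the residuals r_i = y_i - b*exp(a*x_i).
        saa = sab = sbb = ga = gb = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(a * x)
            r = y - b * e
            ja, jb = -b * x * e, -e     # dr/da and dr/db
            saa += ja * ja
            sab += ja * jb
            sbb += jb * jb
            ga += ja * r
            gb += jb * r
        # Damped normal equations (J^T J + lam*diag(J^T J)) step = -J^T r,
        # solved by Cramer's rule since there are only two unknowns.
        m11, m22 = saa * (1 + lam), sbb * (1 + lam)
        det = m11 * m22 - sab * sab
        if det == 0:
            break
        da = (-ga * m22 + gb * sab) / det
        db = (-gb * m11 + ga * sab) / det
        if abs(da) + abs(db) < 1e-12:   # step is negligible: stop
            break
        trial = delta(a + da, b + db)
        if trial < current:
            # Improvement: accept the step and trust the linearization more.
            a, b, current = a + da, b + db, trial
            lam /= 10
        else:
            # No improvement: reject the step and increase the damping.
            lam *= 10
    return a, b

# Noise-free data generated with a = 0.5 and b = 2.
xs = [0, 1, 2, 3, 4]
ys = [2 * math.exp(0.5 * x) for x in xs]
print(fit_exponential_lm(xs, ys, a=0.3, b=1.5))
```

The damping factor blends a Gauss-Newton step (small `lam`) with a short, cautious step (large `lam`), which is what keeps the iteration from overshooting when the starting guess is poor.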
On the other hand, the particular case of an affine model, $y=ax+b$, is quite tractable. With $f(x,y;a,b)=y-ax-b$, the error becomes
$$\delta(a,b)=\sum_{i=1}^n(y_i-ax_i-b)^2.$$
The optimum is found by setting the partial derivatives with respect to $a$ and $b$ to zero:
$$\frac{\partial \delta(a,b)}{\partial a}\propto\sum_{i=1}^nx_i(y_i-ax_i-b)=\sum_{i=1}^nx_iy_i-a\sum_{i=1}^nx_i^2-b\sum_{i=1}^nx_i=0,$$
$$\frac{\partial \delta(a,b)}{\partial b}\propto\sum_{i=1}^n(y_i-ax_i-b)=\sum_{i=1}^ny_i-a\sum_{i=1}^nx_i-b\sum_{i=1}^n1=0,$$
and this forms an easy $2\times2$ system of linear equations in $a$ and $b$.
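As a minimal sketch, the $2\times2$ system can be solved directly by Cramer's rule once the sums are accumulated. The function name `linear_regression` and the sample data are assumptions made for illustration.

```python
def linear_regression(xs, ys):
    """Least-squares fit of y = a*x + b by solving the 2x2 normal equations."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Normal equations:
    #   a*sxx + b*sx = sxy
    #   a*sx  + b*n  = sy
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    a = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b

# Points lying exactly on y = 2x + 1.
a, b = linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # 2.0 1.0
```

The determinant vanishes only when all the $x_i$ coincide, in which case no unique line exists.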
This useful tool is called linear regression. Some nonlinear problems can also be solved with it, provided they can be put in the form
$$y=g(af(x)+b),$$i.e.$$g^{-1}(y)=af(x)+b.$$
In this case, instead of performing the linear regression on $(x,y)$, you do it on $(f(x),g^{-1}(y))$.
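For example, the power law $y=bx^a$ (with $x,y>0$) becomes $\ln(y)=a\ln(x)+\ln(b)$ by taking the logarithm, so a linear regression on $(\ln(x),\ln(y))$ gives $a$ as the slope and $\ln(b)$ as the intercept.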
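The power-law case can be sketched as follows, assuming all data are strictly positive. The names `linear_regression` and `fit_power_law` and the sample data are hypothetical. Note that this minimizes the squared error on $\ln(y)$, not on $y$ itself, so with noisy data the result can differ somewhat from a direct nonlinear fit.

```python
import math

def linear_regression(xs, ys):
    """Least-squares fit of y = a*x + b via the 2x2 normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    return (n * sxy - sx * sy) / det, (sxx * sy - sx * sxy) / det

def fit_power_law(xs, ys):
    """Fit y = b*x**a by linear regression on (ln x, ln y); needs x, y > 0."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    a, log_b = linear_regression(log_x, log_y)
    return a, math.exp(log_b)   # undo the logarithm on the intercept

# Noise-free data generated with a = 2 and b = 3.
xs = [1, 2, 3, 4]
ys = [3 * x ** 2 for x in xs]
print(fit_power_law(xs, ys))
```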