Some general considerations: if $f(x_0)=0$ and $f'(x_0) \neq 0$ then the standard error estimate for Newton's method looks like this:
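$$x-\frac{f(x)}{f'(x)}-x_0=x-\frac{f'(x_0)(x-x_0)+1/2 f''(x_0)(x-x_0)^2+o((x-x_0)^2)}{f'(x_0)+f''(x_0)(x-x_0)+o(x-x_0)}-x_0 \\
= x-\left (1-\frac{f''(x_0)}{f'(x_0)}(x-x_0)+o(x-x_0) \right ) \left (x-x_0+1/2 \frac{f''(x_0)}{f'(x_0)}(x-x_0)^2+o((x-x_0)^2) \right ) - x_0 \\
= 1/2 \frac{f''(x_0)}{f'(x_0)}(x-x_0)^2+o((x-x_0)^2).$$
Note that the first-order term in the expansion of $1/f'(x)$ cannot be dropped: multiplied by $x-x_0$ it contributes $-\frac{f''(x_0)}{f'(x_0)}(x-x_0)^2$, which is the same order as the term we are after and is what makes the final coefficient $+1/2$.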
Some consequences: you will not have exact equality as you wrote it. What you hope for is an inequality, $|X_{n+1}-x_0| \leq \lambda |X_n-x_0|^2$ for some $\lambda$. In general this does not hold globally; it only holds once $X_n$ is in some interval around $x_0$. To understand this interval better you need to look at what that $o((x-x_0)^2)$ term actually is. Assuming $f$ has a continuous third derivative, we can redo the calculation using explicit Lagrange remainders:
$$x-\frac{f(x)}{f'(x)}-x_0=x-\frac{f'(x_0)(x-x_0)+1/2f''(x_0)(x-x_0)^2+1/6 f'''(\xi_x)(x-x_0)^3}{f'(x_0)+f''(\eta_x)(x-x_0)}-x_0$$
where $\xi_x$ and $\eta_x$ are numbers between $x_0$ and $x$. Simplifying a bit you wind up at:
$$(x-x_0) \left ( 1-\frac{1+1/2 f''(x_0)(x-x_0)/f'(x_0)+1/6 f'''(\xi_x)(x-x_0)^2/f'(x_0)}{1+f''(\eta_x)(x-x_0)/f'(x_0)} \right ).$$
Simplifying further by putting everything over a common denominator, you wind up at:
$$(x-x_0)^2 \left ( \frac{f''(\eta_x)/f'(x_0)-1/2 f''(x_0)/f'(x_0)-1/6 f'''(\xi_x)(x-x_0)/f'(x_0)}{1+f''(\eta_x)(x-x_0)/f'(x_0)} \right ).$$
So in general we shouldn't expect global quadratic convergence: the third derivative term can grow with $|x-x_0|$, and the denominator, which is just $f'(x)/f'(x_0)$, can become small far from $x_0$.
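Letting $x \to x_0$ in the last display, the bracket tends to $\frac{f''(x_0)}{2f'(x_0)}$. Here is a quick numerical sanity check of that limiting coefficient. It is only a sketch: the test function $f(x)=x^3-2$ and the helper names `newton_step` and `error_ratio` are my own choices for illustration, not anything from the question. For this $f$ the predicted limit is $\frac{6x_0}{2 \cdot 3x_0^2}=1/x_0 \approx 0.794$.

```python
def newton_step(f, df, x):
    """One Newton step: x - f(x)/f'(x)."""
    return x - f(x) / df(x)


def error_ratio(f, df, root, x):
    """(N(x) - root) / (x - root)^2, where N is one Newton step."""
    return (newton_step(f, df, x) - root) / (x - root) ** 2


# Illustrative test function (an assumption, not from the question).
f = lambda x: x**3 - 2.0
df = lambda x: 3.0 * x**2
root = 2.0 ** (1.0 / 3.0)

# Predicted limit f''(x0) / (2 f'(x0)) = 6*x0 / (2 * 3*x0^2) = 1/x0.
limit = 1.0 / root

# The ratio should approach the limit as the starting error shrinks.
for e in (1e-1, 1e-2, 1e-3):
    print(e, error_ratio(f, df, root, root + e), limit)
```

The ratio is positive and approaches $1/x_0$ as the starting error shrinks, but it is not constant, which is why one only gets an inequality with some $\lambda$ on an interval rather than an exact equality.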
When you specialize the above to the case of Newton's method for square roots, you get something which is quite algebraically obvious. I could've derived this from the start, but I wanted to go over the general considerations as well.
$$\frac{x+a/x}{2}-\sqrt{a}=\frac{1}{2x} (x-\sqrt{a})^2.$$
Thus we indeed do not have global quadratic convergence for square roots; the coefficient blows up as $x \to 0$. (This should not be surprising, because $X_1$ is the same when $X_0=x$ and when $X_0=a/x$, for any $x \neq 0$.) Thus a really robust error estimate should argue that $X_n$ can't get too close to zero. A good result that holds for $X_0>0$ is $X_n \geq \alpha := \min \{ X_0,\sqrt{a} \}$: by AM-GM, $\frac{x+a/x}{2} \geq \sqrt{a}$ for every $x>0$, so $X_n \geq \sqrt{a}$ for all $n \geq 1$. In this case you get your desired estimate with $\lambda=\frac{1}{2\alpha}$.
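If you want to see the estimate in action, here is a minimal sketch that runs the square root iteration and compares each new error against $\lambda |X_n-\sqrt{a}|^2$ with $\lambda = \frac{1}{2\alpha}$. The function names `heron_step` and `check_bound`, and the sample values $a=2$, $X_0=0.1$, are assumptions made for the illustration. Note that the comparison in floating point needs a small tolerance once the iterates have converged to machine precision.

```python
import math


def heron_step(x, a):
    """One Newton step for f(x) = x^2 - a, i.e. (x + a/x) / 2."""
    return 0.5 * (x + a / x)


def check_bound(a, x0, steps=6):
    """Return rows (X_n, |X_{n+1} - sqrt(a)|, lambda * |X_n - sqrt(a)|^2)."""
    root = math.sqrt(a)
    alpha = min(x0, root)      # claimed lower bound on every X_n
    lam = 1.0 / (2.0 * alpha)  # lambda = 1 / (2 * alpha)
    rows = []
    x = x0
    for _ in range(steps):
        x_next = heron_step(x, a)
        err = abs(x - root)
        err_next = abs(x_next - root)
        rows.append((x, err_next, lam * err**2))
        x = x_next
    return rows


# Sample run with a deliberately poor starting guess (illustrative values).
for x, err_next, bound in check_bound(2.0, 0.1):
    print(x, err_next, bound)
```

With $X_0 < \sqrt{a}$ the very first step attains the bound with equality (up to rounding), since then $\alpha = X_0$ and the identity above reads $|X_1-\sqrt{a}| = \frac{1}{2X_0}(X_0-\sqrt{a})^2$. After that the iterates stay above $\sqrt{a}$ and the bound is not tight.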