I'm a computer engineering student and I have a design problem. I'd like to use a numerical method to compute the square root of a number using only logic gates at the physical hardware level, and I'm restricted to binary numbers.
If I set $f(x) = x^2 - a$ then by Halley's method we obtain that $x_{n+1}=\frac{x_{n}^{3}+3 a x_{n}}{3x_{n}^{2}+a}$.
After some simplification using polynomial long division I've gotten this down to $x_{n+1}=\frac{x_{n}}{3}+\frac{8}{3}\cdot\frac{1}{\frac{1}{x_{n}}+\frac{3x_{n}}{a}}$.
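As a sanity check on the long division, here is a minimal Python sketch that evaluates both forms side by side. The function names, the test value $a = 2$, and the starting guess $x_0 = 1$ are illustrative assumptions only, and ordinary double-precision floats stand in for the eventual hardware representation.

```python
def halley_step(x, a):
    """One Halley iteration for f(x) = x^2 - a, in the original rational form."""
    return (x**3 + 3 * a * x) / (3 * x**2 + a)


def halley_step_simplified(x, a):
    """The same iteration after polynomial long division: only x to the first power appears."""
    return x / 3 + (8 / 3) / (1 / x + 3 * x / a)


# Illustrative values (assumptions, not part of the original problem statement).
a = 2.0
x = 1.0
for _ in range(4):
    x_orig = halley_step(x, a)
    x_simp = halley_step_simplified(x, a)
    # Both forms should agree up to floating-point rounding.
    assert abs(x_orig - x_simp) < 1e-12
    x = x_orig

print(x)  # approximation of sqrt(a)
```

The simplified form trades the cubic and quadratic terms for two divisions (by $x_n$ and by $a$) plus one reciprocal, so whether it is actually cheaper depends on how division is implemented in the target hardware.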
I would like to use Halley's method because it converges cubically, which is faster than the quadratic convergence of Newton's method, and I would like to represent the numbers in the IEEE 754 floating-point format.
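To make the convergence comparison concrete, the following Python sketch counts how many iterations each method needs to reach a given relative tolerance. The helper `iterations_to_converge`, the tolerance, and the starting guess are hypothetical choices for illustration, and `math.sqrt` is used only as a reference value.

```python
import math


def newton_step(x, a):
    """One Newton iteration for f(x) = x^2 - a (quadratic convergence)."""
    return 0.5 * (x + a / x)


def halley_step(x, a):
    """One Halley iteration for f(x) = x^2 - a (cubic convergence)."""
    return x * (x * x + 3 * a) / (3 * x * x + a)


def iterations_to_converge(step, a, x0, rel_tol=1e-10, max_iter=50):
    """Count the steps until the relative error against math.sqrt(a) is within rel_tol."""
    target = math.sqrt(a)
    x = x0
    for n in range(1, max_iter + 1):
        x = step(x, a)
        if abs(x - target) <= rel_tol * target:
            return n
    return max_iter


a, x0 = 2.0, 1.0  # illustrative values
n_newton = iterations_to_converge(newton_step, a, x0)
n_halley = iterations_to_converge(halley_step, a, x0)
print("Newton:", n_newton, "Halley:", n_halley)
```

Each Halley step uses more multiplications than a Newton step, so a lower iteration count does not by itself guarantee a smaller or faster circuit.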
Are there any other manipulations that I can do to this to reduce the complexity or size of the numbers used in each iteration?
I would like to keep $x_{n}$ to the first power only, i.e. avoid computing $x_{n}^{2}$ or $x_{n}^{3}$. If there is an area of mathematics in which I can study these methods in more detail, I would appreciate you pointing me in that direction.