I was wondering how one can compute the KL Divergence between two binary distributions (say, with parameters $p$ and $q$ and assume $p < \frac12$ and $q < \frac12$ for simplicity) accurately. The formula is clearly: \begin{equation} D(p,q) = p \log \frac{p}{q} + (1-p) \log \frac{1-p}{1-q}. \end{equation}
However, implementing the above in a computation system with floating point precision (e.g. a C code using double variables) would cause some rounding errors: when $p$ and $q$ are too close, the result becomes negative (although its magnitude is very tiny, let's say it is $-1 \times 10^{-18}$). So I guess there should be a better way of computing the Divergence while still keeping the result as accurate as possible.
(P.S. one can always apply an ad-hoc fix like setting the divergence to $0$ whenever $|p-q|<\epsilon$ for some $\epsilon$ but I am actually looking for a more clever solution)