I will generalize the problem as follows:
States $A$ and $B$ are represented by strings of $n$ bits each; within the rightmost $m$ bits, at least $k$ bits of $A$ must be $0$ and the corresponding bits of $B$ must be $1$; and the Hamming distance between the rightmost $m$ bits of $A$ and the rightmost $m$ bits of $B$ must be at least $h.$
We want to find a formula to express a function $f(n,m,k,h)$ that gives the number of possible pairs of states $(A,B)$ that meet these constraints.
Initially, I'll just do the case $h\leq k$ since it's simpler than the general case;
the conditions requiring $k$ zeros in $A$ and the corresponding $1$s in $B$
guarantee that the Hamming distance will always be at least $k.$
If exactly $r$ of the rightmost $m$ bits of $A$ are $1$s (that is, $m-r$ bits are $0$), there are $\binom mr$ possible choices of column to put them in.
So there are $\binom mr$ ways for $A$ to have $r$ $1$s in its rightmost $m$ bits.
For each of these choices of $A$, there are exactly $r$ don't-care (X) bits in
the rightmost $r$ bits of $B$ (the others are forced to $1$),
hence $2^r$ choices for the rightmost $m$ bits of $B.$
So just considering the rightmost $m$ bits of $A$ and $B$, and assuming those bits of $A$ include exactly $r$ $1$s, we have $\binom mr 2^r$ choices.
If we look at all the bits that are not the rightmost $m$ bits of $A$ or $B,$
we will always find $2(n - m)$ don't-care (X) bits there.
So for each choice of the rightmost $m$ bits of $A$ and $B$ where
exactly $r$ of the rightmost $m$ bits of $A$ are $1$s,
we have $2^{2(n-m)}$ ways to fill in the remaining Xs.
That's $\binom mr 2^{r+2(n-m)}$ choices for exactly $r$ $1$s in the rightmost $m$ bits of $A.$
But $r$ can be anything from zero to $m-k$, so the total number of transitions is
$$ \sum_{r=0}^{m-k} \binom mr 2^{r+2(n-m)}. $$
I would pull a constant factor out of the sum as follows:
$$f(n,m,k,h) = 4^{n-m} \sum_{r=0}^{m-k} \binom mr 2^r \quad \text{when $h\leq k$}.$$
It's hard to get it simpler than that. Wolfram Alpha suggests there might be
something involving a hypergeometric function,
but I don't see that as an improvement.
For $h > k,$ instead of $r$ independent X bits in the rightmost bits of $B$ we have $r$ bits that must have at least $h - k$ zeros.
That is, those $r$ bits of $B$ can contain some number of $1$s; the number of $1$s may be as few as $0$, but it cannot be more than $m-h$ and of course it cannot be more than $r$.
Whichever is less, $r$ or $m-h,$ is the maximum number of $1$s.
So instead of always having $2^r$ ways to fill in those $r$ bits we have
$$\sum_{q=0}^{\min(r,m - h)} \binom rq.$$
Note that when $m - h \geq r,$ this comes out to $2^r,$
that is, the larger value of $h$ restricts the possibilities only in the cases where it is larger than the actual number of zeros in $A.$
Again it is possible to express this using a hypergeometric function but I do not see the advantage of that.
The total number of possible transitions therefore is
$$f(n,m,k,h) = 4^{n-m} \sum_{r=0}^{m-k}
\binom mr \sum_{q=0}^{\min(r,m - h)}\binom rq
\quad \text{when $h>k$}.$$
If you want to write this without needing to write
$\min(r,m-h)$, you can write it like this:
\begin{align}
f(n,m,k,h) = 4^{n-m}
\bigg( &\sum_{r=0}^{m-h} \binom mr \sum_{q=0}^r \binom rq \\
& + \sum_{r=m-h+1}^{m-k} \binom mr \sum_{q=0}^{m - h}\binom rq \bigg)
\quad \text{when $h>k$}.
\end{align}
(This is likely how I would do it if I were writing the function in software.)