5

Consider a code $C$ that uniquely maps each k-digit binary number $P=p_1 p_2 \cdots p_k$ to some k-digit binary number $Q=q_1 q_2 \cdots q_k$. I.e., the code is $C(P) = Q$ and $C^{-1}(Q) = P$.

There are $2^k!$ such codes.

Now consider that code words are transmitted over a noisy channel which will flip exactly 1 bit in each code word Q, yielding another code word Q'. This will be decoded to $C^{-1}(Q')= P' \neq P$.

I am interested in the code that minimizes $|P' - P|$ for all $P$. How would I go about constructing it? I am grateful for any hints.

(My intuition is that the identity that maps each $P$ on itself, i.e., $\forall P: C(P)=P$ is optimal, but I don't know how to prove it. I hope my question is clear -- I'm not a mathematician. I'm happy to elaborate. To clarify in advance: I don't want to correct the error or detect which bit flipped, just minimize the error.)

rodion
  • 151
  • 2
  • 1
    You want a code so that if $Q_1$ and $Q_2$ only differ by one bit, no matter where that bit is, $|C(Q_1)-C(Q_2)|$ is small. This is not the case for the identity, where, if the differing bit is early, the resulting difference is large. – Arthur May 08 '13 at 21:04
  • I haven't found any better code yet. I wrote a script to test it for k=2 and k=3 showing the identity to be optimal (among other codes), hence my intuition. I'm happy to be proven wrong that the identity is not optimal! – rodion May 08 '13 at 21:05
  • So you have $k$ bits and would like to obtain again $k$ bits via some coding. If your coding includes some exor then your error will propogate to other bits as well. If your code is a permutation code then the performance of this code will be the same with identity coding. – Seyhmus Güngören May 08 '13 at 21:09
  • I think this problem makes sense in case the flipping action has some density function. – Seyhmus Güngören May 08 '13 at 21:11
  • 2
    This problem does translate into a more general problem in graph theory: Given a graph $G$, index the vertices and put on each edge the absolute value of the difference of the indices on the two vertices it connects. Choose indices so that the sum over all the edges is minimized. I find this an interesting problem in itself, and I do not know if it is solved, completely or partially. Anyways, the translation is as follows: let each $Q$ be a node and say they are adjacent iff they differ by one bit. I do not know if this helps, but it might give you an angle. – Arthur May 08 '13 at 21:27
  • Are you measuring the distance $|P-P'|$ as an integer or as "modulo $2^k$"? Sounds like the former? And it may not make any difference to your question - just asking because in some applications it might happen that you would think of $P'=2^k-1$ and $P'+1=2^k\equiv0\pmod{2^k}$ as being close to each other instead of being at a maximum distance. – Jyrki Lahtonen May 09 '13 at 05:26
  • @JyrkiLahtonen, I am measuring $|P - P'|$ as an integer. – rodion May 09 '13 at 09:14

1 Answers1

1

If you are measuring |P-P'| as an integer, then something like a modified gray code may be better than the identity function.

For example:

David's tweaked grey-like encoding:

P Q    P'             relative
       with one error errors
0 0000 {1, 2, 3, 4}   {+1, +2, +3, +4}
1 0001 {0, 5, 6, 7}   {-1, +4, +5, +6}
2 0010 {0, 7, 8, 9}   {-2, +5, +6, +7}
3 0100 {0, 6, 9, a}   {-3, +3, +6, +7}
4 1000 {0, 5, 8, a}   {-4, +1, +4, +6}
5 1001 {1, 4, b, c}   {-4, -1, +6, +7}
6 0101 {1, 3, b, d}   {-5, -3, +5, +7}
7 0011 {1, 2, c, d}   {-6, -5, +5, +6}
8 1010 {2, 4, c, e}   {-6, -4, +4, +6}
9 0110 {2, 3, d, e}   {-7, -6, +4, +5}
a 1100 {3, 4, b, e}   {-7, -6, +1, +4}
b 1101 {5, 6, a, f}   {-6, -5, -1, +4}
c 1011 {5, 7, 8, f}   {-7, -5, -4, +3}
d 0111 {6, 7, 9, f}   {-7, -6, -4, +2}
e 1110 {8, 9, a, f}   {-6, -5, -4, +1}
f 1111 {b, c, d, e}   {-4, -3, -2, -1}
worst errors:          -7, +7

identity coding:

P Q    P'             relative
       with one error errors
0 0000 {1, 2, 4, 8}   {+1, +2, +4, +8}
1 0001 {0, 3, 5, 9}   {-1, +2, +4, +8}
2 0010 {3, 0, 6, a}   {+1, -2, +4, +8}
3 0011 {2, 1, 7, b}   {-1, -2, +4, +8}
...
worst errors:          -8, +8
median absolute error: 3.

As you can see, the worst-case |P-P'| for the above tweaked grey-like encoding is 7, which is better than the worst-case |P-P'| of 8 for the identity mapping.

Therefore, C(P)=P is not optimal for k=4 (and, I suspect, for all higher k).

If you are more interested in the average or the median error |P-P'| over every code and over every possible single-bit error -- or, as Jyrki Lahtonen hints, the "circular error" (|P-P'| mod 2^k) -- then perhaps a standard "reflected" gray code may be better.

I'm not sure why you don't want to make Q a few bits longer than P -- then you can apply Hamming codes and correct any single-bit bit error, giving zero error.

David Cary
  • 1,817