3

I need a fast way to evaluate modulo 90 of any unsigned 32 bit. One way I am aware of is to represent 1/90 as 0.7111111111111 * 2^-6, then represent 0.7111111111111 as 3054198966 * 2^-32. Hence, if I multiply the input with 3054198966, take upper 32 bits with rounding and shift right by 6 bits, I get the result of input divided by 90. I then take the result, multiply by 90, and subtract that from the input. This technique, which relies on MulHigh, is expensive - it costs 1 32bx32b-->64b multiplication, 1 32b multiplication and 1 subtraction. I wander whether there is a cheaper way.

1 Answers1

1

Consider the C program below

unsigned int test(unsigned int x)
{
        return x % 90;
}

The code generated for this program with cc -O2 (clang-800.0.38) is

movl    %edi, %eax
shrl    %eax
movl    $3054198967, %ecx
imulq   %rax, %rcx
shrq    $37, %rcx
imull   $90, %ecx, %eax
subl    %eax, %edi
movl    %edi, %eax
popq    %rbp
retq

which seems to agree with your technique, except that it uses $3054198967$ instead of $3054198966$.

Bottom line: Use x % 90 and trust the compiler to generate efficient code.

lhf
  • 216,483
  • See also https://github.com/halide/Halide/blob/master/src/IntegerDivisionTable.cpp. – lhf Jun 09 '17 at 16:47