I need a fast way to evaluate modulo 90 of any unsigned 32 bit. One way I am aware of is to represent 1/90 as 0.7111111111111 * 2^-6, then represent 0.7111111111111 as 3054198966 * 2^-32. Hence, if I multiply the input with 3054198966, take upper 32 bits with rounding and shift right by 6 bits, I get the result of input divided by 90. I then take the result, multiply by 90, and subtract that from the input. This technique, which relies on MulHigh, is expensive - it costs 1 32bx32b-->64b multiplication, 1 32b multiplication and 1 subtraction. I wander whether there is a cheaper way.
Asked
Active
Viewed 263 times
3
1 Answers
1
Consider the C program below
unsigned int test(unsigned int x)
{
return x % 90;
}
The code generated for this program with cc -O2 (clang-800.0.38) is
movl %edi, %eax
shrl %eax
movl $3054198967, %ecx
imulq %rax, %rcx
shrq $37, %rcx
imull $90, %ecx, %eax
subl %eax, %edi
movl %edi, %eax
popq %rbp
retq
which seems to agree with your technique, except that it uses $3054198967$ instead of $3054198966$.
Bottom line: Use x % 90 and trust the compiler to generate efficient code.
lhf
- 216,483
-
See also https://github.com/halide/Halide/blob/master/src/IntegerDivisionTable.cpp. – lhf Jun 09 '17 at 16:47
x % 90slow? – lhf Jun 09 '17 at 16:12