Summary
Perhaps there are an answer, and it is
$$p(y,x,L) := [[(f(y) \& f(x)) >> (64-L)] = 0]$$
but is difficult to proof that it is the "faster way". The only algorithm that actually looks "the faster", with performance gain when compared to "common" ones, is one that uses the cache strategy (pre-processing y values). So, enhancing the above formula with the replacing of $f(y)$ by a constant $c$.
Discussion and details
This is not a generic solution, is valid only for use in situations where only one parameter need "real time" speed.
The problem explaned in the question seems not offers a trivial solution, so using a brute force solution: to cache a pre-processed number... Is an economic cache, and valid as solution of my problem in a database.
Cache-phase: some examples, supposing 8 bits.
- original=$00010100$ and cached=$10100000$.
- original=$01010100$ and cached=$10101000$.
- original=$11010101$ and cached=$11010101$.
Real-time checking: comparing the cached number c with x.
Suppose comparision by $L=3$, with $x=01011000$. Need to transform x into $10110000$ them preserve only first 3 (L) non-zero bits, resulting $y=10100000$. Examples:
- for $c:=10100000$, compare $c\&y$ (is $10100000$) with y, that is true.
- for $c:=10101000$, compare $c\&y$ (is $10100000$) with y, that is true.
- for $c:=11010101$, compare $c\&y$ (is $11000000$) with y, that is false.
Conclusion
Cache-phase is a "while non-zero" function that does only x << 1 to obtain the cached 64 bits representation. Let label it $f(x)$, a function sometimes named "BitScanForward".
The function $p(c,x,L)$ that checks prefixes of size L, need also a function $d(L)$ that do "padding left ones" ($d(2)=11000000$, $d(3)=11100000$, etc.). Is an integer comparison with a bitwise AND after after f transformation: $$p(c,x,L) := [x = [c \& f(x) \& d(L)]]$$
or, using the faster Ben Voigt's solution:
$$p(c,x,L) := [[(c \& f(x)) >> (64-L)] = 0]$$
There are many performance gains:
the replace of the &d(L) part by >>(64-L), it removes the function-call and ensure a good implementation for this part of the function;
the replace of the x= part, a "comparison with variable" operation, by a comparison with a constant. To use a constant instead a variable is the main performance gain.
As this constant is zero, modern compilers will use something as test eax,eax instead of cmp eax,0, that is some marginally better performing instruction.
PS: the function f(x) can be implemented as "left align" using a function g(x,L) that tells you how far to shift, so $f(x)=x<<g(x,L)$. As Ben Voigt explained, is the faster way to do it, and the g usual name in the C-lang's libraries, is BitScanReverse64().
x << 1loop to drop zeros) and them comparing by a AND-mask of1110000with L ones. – Peter Krauss Aug 23 '18 at 12:27