Detects zero bytes inside a 32 bit integer

Code
01: bool hasZeroByte(unsigned int x)
02: {
03:   return ((x - 0x01010101) & ~x & 0x80808080) != 0;
04: }
05:
06: bool hasZeroByteSimple(unsigned int x)
07: {
08:   if ((x & 0x000000FF) == 0)
09:     return true;
10:   if ((x & 0x0000FF00) == 0)
11:     return true;
12:   if ((x & 0x00FF0000) == 0)
13:     return true;
14:   if ((x & 0xFF000000) == 0)
15:     return true;
16:
17:   return false;
18: }
bits.stephan-brumme.com

Explanation One of the most prominent example of detecting zeroes is strlen. To improve performance many bytes are examined in parallel.

The expression x − 0x01010101 sets the highest bit of a byte if a single input byte is within the input set
{ 0x00, 0x81, 0x82, 0x83, ..., 0xFF } because the result will be { 0xFF, 0x80, 0x81, 0x82, ..., 0xFE }

On the other hand, ~x sets the highest bit of a byte if a single input byte is within the input set
{ 0x00, 0x01, 0x02, 0x03, ..., 0x7F } because the result will be { 0xFF, 0xFE, 0xFD, 0xFC, ..., 0x80 }

The only value contained in both input sets is 0x00, exactly what we are looking for.
That means, only for x = 0 the highest bit is still set after ANDing: (x − 0x01010101) & ~x
Please do not be confused by the fact that a result ≠ 0 actually indicates a zero byte.

Then, all bits except for the highest bit of each byte are reset to zero using AND 0x80808080.
The resulting value is equal to zero if all (!) of these highest bits are zero.
If at least one is still set to 1, caused by a zero byte, the comparison ≠ 0 gives "true".

hasZeroByteSimple shows the common approach to this problem.

Restrictions • designed for 32 bits, easily extendable to 64 bits

These ads
help me to pay
my server bills

Performance • constant time, data independent

+ Intel® Pentium™ D:
hasZeroByte: ≈ 6 cycles per number
hasZeroByteSimple: ≈ 7 cycles per number

+ Intel® Core™ 2:
hasZeroByte: ≈ 3 cycles per number
hasZeroByteSimple: ≈ 3 cycles per number

+ Intel® Core™ i7:
hasZeroByte: ≈ 3.5 cycles per number
hasZeroByteSimple: ≈ 3 cycles per number
Performance Chart
CPU cycles
(full optimization,
lower values are better)

Assembler Output
hasZeroByte:
01:   ; x - 0x01010101
02:   lea eax, [ecx-01010101h]
03:   ; ~x
04:   not ecx
05:   and eax, ecx
06:   and eax, 80808080h
07:   ; compare to zero
08:   neg eax
09:   sbb eax, eax
10:   ; set eax=1, boolean value for true
11:   neg eax

hasZeroByteSimple:
01:   ; if ((x & 0x000000FF) == 0)
02:   test al, al
03:   jnz $if2
04:   $found:
05:   ; 1 means true
06:   mov al, 1
07:   jmp $done
08:   $if2:
09:   ; if ((x & 0x0000FF00) == 0)
10:   test eax, 0000FF00h
11:   jz $found
12:   ; if ((x & 0x00FF0000) == 0)
13:   test eax, 00FF0000h
14:   jz $found
15:   ; if ((x & 0xFF000000) == 0)
16:   test  eax, 0FF000000h
17:   setz  al
18:   $done:
bits.stephan-brumme.com

Download The full source code including a test program is available for download.

References newsgroup posting of Alan Mycroft on April 27, 1987

More Twiddled Bits
  1. Absolute value of a float
  2. Absolute value of an integer
  3. Approximative inverse of a float
  4. Bit manipulation basics
  5. Bit mask of lowest bit not set
  6. Count bits set in parallel a.k.a. Population Count
  7. Detects zero bytes inside a 32 bit integer
  8. Endianess
  9. Extend bit width
  10. Float inverse square root approximation
  11. Float square root approximation
  12. Is power of two
  13. Minimum / maximum of integers
  14. Parity
  15. Position of lowest bit set
  16. Round up to the next power of two
  17. Sign comparison
  18. Sign of a 32 bit integer
  19. Swap two values
Extra: Javascript bit manipulator

... or go to the index