Parity

Code
01: bool parity(unsigned int x)
02: {
03:   x ^= x >> 16;
04:   x ^= x >>  8;
05:   x ^= x >>  4;
06:   x &= 0x0F;
07:   return ((0x6996 >> x) & 1) != 0;
08: }
09:
10: bool parity2(unsigned int x)
11: {
12:   x ^= x >> 1;
13:   x ^= x >> 2;
14:   x &= 0x011111111;
15:   x *= 0x011111111;
16:   return ((x >> 28) & 1) != 0;
17: }
bits.stephan-brumme.com

Explanation Bit parity tells whether a given input contains an odd number of 1s.
When the input is subdivided into several chunks, the total parity can be determined by computing the
parity of the sub parities. Might sound weird, so here is a formula to make it clear:

parity(A and B) = parity(parity(A) and parity(B))

If we break down the 32 bits into chunks of 4 bits then the problem is greatly simplified. XOR is the way to go:

(0 XOR 0) → 0
(0 XOR 1) → 1
(1 XOR 0) → 1
(1 XOR 1) → 0

That basically means that the parity of two bits can be computed with a XOR operation.
Even better, XOR can be applied to the full value to compute the parities of multiple bit pairs in parallel.

first algorithm (parity, line 1 to line 8):

The code first "merges" bits 0 − 15 with bits 16 − 31 using a right shift and XOR (line 3).
Including line 4 and line 5 reduce the input width is reduced from 32 bits to 8 bits by repeatedly applying shifted XORs.
We are only interested in the lower bits. That means, after step 1 the lower 16 bits are of importance,
then the lower 8, then the lowest 4. After having applied XOR three times, we delete all higher bits (line 6).

We are left with sixteen possible values for x. Their parity is:
parity(15) → parity(1111b) → 4 bits set → even
parity(14) → parity(1110b) → 3 bits set → odd
parity(13) → parity(1101b) → 3 bits set → odd
parity(12) → parity(1100b) → 2 bits set → even
parity(11) → parity(1011b) → 3 bits set → odd
parity(10) → parity(1010b) → 2 bits set → even
parity( 9) → parity(1001b) → 2 bits set → even
parity( 8) → parity(1000b) → 1 bits set → odd
parity( 7) → parity(0111b) → 3 bits set → odd
parity( 6) → parity(0110b) → 2 bits set → even
parity( 5) → parity(0101b) → 2 bits set → even
parity( 4) → parity(0100b) → 1 bits set → odd
parity( 3) → parity(0011b) → 2 bits set → even
parity( 2) → parity(0010b) → 1 bits set → odd
parity( 1) → parity(0001b) → 1 bits set → odd
parity( 0) → parity(0000b) → 0 bits set → even

If we encode even by 0 and odd by 1 beginning with parity(15) then we get 0110 1001 0110 1001 = 0x6996,
which is the magic number found in line 7. The shift moves the relevant bit to bit 0. Then everything except
for bit 0 is masked out. In the end, we get 0 for even and 1 for odd, exactly as desird.

second algorithm (parity2, line 10 to line 17):

This time, parity is computed bottom-up instead of top-down.
Using a similar XOR technique, line 12 and line 13 compute the parity of each four-bit-block and put it
into to lowest bit of each block. All other bits become garbage and need to be cleared (line 14).

Multiplying by 0x01111111 has an interesting property if we label the four-bit-block A, B, C, D, E, F, G, H:
A, B, C, D, E, F, G, H → A+B+C+D+E+F+G+H, B+C+D+E+F+G+H, C+D+E+F+G+H, D+E+F+G+H, E+F+G+H, F+G+H, G+H, H
Obviously the highest four-bit-block is what we are looking for. If the sum of multiple parities is odd,
then the overall parity is odd as well (vice versa for even).
The right shift followed by AND (line 16) returns the lowest bit which gives us the parity.

Restrictions • designed for 32 bits
parity: insert "x ^= x >> 32;" before line 3 for 64 bits
parity: remove line 3 for 16 bits
parity: remove line 3 and line 4 for 8 bits
parity2: for 64 bits, replace 0x011111111 by 0x01111111111111111 and shift right by 60 instead of 28

These ads
help me to pay
my server bills

Performance • constant time, data independent

+ Intel® Pentium™ D:
parity: ≈ 13 cycles per number
parity2: ≈ 6 cycles per number

+ Intel® Core™ 2:
parity: ≈ 12 cycles per number
parity2: ≈ 5 cycles per number

+ Intel® Core™ i7:
parity: ≈ 11.5 cycles per number
parity2: ≈ 5.25 cycles per number
Performance Chart
CPU cycles
(full optimization,
lower values are better)

Assembler Output
parity:
compiler: Visual C++ 2008
in: eax ← x
out: eax → result
temp: ecx, edx
01:   ; x ^= x >> 16
02:   mov ecx, eax
03:   shr ecx, 16
04:   xor eax, ecx
05:   ; x ^= x >> 8
06:   mov edx, eax
07:   shr edx, 8
08:   xor eax, edx
09:   ; x ^= x >> 4
10:   mov ecx, eax
11:   shr ecx, 4
12:   xor ecx, eax
13:   ; x &= 0x0F
14:   and ecx, 0000000Fh
15:   ; return ((0x6996 >> x) & 1) != 0
16:   mov eax, 00006996h
17:   sar eax, cl
18:   and eax, 1

parity2:
compiler: Visual C++ 2008
in: ecx ← x
out: eax → result
01:   ; x ^= x >> 1
02:   mov eax, ecx
03:   shr eax, 1
04:   xor ecx, eax
05:   ; x ^= x >> 2
06:   mov eax, ecx
07:   shr eax, 2
08:   xor eax, ecx
09:   ; x &= 0x11111111
10:   and eax, 11111111h
11:   ; x *= 0x11111111
12:   imul eax, 11111111h
13:   ; return (x >> 28) & 1
14:   shr eax, 28
15:   and eax, 1
bits.stephan-brumme.com

Download The full source code including a test program is available for download.

References http://graphics.stanford.edu/~seander/bithacks.html

More Twiddled Bits
  1. Absolute value of a float
  2. Absolute value of an integer
  3. Approximative inverse of a float
  4. Bit manipulation basics
  5. Bit mask of lowest bit not set
  6. Count bits set in parallel a.k.a. Population Count
  7. Detects zero bytes inside a 32 bit integer
  8. Endianess
  9. Extend bit width
  10. Float inverse square root approximation
  11. Float square root approximation
  12. Is power of two
  13. Minimum / maximum of integers
  14. Parity
  15. Position of lowest bit set
  16. Round up to the next power of two
  17. Sign comparison
  18. Sign of a 32 bit integer
  19. Swap two values
Extra: Javascript bit manipulator

... or go to the index