
Parity 

Code 
01: bool parity(unsigned int x)
bits.stephanbrumme.com

Explanation 
Bit parity tells whether a given input contains an odd number of 1s. When the input is subdivided into several chunks, the total parity can be determined by computing the parity of the sub parities. Might sound weird, so here is a formula to make it clear: parity(A and B) = parity(parity(A) and parity(B)) If we break down the 32 bits into chunks of 4 bits then the problem is greatly simplified. XOR is the way to go: (0 XOR 0) → 0 (0 XOR 1) → 1 (1 XOR 0) → 1 (1 XOR 1) → 0 That basically means that the parity of two bits can be computed with a XOR operation. Even better, XOR can be applied to the full value to compute the parities of multiple bit pairs in parallel. first algorithm (parity, line 1 to line 8): The code first "merges" bits 0 − 15 with bits 16 − 31 using a right shift and XOR (line 3). Including line 4 and line 5 reduce the input width is reduced from 32 bits to 8 bits by repeatedly applying shifted XORs. We are only interested in the lower bits. That means, after step 1 the lower 16 bits are of importance, then the lower 8, then the lowest 4. After having applied XOR three times, we delete all higher bits (line 6). We are left with sixteen possible values for x. Their parity is: parity(15) → parity(1111b) → 4 bits set → even parity(14) → parity(1110b) → 3 bits set → odd parity(13) → parity(1101b) → 3 bits set → odd parity(12) → parity(1100b) → 2 bits set → even parity(11) → parity(1011b) → 3 bits set → odd parity(10) → parity(1010b) → 2 bits set → even parity( 9) → parity(1001b) → 2 bits set → even parity( 8) → parity(1000b) → 1 bits set → odd parity( 7) → parity(0111b) → 3 bits set → odd parity( 6) → parity(0110b) → 2 bits set → even parity( 5) → parity(0101b) → 2 bits set → even parity( 4) → parity(0100b) → 1 bits set → odd parity( 3) → parity(0011b) → 2 bits set → even parity( 2) → parity(0010b) → 1 bits set → odd parity( 1) → parity(0001b) → 1 bits set → odd parity( 0) → parity(0000b) → 0 bits set → even If we encode even by 0 and odd by 1 beginning with parity(15) then we get 0110 1001 0110 1001 = 0x6996, which is the magic number found in line 7. The shift moves the relevant bit to bit 0. Then everything except for bit 0 is masked out. In the end, we get 0 for even and 1 for odd, exactly as desird. second algorithm (parity2, line 10 to line 17): This time, parity is computed bottomup instead of topdown. Using a similar XOR technique, line 12 and line 13 compute the parity of each fourbitblock and put it into to lowest bit of each block. All other bits become garbage and need to be cleared (line 14). Multiplying by 0x011111111 has an interesting property if we label the fourbitblock A, B, C, D, E, F, G, H: A, B, C, D, E, F, G, H → A+B+C+D+E+F+G+H, B+C+D+E+F+G+H, C+D+E+F+G+H, D+E+F+G+H, E+F+G+H, F+G+H, G+H, H Obviously the highest fourbitblock is what we are looking for. If the sum of multiple parities is odd, then the overall parity is odd as well (vice versa for even). The right shift followed by AND (line 16) returns the lowest bit which gives us the parity. 
Restrictions 
• designed for 32 bits • parity: insert "x ^= x >> 32;" before line 3 for 64 bits • parity: remove line 3 for 16 bits • parity: remove line 3 and line 4 for 8 bits • parity2: for 64 bits, replace 0x011111111 by 0x01111111111111111 and shift right by 60 instead of 28 
These ads help me to pay my server bills 

Performance 
• constant time, data independent + Intel® Pentium™ D: • parity: ≈ 13 cycles per number • parity2: ≈ 6 cycles per number + Intel® Core™ 2: • parity: ≈ 12 cycles per number • parity2: ≈ 5 cycles per number + Intel® Core™ i7: • parity: ≈ 11.5 cycles per number • parity2: ≈ 5.25 cycles per number CPU cycles (full optimization, lower values are better) 
Assembler Output 
parity:
bits.stephanbrumme.com

Download  The full source code including a test program is available for download. 
References 
http://graphics.stanford.edu/~seander/bithacks.html 
More Twiddled Bits 
... or go to the index 