Minimum / maximum of integers

Minimum / maximum of integers
Code	01: int select(int x, int y, int ifXisSmaller, int ifYisSmaller) 02: { 03: int diff = x - y; 04: // sets bit31 to 0xFFFFFFFF if x<y, else 0x00000000 05: int bit31 = diff >> 31; 06: 07: // return ifXisSmaller if x is smaller than y, else y 08: return (bit31 & (ifXisSmaller ^ ifYisSmaller)) ^ ifYisSmaller; 09: } 10: 11: int minimum(int x, int y) 12: { 13: // if x<y then return x, else y 14: return select(x,y,x,y); 15: } 16: 17: int maximum(int x, int y) 18: { 19: // if x<y then return y, else x 20: return select(x,y,y,x); 21: } 22: 23: int minimumSimple(int x, int y) 24: { 25: if (x < y) 26: return x; 27: else 28: return y; 29: } bits.stephan-brumme.com

Explanation	A universal helper function select (line 1) sets bit31 (line 5) to either 0x00000000 or 0xFFFFFFFF. Then a special property of XOR is used: a XOR b XOR b = a; In plain English, applying XOR twice with the same number eliminates this number. If bit31 is 0x00000000, line 8 becomes (0x00000000 AND (ifXisSmaller XOR ifYisSmaller)) XOR ifYisSmaller and since (zero AND something) = zero, only zero XOR ifYisSmaller is left which is ifYisSmaller. if bit31 is 0xFFFFFFFF, line 8 becomes (0xFFFFFFFF AND (ifXisSmaller XOR ifYisSmaller)) XOR ifYisSmaller and since (allBitsSet AND something) = something, (ifXisSmaller XOR ifYisSmaller) XOR ifYisSmaller is left, which is ifXisSmaller as explained earlier. The select routine can be used to implement a branch-free version of min (line 11) and max (line 17). Even though more instructions are needed than if written using a simple "if", modern processor are much faster because the presented code is branch-free and avoids mis-predictions and pipeline stalls.

Restrictions	• designed for 32 bits • behavior of right shift is not clearly defined, may not work on certain platforms / compilers

These ads help me to pay my server bills

Performance	• constant time, data independent + Intel® Pentium™ D: • maximum: ≈ 3 cycles per number • minimum: ≈ 3 cycles per number • minimumSimple: ≈ 3.5 cycles per number + Intel® Core™ 2: • maximum: ≈ 2.5 cycles per number • minimum: ≈ 2.5 cycles per number • minimumSimple: ≈ 2.5 cycles per number + Intel® Core™ i7: • maximum: ≈ 3 cycles per number • minimum: ≈ 3 cycles per number • minimumSimple: ≈ 2.25 cycles per number CPU cycles (full optimization, lower values are better)

Assembler Output	maximum: 01: ; diff = x - y 02: mov eax, ecx 03: sub eax, edx 04: ; bit31 = diff >> 31 05: sar eax, 31 06: ; return (bit31 & (y ^ x)) ^ x 07: xor ecx, edx 08: and eax, ecx 09: xor eax, ecx minimum: 01: ; diff = x - y 02: mov eax, ecx 03: sub eax, edx 04: ; bit31 = diff >> 31 05: sar eax, 31 06: ; return (bit31 & (x ^ y)) ^ y 07: xor ecx, edx 08: and eax, ecx 09: xor eax, edx minimumSimple: 01: cmp ecx, eax 02: jge $done 03: mov eax, ecx 04: $done: bits.stephan-brumme.com

Download	The full source code including a test program is available for download.

References	unknown

More Twiddled Bits	Absolute value of a float Absolute value of an integer Approximative inverse of a float Bit manipulation basics Bit mask of lowest bit not set Count bits set in parallel a.k.a. Population Count Detects zero bytes inside a 32 bit integer Endianess Extend bit width Float inverse square root approximation Float square root approximation Is power of two Minimum / maximum of integers Parity Position of lowest bit set Round up to the next power of two Sign comparison Sign of a 32 bit integer Swap two values Extra: Javascript bit manipulator ... or go to the index