# Sign comparison

Code
``` 01: bool sameSign(int x, int y) 02: { 03:   return (x^y) >= 0; 04: } 05: 06: bool sameSignFast32(int x, int y) 07: { 08:   // XOR both numbers 09:   int temp = x^y; 10:   // shift sign bit to the lowest bit position and clear the rest 11:   temp >>= 31; // 64 bit: use 63 instead of 31 12:   temp  &=  1; 13:   // negate 14:   return !temp; 15: } 16: 17: bool sameSignFloat(float a, float b) 18: { 19:   int* x = (int*)&a; 20:   int* y = (int*)&b; 21:   return sameSignFast32(*x,*y); 22: } 23: 24: template <typename TypeX, typename TypeY> 25: bool sameSignSimple(TypeX x, TypeY y) 26: { 27:   return (x >= 0 && y >= 0) || (x < 0 && y < 0); 28: } ``` bits.stephan-brumme.com

Explanation The highest bit of an integer is called the sign bit.
It is set to 1 for all negative values or 0 for zero and all positive values.

For two values x and y, the following holds true: signX XOR signY is only 1 if their sign differs.
If XOR is applied to all bits of these values (not just the sign bits), we can check the sign of the result:
It is only positive if both have the same sign.

The obvious approach of sameSign suffers from the high latency of teh setge assembler instruction.
sameSignFast32 shifts the sign bit to the lowest bits and clears all other bits.
The resulting value (0 for same sign, 1 for different sign) must be negated before converting to a bool.

The performance figures for float values are identical to integers' if these floats are stored in main memory.

Restrictions sameSignFast32 designed for 32 bits, can be easily modified for 64 bits
sameSignFloat may not work with infinity and/or degenerated numbers

help me to pay
my server bills

Performance • constant execution time because branch free

+ Intel® Pentium™ D:
sameSignSimple: ≈ 5 cycles per comparison
sameSign: ≈ 10 cycles per comparison
sameSignFast32: ≈ 4 cycles per comparison

+ Intel® Core™ 2:
sameSignSimple: ≈ 10 cycles per comparison
sameSign: ≈ 3.75 cycles per comparison
sameSignFast32: ≈ 4 cycles per comparison

+ Intel® Core™ i7:
sameSignSimple: ≈ 7 cycles per comparison
sameSign: ≈ 4 cycles per comparison
sameSignFast32: ≈ 4 cycles per comparison

CPU cycles
(full optimization,
lower values are better)

Assembler Output
``` sameSign: compiler: Visual C++ 2008 in: eax ← x, ecx ← y out: eax → result 01:   ; set sign flag 02:   xor eax, ecx 03:   ; clear eax 04:   mov eax, 0 05:   ; al = sign flag 06:   setge al sameSignFast32: compiler: Visual C++ 2008 in: eax ← x, ecx ← y out: eax → result 01:   ; x = !x 02:   not eax 03:   ; x ^= y 04:   xor eax, ecx 05:   ; x >>= 31 06:   sar eax, 31 07:   ; x &= 1 08:   and eax, 1 sameSignSimple: compiler: Visual C++ 2008 in: ecx ← x, eax ← y out: eax → result 01:   ; x >= 0 02:   cmp ecx, 0 03:   jl \$negative 04:   ; y >= 0 05:   test eax, eax 06:   jge \$same 07:   \$different: 08:   ; return false 09:   xor eax, eax 10:   jmp \$finish 11:   \$negative: 12:   ; y < 0 ? (x is < 0) 13:   test eax, eax 14:   jge \$different 15:   \$same: 16:   ; return true 17:   mov eax, 1 18:   \$finish: ``` bits.stephan-brumme.com