Sign comparison

Code
01: bool sameSign(int x, int y)
02: {
03:   return (x^y) >= 0;
04: }
05:
06: bool sameSignFast32(int x, int y)
07: {
08:   // XOR both numbers
09:   int temp = x^y;
10:   // shift sign bit to the lowest bit position and clear the rest
11:   temp >>= 31; // 64 bit: use 63 instead of 31
12:   temp  &=  1;
13:   // negate
14:   return !temp;
15: }
16:
17: bool sameSignFloat(float a, float b)
18: {
19:   int* x = (int*)&a;
20:   int* y = (int*)&b;
21:   return sameSignFast32(*x,*y);
22: }
23:
24: template <typename TypeX, typename TypeY>
25: bool sameSignSimple(TypeX x, TypeY y)
26: {
27:   return (x >= 0 && y >= 0) || (x < 0 && y < 0);
28: }
bits.stephan-brumme.com

Explanation The highest bit of an integer is called the sign bit.
It is set to 1 for all negative values or 0 for zero and all positive values.

For two values x and y, the following holds true: signX XOR signY is only 1 if their sign differs.
If XOR is applied to all bits of these values (not just the sign bits), we can check the sign of the result:
It is only positive if both have the same sign.

The obvious approach of sameSign suffers from the high latency of teh setge assembler instruction.
sameSignFast32 shifts the sign bit to the lowest bits and clears all other bits.
The resulting value (0 for same sign, 1 for different sign) must be negated before converting to a bool.

The performance figures for float values are identical to integers' if these floats are stored in main memory.

Restrictions sameSignFast32 designed for 32 bits, can be easily modified for 64 bits
sameSignFloat may not work with infinity and/or degenerated numbers

These ads
help me to pay
my server bills

Performance • constant execution time because branch free

+ Intel® Pentium™ D:
sameSignSimple: ≈ 5 cycles per comparison
sameSign: ≈ 10 cycles per comparison
sameSignFast32: ≈ 4 cycles per comparison

+ Intel® Core™ 2:
sameSignSimple: ≈ 10 cycles per comparison
sameSign: ≈ 3.75 cycles per comparison
sameSignFast32: ≈ 4 cycles per comparison

+ Intel® Core™ i7:
sameSignSimple: ≈ 7 cycles per comparison
sameSign: ≈ 4 cycles per comparison
sameSignFast32: ≈ 4 cycles per comparison
Performance Chart
CPU cycles
(full optimization,
lower values are better)

Assembler Output
sameSign:
01:   ; set sign flag
02:   xor eax, ecx
03:   ; clear eax
04:   mov eax, 0
05:   ; al = sign flag
06:   setge al

sameSignFast32:
01:   ; x = !x
02:   not eax
03:   ; x ^= y
04:   xor eax, ecx
05:   ; x >>= 31
06:   sar eax, 31
07:   ; x &= 1
08:   and eax, 1

sameSignSimple:
01:   ; x >= 0
02:   cmp ecx, 0
03:   jl $negative
04:   ; y >= 0
05:   test eax, eax
06:   jge $same
07:   $different:
08:   ; return false
09:   xor eax, eax
10:   jmp $finish
11:   $negative:
12:   ; y < 0 ? (x is < 0)
13:   test eax, eax
14:   jge $different
15:   $same:
16:   ; return true
17:   mov eax, 1
18:   $finish:
bits.stephan-brumme.com

Download The full source code including a test program is available for download.

References multiple sources, author unknown

More Twiddled Bits
  1. Absolute value of a float
  2. Absolute value of an integer
  3. Approximative inverse of a float
  4. Bit manipulation basics
  5. Bit mask of lowest bit not set
  6. Count bits set in parallel a.k.a. Population Count
  7. Detects zero bytes inside a 32 bit integer
  8. Endianess
  9. Extend bit width
  10. Float inverse square root approximation
  11. Float square root approximation
  12. Is power of two
  13. Minimum / maximum of integers
  14. Parity
  15. Position of lowest bit set
  16. Round up to the next power of two
  17. Sign comparison
  18. Sign of a 32 bit integer
  19. Swap two values
Extra: Javascript bit manipulator

... or go to the index