Sign comparison

Code
    bool sameSign(int x, int y)
    {
      return (x ^ y) >= 0;
    }

    bool sameSignFast32(int x, int y)
    {
      // XOR both numbers
      int temp = x ^ y;
      // shift sign bit to the lowest bit position and clear the rest
      temp >>= 31; // 64 bit: use 63 instead of 31
      temp &= 1;
      // negate
      return !temp;
    }

    bool sameSignFloat(float a, float b)
    {
      // reinterpret the float bits as integers (assumes 32-bit int and float)
      int* x = (int*)&a;
      int* y = (int*)&b;
      return sameSignFast32(*x, *y);
    }

    template <typename TypeX, typename TypeY>
    bool sameSignSimple(TypeX x, TypeY y)
    {
      return (x >= 0 && y >= 0) || (x < 0 && y < 0);
    }
bits.stephan-brumme.com

Explanation	The highest bit of a signed integer is called the sign bit. It is 1 for all negative values and 0 for zero and all positive values. For two values x and y, signX XOR signY is 1 only if their signs differ. If XOR is applied to all bits of these values (not just the sign bits), it is enough to check the sign of the result: it is non-negative only if both values have the same sign. The obvious approach of sameSign suffers from the high latency of the setge assembler instruction. sameSignFast32 instead shifts the sign bit to the lowest bit and clears all other bits. The resulting value (0 for same sign, 1 for different sign) must be negated before converting to a bool. The performance figures for float values are identical to those for integers if the floats are stored in main memory.
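The XOR argument can be traced with a small sketch. The helper names signBit, signsDifferBitwise and signsDifferFullXor are hypothetical and not part of the original source; the sketch assumes 32-bit two's complement integers.

```cpp
#include <cstdint>

// Hypothetical helper: extract the sign bit (bit 31) of a 32-bit integer.
inline std::uint32_t signBit(std::int32_t v)
{
  return static_cast<std::uint32_t>(v) >> 31;
}

// Compare only the two sign bits: their XOR is 1 exactly if the signs differ.
inline bool signsDifferBitwise(std::int32_t x, std::int32_t y)
{
  return (signBit(x) ^ signBit(y)) == 1u;
}

// XOR all bits at once: the sign bit of the result carries the same
// information, so the result is negative exactly if the signs differ.
inline bool signsDifferFullXor(std::int32_t x, std::int32_t y)
{
  return (x ^ y) < 0;
}
```

For example, 5 ^ -7 is -4 (signs differ), while -5 ^ -7 is 2 (same sign). Zero counts as positive here, so 0 and -1 are reported as having different signs.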

Restrictions
• sameSignFast32 is designed for 32 bits but can easily be modified for 64 bits
• sameSignFloat may not work with infinity and/or denormalized numbers
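Both restrictions can be illustrated with a minimal sketch. The names sameSignFast64 and sameSignFloatBits are hypothetical and not part of the original source. The sketch uses an unsigned shift instead of the signed shift plus mask of sameSignFast32, copies the float bits with memcpy rather than casting pointers, and assumes 32-bit IEEE 754 floats.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 64-bit counterpart of sameSignFast32: shift by 63 instead of 31.
inline bool sameSignFast64(std::int64_t x, std::int64_t y)
{
  // an unsigned shift moves the sign bit to bit 0 and clears the rest in one step
  std::uint64_t temp = static_cast<std::uint64_t>(x ^ y) >> 63;
  // 0 means same sign, 1 means different sign
  return temp == 0;
}

// Hypothetical float variant: compares only the sign bits of the raw encoding.
inline bool sameSignFloatBits(float a, float b)
{
  static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
  std::uint32_t x, y;
  std::memcpy(&x, &a, sizeof x); // copy the bit pattern, no pointer cast
  std::memcpy(&y, &b, sizeof y);
  return ((x ^ y) >> 31) == 0;
}
```

Because only the sign bit is inspected, a special value such as -0.0f is treated as negative under this assumption: sameSignFloatBits(0.0f, -0.0f) returns false even though both compare equal as floats.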


Performance
• constant execution time because branch free
+ Intel® Pentium™ D:
  • sameSignSimple: ≈ 5 cycles per comparison
  • sameSign: ≈ 10 cycles per comparison
  • sameSignFast32: ≈ 4 cycles per comparison
+ Intel® Core™ 2:
  • sameSignSimple: ≈ 10 cycles per comparison
  • sameSign: ≈ 3.75 cycles per comparison
  • sameSignFast32: ≈ 4 cycles per comparison
+ Intel® Core™ i7:
  • sameSignSimple: ≈ 7 cycles per comparison
  • sameSign: ≈ 4 cycles per comparison
  • sameSignFast32: ≈ 4 cycles per comparison
CPU cycles (full optimization, lower values are better)
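Figures like these depend on CPU and compiler, so it can be worth measuring on your own machine. The following is only a rough sketch of such a measurement, not the author's test program: timeComparisons, makeTestValues, sameSignXor and sameSignBranchy are hypothetical names, and it reports wall-clock nanoseconds per call rather than CPU cycles.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Local copies of two variants, renamed to keep the sketch self-contained.
inline bool sameSignXor(int x, int y) { return (x ^ y) >= 0; }
inline bool sameSignBranchy(int x, int y) { return (x >= 0 && y >= 0) || (x < 0 && y < 0); }

struct Timing
{
  double nsPerCall;        // average wall-clock time per comparison
  std::uint64_t trueCount; // number of "same sign" results, also keeps the loop alive
};

// Time f on all neighbouring pairs of values.
template <typename F>
Timing timeComparisons(F f, const std::vector<int>& values)
{
  std::uint64_t trueCount = 0;
  auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i + 1 < values.size(); ++i)
    trueCount += f(values[i], values[i + 1]) ? 1 : 0;
  auto stop = std::chrono::steady_clock::now();
  double ns = std::chrono::duration<double, std::nano>(stop - start).count();
  std::size_t calls = values.size() > 1 ? values.size() - 1 : 0;
  return { calls ? ns / calls : 0.0, trueCount };
}

// Hypothetical input generator: a simple LCG producing mixed-sign values.
inline std::vector<int> makeTestValues(std::size_t count)
{
  std::vector<int> values;
  values.reserve(count);
  std::uint32_t state = 12345u;
  for (std::size_t i = 0; i < count; ++i)
  {
    state = state * 1664525u + 1013904223u;
    // map the upper 31 bits to the range [-2^30, 2^30 - 1]
    values.push_back(static_cast<int>(state >> 1) - (1 << 30));
  }
  return values;
}
```

Calling timeComparisons(sameSignXor, values) and timeComparisons(sameSignBranchy, values) on the same input should yield identical trueCount values; only nsPerCall is expected to differ. A serious measurement would need many repetitions and a much larger input.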

Assembler Output
sameSign:
    ; set sign flag
    xor eax, ecx
    ; clear eax (mov leaves the flags untouched)
    mov eax, 0
    ; al = 1 if the sign flag is clear
    setge al

sameSignFast32:
    ; x = ~x
    not eax
    ; x ^= y
    xor eax, ecx
    ; x >>= 31
    sar eax, 31
    ; x &= 1
    and eax, 1

sameSignSimple:
    ; x >= 0
    cmp ecx, 0
    jl $negative
    ; y >= 0
    test eax, eax
    jge $same
    $different:
    ; return false
    xor eax, eax
    jmp $finish
    $negative:
    ; y < 0 ? (x is < 0)
    test eax, eax
    jge $different
    $same:
    ; return true
    mov eax, 1
    $finish:
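In the sameSignFast32 listing the final negation does not appear as a separate instruction. The compiler apparently folded it into the leading not, since complementing one input flips every bit of the XOR result, including the sign bit. The sketch below checks that equivalence; viaNegation and viaComplement are hypothetical names, and unsigned values are used so that the shift is well defined (the and with 1 makes the result the same as with sar).

```cpp
#include <cstdint>

// What the C++ source computes: !(((x ^ y) >> 31) & 1)
inline bool viaNegation(std::uint32_t x, std::uint32_t y)
{
  return !(((x ^ y) >> 31) & 1u);
}

// What the listed instructions compute: ((~x ^ y) >> 31) & 1
inline bool viaComplement(std::uint32_t x, std::uint32_t y)
{
  return (((~x) ^ y) >> 31) & 1u;
}
```

Both functions return true exactly if bit 31 of x and y is equal.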

Download	The full source code including a test program is available for download.

References	multiple sources, author unknown
