|
Absolute value of an integer |
|
Code | |
Explanation |
All major processors represent negative numbers using the two's-complement which is defined as: for x ≥ 0 → x for x < 0 → NOT(x) + 1 On the lowest level, computers provide logical bit shifts and arithmetic bit shifts. Both shifts differ in handling how to fill the empty bits on the left side. Logical shifts insert zeros while arithmetic shifts replicate the formerly highest bit. Whereas signed integers are arithmetically shifted in C, unsigned integers are logically shifted. In our case x is shifted arithmetically 31 times to the right which basically erases its value and spreads the highest bit. That means, line 3 evaluates either to 0x00000000 (→ 0) or 0xFFFFFFFF (→ −1). Note: 32 bit systems require a shift by 31, 64 bit systems require a shift by 63 accordingly. Consequently, line 4 turns out to be (x XOR 0) − 0 for positive values of x (including x=0). x XOR 0 is still x and x − 0 remains x, too. So for positive x we get x ≥ 0 → x. We saw that for negative values of x, bit31 is set to 0xFFFFFFFF. Line 4 is then (x XOR 0xFFFFFFFF) − 0xFFFFFFFF. The bracketed XOR is equivalent to NOT(x) and the constant −0xFFFFFFFF turns out to be −(-1) = +1. In the end, the whole term is NOT(x) + 1, exactly what we wanted: for x < 0 → NOT(x) + 1 Note: Current C++ compilers (Microsoft, GCC and Intel) implemented a special assembler code sequence for abs which runs faster than the shown algorithm on x86 CPUs (see below for its source code). |
Restrictions |
• designed for 32 bits (simple modification for 64 bits possible) |
These ads help me to pay my server bills |
|
Performance |
• constant execution time because branch free + Intel® Pentium™ D: • myAbs: ≈ 2.25 cycles per number • abs: ≈ 1.75 cycles per number + Intel® Core™ 2: • myAbs: ≈ 2 cycles per number • abs: ≈ 1.5 cycles per number + Intel® Core™ i7: • myAbs: ≈ 1.75 cycles per number • abs: ≈ 2 cycles per number CPU cycles (full optimization, lower values are better) |
Assembler Output | |
Download | The full source code including a test program is available for download. |
References |
multiple sources, author unknown |
More Twiddled Bits |
... or go to the index |