Approximative inverse of a float

Approximative inverse of a float
Code	`01: float inverse(float x) 02: { 03: // re-interpret as a 32 bit integer 04: unsigned int i = (unsigned int)&x; 05: 06: // adjust exponent 07: i = 0x7F000000 - i; 08: //i = 0x7EEEEEEE - i; 09: 10: return x; 11: }` bits.stephan-brumme.com

Explanation	The standard IEEE-754 defines the binary layout of floating point numbers. All major CPUs (or better to say: FPUs) follow this standard. The data type "float" is a 32 bit wide floating point number consisting of a 23 bit mantissa, an 8 bit exponent and a sign bit. Because the exponent can be negative, too, it is implemented using a bias of 2⁸-1 = 127. That means, 127 is added to the actual exponent to keep it always positive. Written in a formula: x = sign * mantissa * 2^exponent . Note: To save one bit, the most significant bit of the mantissa is omitted because it is by definition always set to one. Note, too: Binary exponents below −127 or above +127 cannot be represented by "float" (≈ ±10³⁸ decimal). The inverse can be written as: inverse(a) → a^-1 or more specific: inverse(a^b) → a^-b . If speed is more important than precision, we can approximate x^-1 by computing sign * mantissa * 2^-exponent . The only difference to the original formula is the minus in front of exponent. Because the mantissa remains unadjusted, the result will slightly deviate from the true inverse value. In the end, a simple subtraction of the exponent bits, often performed in merely one CPU cycle, does the trick (line 7): exponent → 127 − exponent The constant 0x7F000000 represents 127 shifted left by 23 bits. A brute force search revealed that a few constants expose a better overall accuracy and cut down the error to about 4%. So far, 0x7EEEEEEE (line 8) seems to be the best candidate. My brute force search didn't analyze all possible numbers, that means, there might be even better constants. But you should keep in mind that only for 0x7F000000 we get inverse(1) = 1, which might be important for many algorithms. No other constant can produce the same result for inverse(1). We have to switch from IEEE-754 to integer arithmetic (line 4), of course, and back (line 10).

Restrictions	• designed for 32 bit IEEE floats • approximation, average deviation of ≈ 8%

These ads help me to pay my server bills

Performance	• constant time, data independent + Intel® Pentium™ D: • inverse/CPU: ≈ 1 cycle per number • inverse/FPU: ≈ 42 cycles per number + Intel® Core™ 2: • inverse/CPU: ≈ 0.75 cycle per number • inverse/FPU: ≈ 34 cycles per number + Intel® Core™ i7: • inverse/CPU: ≈ 0.75 cycle per number • inverse/FPU: ≈ 22 cycles per number CPU cycles (full optimization, lower values are better)

Assembler Output	`inverse: 01: mov eax, 7F000000H 02: sub eax, edx inverseFpu: 01: fld1 02: fdivrp ST(1), ST(0)` bits.stephan-brumme.com

Download	The full source code including a test program is available for download.

References	unknown

More Twiddled Bits	Absolute value of a float Absolute value of an integer Approximative inverse of a float Bit manipulation basics Bit mask of lowest bit not set Count bits set in parallel a.k.a. Population Count Detects zero bytes inside a 32 bit integer Endianess Extend bit width Float inverse square root approximation Float square root approximation Is power of two Minimum / maximum of integers Parity Position of lowest bit set Round up to the next power of two Sign comparison Sign of a 32 bit integer Swap two values Extra: Javascript bit manipulator ... or go to the index