Double trouble: an interesting puzzle

I’ve written before about some of the strange-looking behavior you can find when working with numbers on a computer. As this article explains at length, that behavior is a completely logical consequence of the need to use a finite set of values to represent infinitely many possible numbers.

Kathy Kam of the Microsoft CLR team recently posted another interesting piece of floating-point weirdness. The question boils down to this: when you run this code,

Console.WriteLine(4.170404 == Convert.ToDouble("4.170404"));

why does it print false instead of the expected true?
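A quick way to see what’s going on is to compare the raw bit patterns of the two values. Here’s a small sketch of my own (the output assumes the .NET behavior described in this post):

    long compiled = BitConverter.DoubleToInt64Bits(4.170404);
    long parsed   = BitConverter.DoubleToInt64Bits(Convert.ToDouble("4.170404"));
    Console.WriteLine("{0:x16}", compiled);   // bit pattern of the compile-time literal
    Console.WriteLine("{0:x16}", parsed);     // bit pattern of the parsed value
    Console.WriteLine(compiled - parsed);     // nonzero: they differ in the very last bit

The rest of this post explains where that one-bit difference comes from.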

The answer turns out to be far from trivial, and comes down to the different ways in which the .NET Framework and the C# compiler convert text strings to numbers.

The C# compiler uses the Windows API function VarR8FromStr to convert the literal value to a double. The documentation isn’t very specific about its inner workings. The C# spec, however, says in section 9.4.4.3 that doubles are rounded using the IEEE “round to nearest” mode.

So what happens here? First, the number is ‘normalized’: it is scaled by a power of two so that it lies between 1 and 2, and the scale factor is moved into the exponent. Next, the number is rounded to double precision, which keeps 52 binary digits after the binary point (53 significant bits, counting the implicit leading 1).
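To make that layout concrete, here is a sketch of my own that pulls apart the stored fields of a double, assuming the standard IEEE 754 layout: 1 sign bit, 11 exponent bits biased by 1023, and 52 fraction bits.

    double d = 4.170404;
    long bits = BitConverter.DoubleToInt64Bits(d);
    int exponent = (int)((bits >> 52) & 0x7FF) - 1023;   // unbiased exponent: 2
    long fraction = bits & 0xFFFFFFFFFFFFFL;             // the 52 stored fraction bits
    // prints the number as 1.<fraction bits> x 2^2, i.e. 1.042601... x 4
    Console.WriteLine("1.{0} x 2^{1}",
        Convert.ToString(fraction, 2).PadLeft(52, '0'), exponent);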

Here, 4.170404 is divided by 4, which gives 1.042601. The binary representation of this number is:

1.0000101011100111111001100010110111000110111000101…

The part that interests us starts at digit #52 after the “binary point”, so let’s show everything after, say, the 40th digit:

4        5          6          7
1234567890 1234567890 1234567890
1110001010 1010000000 0000011001
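You can reproduce these digits exactly with a few lines of code. This sketch of my own uses System.Numerics.BigInteger (which postdates the original post) to compute the binary expansion of the fractional part, 0.042601 = 42601/1000000, by repeated doubling:

    // Print the first n binary digits of the exact fraction num/den.
    // All arithmetic is exact, so no floating-point rounding gets in the way.
    static void PrintFractionBits(BigInteger num, BigInteger den, int n)
    {
        for (int i = 1; i <= n; i++)
        {
            num *= 2;                // shift the binary point one place right
            if (num >= den) { Console.Write('1'); num -= den; }
            else            { Console.Write('0'); }
        }
        Console.WriteLine();
    }

    PrintFractionBits(42601, 1000000, 70);   // digits 1 through 70 of 0.042601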

To round to 52 binary digits, we look at digit 53. It is a 1, and there are more nonzero digits further along, so the value is above the halfway point and we need to round upwards:

4        5
1234567890 12
1110001010 11

This result is used to compose the double precision number.

If we follow the path of the Convert.ToDouble call, we find that it passes its string argument on to Double.Parse. This method prepares an internal structure called a NumberBuffer and eventually calls a function named NumberBufferToDouble internal to the CLR. In the Shared Source CLI implementation (Rotor) code, we find the following comment for the function that does all the hard work:

    The internal integer representation of the float number is
    UINT64 mantisa + INT exponent. The mantisa is kept normalized
    ie with the most significant one being 63-th bit of UINT64.
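In other words, the parser carries the value as mantissa × 2^exponent, with the 64-bit mantissa shifted left until its top bit is set. A minimal sketch of such a normalization step (my illustration, not the actual Rotor code):

    // Shift the mantissa until its most significant 1 occupies bit 63,
    // decrementing the exponent to compensate.
    static void Normalize(ref ulong mantissa, ref int exponent)
    {
        while (mantissa != 0 && (mantissa & 0x8000000000000000UL) == 0)
        {
            mantissa <<= 1;
            exponent--;
        }
    }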

This is good news - extra precision is used to ensure we get the correct result. Looking further, we find that this function uses a helper function appropriately called Mul64Lossy, which multiplies two 64-bit values. In its comments, we find this:

    // it’s ok to losse some precision here - Mul64 will be called
    // at most twice during the conversion, so the error won’t propagate
    // to any of the 53 significant bits of the result
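What does such a lossy multiply look like? Here is a sketch of my own reconstruction (the real Rotor code may differ in detail): split each 64-bit operand into 32-bit halves, compute the top 64 bits of the 128-bit product, and simply drop the low×low partial product along with the carries out of the middle terms:

    static ulong Mul64Lossy(ulong a, ulong b)
    {
        ulong aHi = a >> 32, aLo = a & 0xFFFFFFFFUL;
        ulong bHi = b >> 32, bLo = b & 0xFFFFFFFFUL;
        // full product = (aHi*bHi << 64) + ((aHi*bLo + aLo*bHi) << 32) + aLo*bLo
        // keeping only the top 64 bits and ignoring aLo*bLo and the carries
        // makes the result up to a few units too small in its lowest bit
        return aHi * bHi + ((aHi * bLo) >> 32) + ((aLo * bHi) >> 32);
    }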

This is bad news. If you look at the binary representation of 4.170404/4 above, you’ll see that all digits from the 54th up to the 65th are zero. So it is entirely possible that some precision was lost here, and that the error propagated all the way up to the last significant bit of the final result. The assumption made by the developer of this code is mostly right, but sometimes wrong.

But why risk a loss of precision when it can be avoided? The (misguided) answer is: speed. The Rotor code contains a function, Mul64Precise, which doesn’t suffer from this loss of precision. However, it uses a few extra instructions to do some more shifting and multiplying. The function is only used in debug mode, to verify that some internal conversion tables are correct.
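For comparison, here is a sketch of what the precise variant could look like (again my own reconstruction, not Rotor’s exact code): keep all four partial products and propagate the carries into the top 64 bits.

    static ulong Mul64Precise(ulong a, ulong b)
    {
        ulong aHi = a >> 32, aLo = a & 0xFFFFFFFFUL;
        ulong bHi = b >> 32, bLo = b & 0xFFFFFFFFUL;
        // sum the three contributions to bits 32..63 of the full product...
        ulong mid = (aHi * bLo & 0xFFFFFFFFUL)
                  + (aLo * bHi & 0xFFFFFFFFUL)
                  + (aLo * bLo >> 32);
        // ...and carry them into the top 64 bits
        return aHi * bHi + (aHi * bLo >> 32) + (aLo * bHi >> 32) + (mid >> 32);
    }

The difference is a handful of masks, shifts and additions - hardly a performance disaster.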

In the grand scheme of things, the few extra instructions that would be used to get a correct result have only a very small effect on performance. The Convert.ToDouble method that started it all ends up spending most of its time parsing according to the specified locale, checking for currency symbols, thousands separators, etc. Only a tiny fraction of the time is spent in the Mul64 functions.

Let’s estimate how common this error is. For a conversion error to occur, the 12 bits from the 53rd to the 64th must all be zero; the odds of that are 1 in 2^12, or about 1 in 4000. Also, the rounding must actually be affected, which costs another factor of 2 to 4. So as many as 1 conversion out of every 10,000 or so may suffer from this effect!
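The first factor in that estimate is easy to check empirically. Here is a quick Monte Carlo sketch (a hypothetical harness of my own, using BigInteger for exact arithmetic): draw random six-digit decimal fractions and count how often binary digits 53 through 64 are all zero.

    var rng = new Random();
    int hits = 0, trials = 1000000;
    for (int i = 0; i < trials; i++)
    {
        int n = rng.Next(1, 1000000);      // the fraction n / 10^6
        // digits 1..64 of the binary expansion, computed exactly
        BigInteger first64 = (new BigInteger(n) << 64) / 1000000;
        if ((first64 & 0xFFF) == 0) hits++;   // digits 53..64 all zero?
    }
    // expect roughly trials / 4096 hits, i.e. about 1 in 4000
    Console.WriteLine("{0} hits in {1} trials", hits, trials);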

The moral of the story: be very careful with your assumptions about how errors will propagate. And don’t compromise accuracy for the sake of a few CPU cycles, unless performance is absolutely critical and really is more important than correctness.