# Why am I surprised that people found this surprising?

Must be something in the water. I dunno.

I reported on a simple program written to demonstrate roundoff error accumulation for a class I taught (the day job does teach parallel programming with OpenMP/MPI/… as we are contracted to), and why people should pay very careful attention to it when using floating point arithmetic.

Before I go into this, it is worth stealing some text from one of my slides on this. Computers know integers. They do not know real numbers. They know floating point numbers, which are approximations to real numbers. This distinction is important, and if you don’t pay attention to it, it will bite you. Hard.

That said, it is time to go down this short rabbit hole. It seems to be surprising for a fair chunk of programmers, so it is worth a read just to see it if you haven’t. This is by the way, one of a group of potential pitfalls to be avoided in parallel programming, or any programming system that allows you to re-order the way you perform mathematical operations.

Lets start out with a very simple C program

```
#include
int main()
{
float x=0.0, y=0.0, one=1.0;
int i,N;

N = 100000000;

for(i=1;i< =N;i++)
{
x += one /(float)i;
}

for(i=N;i>=1;i--)
{
y += one /(float)i;
}

printf("[index increasing] sum = %10.7f\n",x);
printf("[index decreasing] sum = %10.7f\n",y);
}

```

Notice the two inner loops. The C syntax for the first one reads approximately

start at i=1, continue to loop while i is less than or equal to N, and increment i on each loop.

The second loop is the same as the first, though we reverse the index (sum it in the opposite order). Same series.

(Yes, it is a divergent series, but for these purposes, it should not matter as we are taking the same group of terms, and summing them … the grouping isn’t changing, but the order of summation is).

Now compile and run it with your favorite compiler.

```\$ gcc sum_bad.c -o sum_bad.exe
[index increasing] sum = 15.4036827
[index decreasing] sum = 18.8079185
```

Neat, huh.

This is roundoff error. Remember, floating point numbers are not real numbers. You can infinitely subdivide the real number line. It has “infinite” resolution. You cannot do that with floating point numbers. They have finite resolution.

All we did is change the order of summation. Which, when you write parallel code, you are effectively doing. Any time you change the order, you will impact roundoff error.

I have been showing this to researchers for years, and most have not seen/heard of it. Many people think that the numbers out of computers are exact (they aren’t). Many aren’t aware of the side effects of the design decisions that go into machines. Designs are compromises. It helps to have a really good sense of what is being compromized so you understand what you are dealing with.

These are not corner cases. These are cases people deal with on a daily basis.

Sure, some people will note that this is single precision calculation. What if you change it to double precision? The answers will be identical then … right? Just use double precision?

```
#include

int main()
{
double x=0.0, y=0.0, one=1.0;
int i,N;

N = 100000000;

for(i=1;i< =N;i++)
{
x += one /(double)i;
}

for(i=N;i>=1;i--)
{
y += one /(double)i;
}

printf("[index increasing] sum = %20.17f\n",x);
printf("[index decreasing] sum = %20.17f\n",y);
}

```

and

```\$ gcc sum_bad2.c -o sum_bad2.exe
[index increasing] sum = 18.99789641385255479
[index decreasing] sum = 18.99789641385344652
```

You still see the roundoff error, though it is much less of a factor than before. More precision does help, but it also helps to know something about why it helps.

This is not saying run everything double precision, rather, you need to pay attention to your algorithms, all of your algorithms, any time you change order of operations. The results can, and will surprise you.

Viewed 16464 times by 3751 viewers

## 5 thoughts on “Why am I surprised that people found this surprising?”

1. nice example. We study to expect and detect rounding off errors but nothing is as clear (and in your face) as this one.

Thank You! I am sure I will be using this in my slides soon. And I will refer to this wonderful blog.

2. What’s more, once you go parallel, even the elegant Kahan summation algorithm can’t easily help you– it has a loop-carried dependency! Bummer.

3. There are things you can do … we have known how to manipulate quantities with error for quite some time. Its just somewhat expensive to do it. No, not interval arithmetic, but actual real error analysis. Hamming and Hildebrandt’s books on Numerical Analysis (showing how to build algorithms that do a good job of not messing up) are good starting points. Bug me if you want full title/ISBN.

Unfortunately this stuff manifests itself in all manner of places. Anywhere you have something that looks like a Greens function (1/|r_i – r_j|) you can get some significant magnification of error just by having the denominator close to “0”. This does occur in molecular dynamics and some electrodynamics apps.

4. Anonymous says:

Is one of the two methods better than the other? Obviously using a regular float for something of that precision is wrong either way, but what about for double?

5. Generally yes. You want to pay careful attention to the relative sizes of the items you are combining. You can get significant error magnification via catastrophic cancellation when subtracting quantities of similar magnitude, or when multiplying quantities of very different magnitude. Summations should be constructed carefully, adding items that differ greatly in size will tend to swamp the error in one of them.

As for whether or not using a regular float is “wrong”, this is hard to accurately conclude, as we can construct similar examples in double precision which show similar catastrophic loss of accuracy. The issue is in how the tool (single precision in this case) is used more than anything else. For whatever it is worth, many simulations for engineering calculations are run in single precision due to performance issues … though some with known instabilities (various non-linear CFD) often resort to double precision.

Even there, one must be quite careful in their numerical algorithm construction. Double precision doesn’t solve everything, though it can help to “mask” problems for many users.

It’s not as cut and dry as one might think. If in doubt, double precision is good. Single precision could provide more speedup due to the native performance of SSE type systems, and that an SSE register can fit twice as many single precision quantities as double precision.