Integer Overflows

Wed Apr 11 15:55:39 UTC 2012

A funny tale from PHP development: apparently, preventing integer overflows is hard. So, let's try to write code that allocates a 2D matrix of doubles...

unsigned int rows, cols;
// ...
if(rows < 1 || cols < 1)
        errx(1, "Invalid size");
double *M = calloc(rows * cols * sizeof(double), 1); /* BUG: the product can overflow */

The inherent problem here is that the goal is to detect a miscalculation by doing calculations, and many attempts at this have failed in the past. So, how to best do this?

Standard Method: pre-verification of each step

if(rows < 1 || cols < 1)
        errx(1, "Invalid size");
if(rows > SIZE_MAX / (size_t) cols)
        errx(1, "Integer overflow");
size_t rc = rows * cols;
if(rc > SIZE_MAX / sizeof(double))
        errx(1, "Integer overflow");
size_t n = rc * sizeof(double);
double *M = calloc(n, 1);

What are we doing there? Before each calculation step, we verify that the step will neither overflow nor divide by zero. We use the fact that the integer division operator / rounds downwards, and that an integer division itself cannot overflow. So the following inequalities hold:

rows        <=  SIZE_MAX / (size_t) cols
rows * cols <= (SIZE_MAX / (size_t) cols) * cols
rows * cols <=  SIZE_MAX - {value between 0 and cols-1}
rows * cols <=  SIZE_MAX

This proves that the multiplication step causes no integer overflow.

The problem with this method is that it is tedious and leads to quite complex code.

Interesting Method: pre-verification in a single step

if(rows < 1 || cols < 1)
        errx(1, "Invalid size");
if(SIZE_MAX / (size_t) rows / (size_t) cols / sizeof(double) < 1)
        errx(1, "Integer overflow");
double *M = calloc((size_t) rows * (size_t) cols * sizeof(double), 1);

So, what is this now? Why does this trick guarantee to catch any overflow? Let's work through this inequality, too.

Theorems for integer division "/":

  a >= b   =>   a/c >= b/c

  Proof:

    Write a as (a/c) * c + (a%c) with (a%c) between 0 and c-1.
    Write b as (b/c) * c + (b%c) with (b%c) between 0 and c-1.

    Then it follows:

                      a >= b
      (a/c) * c + (a%c) >= (b/c) * c + (b%c)
    ((a/c) - (b/c)) * c >= (b%c) - (a%c)

    If the theorem were not true, then (a/c) < (b/c).
    Since these are integers, this means (a/c) - (b/c) <= -1. We get:

    -1 * c >= ((a/c) - (b/c)) * c >= (b%c) - (a%c)
        -c >= (b%c) - (a%c)
         c <= (a%c) - (b%c)

    However, this cannot be fulfilled for any (a%c) and (b%c) in the range from
    0 to c-1. q.e.d.

Furthermore:

  a < b*c  =>   a/c < b

  Proof:

    Let's prove the contrapositive, which is equivalent:

    a/c >= b   =>   a >= b*c

    Write a as (a/c) * c + (a%c) with (a%c) between 0 and c-1.

    Then it follows:

                (a/c) >= b
            (a/c) * c >= b*c
    (a/c) * c + (a%c) >= b*c + (a%c)
                    a >= b*c + (a%c)
                    a >= b*c

    q.e.d.

// Let's define:
r := (size_t) rows
c := (size_t) cols
d := sizeof(double)

// Assuming no overflow takes place, then we get using the above mentioned theorems:
SIZE_MAX             >= d*c*r        | / r
SIZE_MAX / r         >= d*c          | / c
SIZE_MAX / r / c     >= d            | / d
SIZE_MAX / r / c / d >= 1

// Assuming an overflow takes place, then we get using the above mentioned theorems:
SIZE_MAX             <  d*c*r        | / r
SIZE_MAX / r         <  d*c          | / c
SIZE_MAX / r / c     <  d            | / d
SIZE_MAX / r / c / d <  1

Therefore, the check above always yields the correct result.

Ridiculous Method: floating point post-verification

What actually happens on overflow? The high bits get cut off. The wrapped value is always less than half the correct value: the correct value exceeds the stored one by at least SIZE_MAX + 1, while the stored value itself is at most SIZE_MAX. So, we can do this:

if(rows < 1 || cols < 1)
        errx(1, "Invalid size");
size_t ni = (size_t) rows * (size_t) cols * sizeof(double);
float nf = (float) rows * (float) cols * sizeof(double);
if(ni < 0.8f * nf)
        errx(1, "Integer overflow");
double *M = calloc(ni, 1);

We know that on overflow, ni will be smaller than 0.5 times the correct value. So, as long as our calculation of nf is accurate enough that 0.8f * nf lies between 0.5 times the correct value and the correct value itself, this actually works!

By standard methods, you find that a single calculation step for float of addition, multiplication, division, or rounding an input introduces a relative error of at most 10^-6, as long as no negative values or denormals are involved anywhere. How many such errors do we need so that 0.8 becomes 1.0 or 0.5?

223143 of them. We'll never hit THAT in a buffer size calculation...


Posted by OpBaI | Permanent link | File under: code, c