Re: [RFC] Fix div/mul crash and more undefined behavior

Lists: pgsql-hackers
From: Xi Wang <xi(dot)wang(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: [RFC] Fix div/mul crash and more undefined behavior
Date: 2012-11-18 23:24:23
Message-ID: 50A96E27.6000404@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

The INT_MIN / -1 crash problem was partially addressed in 2006 and
commit 9fc6f4e1ae107f44807c4906105e1f7eb052ecb1.

http://archives.postgresql.org/pgsql-patches/2006-06/msg00102.php

However, the fix is incomplete and incorrect for some cases.

64-bit crash
============

Below is an example that crashes PostgreSQL on Windows, using the
64-bit binary from http://www.postgresql.org/download/windows/.

SELECT ((-9223372036854775808)::int8) % (-1);

Note that the previous discussion concluded that int8 (64-bit)
division didn't crash, which is incorrect.

http://archives.postgresql.org/pgsql-patches/2006-06/msg00104.php

My guess is that the previous test was carried out on a 32-bit
Windows, where there's no 64-bit division instruction. In that
case, the generated code calls a runtime function (e.g., lldiv),
which doesn't trap. However, on a 64-bit system, the compiler
generates the idivq instruction, which leads to a crash.

Note that the problem is not limited to division. The following
multiplication crashes as well:

SELECT ((-9223372036854775808)::int8) * ((-1)::int8);

The reason is that the multiplication overflow check uses a division.

32-bit crash
============

The previous fix doesn't fix all possible crashes, even on a 32-bit
Windows. Below is an example to crash a 32-bit PostgreSQL:

SELECT ((-2147483648)::int4) % ((-1)::int2);

Portability
===========

The previous fix uses #ifdef WIN32 ... #endif, which is not portable,
as noted below:

http://archives.postgresql.org/pgsql-patches/2006-06/msg00103.php

Using a SIGFPE handler is also dangerous (e.g., causing an infinite
loop sometimes):

https://www.securecoding.cert.org/confluence/display/seccode/SIG35-C.+Do+not+return+from+SIGSEGV,+SIGILL,+or+SIGFPE+signal+handlers

Strictly speaking, using postcondition checking to detect signed
division overflows (actually, all signed integer overflows) violates
the C language standard, because signed integer overflow is undefined
behavior in C and we cannot first compute the result and then check
for overflow. Compilers can do a lot of funny and crazy things
(e.g., generating traps or even optimizing away overflow checks).

BTW, PostgreSQL currently uses gcc's workaround option -fwrapv to
disable offending optimizations based on integer overflows. The
problems are: (1) it doesn't always work (e.g., for division), and
(2) many other C compilers do not even support this workaround
option and can perform offending optimizations.

Patching
========

Below is a patch that fixes division crashes. It removes #ifdef WIN32
and tries to use portable checks. I'll send more (e.g., for fixing
multiplication crashes) if this looks good.

I understand that the existing integer overflow checks tried to
avoid dependency on constants like INT64_MIN. But I'm not sure how
to perform simpler and portable overflow checks without using such
constants.

Also, I could use the SHRT_MIN rather than introducing INT16_MIN; I just
feel like using INT16_MIN with int16 is clearer and less confusing.

diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 9f752ed..d7867cb 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -732,30 +732,18 @@ int4div(PG_FUNCTION_ARGS)
PG_RETURN_NULL();
}

-#ifdef WIN32
-
/*
- * Win32 doesn't throw a catchable exception for SELECT -2147483648 /
- * (-1); -- INT_MIN
+ * Overflow check. The only possible overflow case is for arg1 =
+ * INT32_MIN, arg2 = -1, where the correct result is -INT32_MIN, which
+ * can't be represented on a two's-complement machine.
*/
- if (arg2 == -1 && arg1 == INT_MIN)
+ if (arg1 == INT32_MIN && arg2 == -1)
ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
errmsg("integer out of range")));
-#endif

result = arg1 / arg2;

- /*
- * Overflow check. The only possible overflow case is for arg1 = INT_MIN,
- * arg2 = -1, where the correct result is -INT_MIN, which can't be
- * represented on a two's-complement machine. Most machines produce
- * INT_MIN but it seems some produce zero.
- */
- if (arg2 == -1 && arg1 < 0 && result <= 0)
- ereport(ERROR,
- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("integer out of range")));
PG_RETURN_INT32(result);
}

@@ -877,18 +865,17 @@ int2div(PG_FUNCTION_ARGS)
PG_RETURN_NULL();
}

- result = arg1 / arg2;
-
- /*
- * Overflow check. The only possible overflow case is for arg1 =
- * SHRT_MIN, arg2 = -1, where the correct result is -SHRT_MIN, which can't
- * be represented on a two's-complement machine. Most machines produce
- * SHRT_MIN but it seems some produce zero.
+ /* Overflow check. The only possible overflow case is for arg1 =
+ * INT16_MIN, arg2 = -1, where the correct result is -INT16_MIN, which
+ * can't be represented on a two's-complement machine.
*/
- if (arg2 == -1 && arg1 < 0 && result <= 0)
+ if (arg1 == INT16_MIN && arg2 == -1)
ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
errmsg("smallint out of range")));
+
+ result = arg1 / arg2;
+
PG_RETURN_INT16(result);
}

@@ -1065,18 +1052,18 @@ int42div(PG_FUNCTION_ARGS)
PG_RETURN_NULL();
}

- result = arg1 / arg2;
-
/*
- * Overflow check. The only possible overflow case is for arg1 = INT_MIN,
- * arg2 = -1, where the correct result is -INT_MIN, which can't be
- * represented on a two's-complement machine. Most machines produce
- * INT_MIN but it seems some produce zero.
+ * Overflow check. The only possible overflow case is for arg1 =
+ * INT32_MIN, arg2 = -1, where the correct result is -INT32_MIN, which
+ * can't be represented on a two's-complement machine.
*/
- if (arg2 == -1 && arg1 < 0 && result <= 0)
+ if (arg1 == INT32_MIN && arg2 == -1)
ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
errmsg("integer out of range")));
+
+ result = arg1 / arg2;
+
PG_RETURN_INT32(result);
}

diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 59c110b..83531ad 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -598,18 +598,18 @@ int8div(PG_FUNCTION_ARGS)
PG_RETURN_NULL();
}

- result = arg1 / arg2;
-
/*
* Overflow check. The only possible overflow case is for arg1 =
* INT64_MIN, arg2 = -1, where the correct result is -INT64_MIN, which
- * can't be represented on a two's-complement machine. Most machines
- * produce INT64_MIN but it seems some produce zero.
+ * can't be represented on a two's-complement machine.
*/
- if (arg2 == -1 && arg1 < 0 && result <= 0)
+ if (arg1 == INT64_MIN && arg2 == -1)
ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
errmsg("bigint out of range")));
+
+ result = arg1 / arg2;
+
PG_RETURN_INT64(result);
}

@@ -838,18 +838,18 @@ int84div(PG_FUNCTION_ARGS)
PG_RETURN_NULL();
}

- result = arg1 / arg2;
-
/*
* Overflow check. The only possible overflow case is for arg1 =
* INT64_MIN, arg2 = -1, where the correct result is -INT64_MIN, which
- * can't be represented on a two's-complement machine. Most machines
- * produce INT64_MIN but it seems some produce zero.
+ * can't be represented on a two's-complement machine.
*/
- if (arg2 == -1 && arg1 < 0 && result <= 0)
+ if (arg1 == INT64_MIN && arg2 == -1)
ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
errmsg("bigint out of range")));
+
+ result = arg1 / arg2;
+
PG_RETURN_INT64(result);
}

@@ -1026,18 +1026,18 @@ int82div(PG_FUNCTION_ARGS)
PG_RETURN_NULL();
}

- result = arg1 / arg2;
-
/*
* Overflow check. The only possible overflow case is for arg1 =
* INT64_MIN, arg2 = -1, where the correct result is -INT64_MIN, which
- * can't be represented on a two's-complement machine. Most machines
- * produce INT64_MIN but it seems some produce zero.
+ * can't be represented on a two's-complement machine.
*/
- if (arg2 == -1 && arg1 < 0 && result <= 0)
+ if (arg1 == INT64_MIN && arg2 == -1)
ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
errmsg("bigint out of range")));
+
+ result = arg1 / arg2;
+
PG_RETURN_INT64(result);
}

diff --git a/src/include/c.h b/src/include/c.h
index a6c0e6e..547a188 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -294,6 +294,29 @@ typedef unsigned long long int uint64;
#define UINT64CONST(x) ((uint64) x)
#endif

+#ifndef INT16_MAX
+#define INT16_MAX 32767
+#endif
+
+#ifndef INT16_MIN
+#define INT16_MIN (-INT16_MAX-1)
+#endif
+
+#ifndef INT32_MAX
+#define INT32_MAX 2147483647
+#endif
+
+#ifndef INT32_MIN
+#define INT32_MIN (-INT32_MAX-1)
+#endif
+
+#ifndef INT64_MAX
+#define INT64_MAX INT64CONST(9223372036854775807)
+#endif
+
+#ifndef INT64_MIN
+#define INT64_MIN (-INT64_MAX-1)
+#endif

/* Select timestamp representation (float8 or int64) */
#ifdef USE_INTEGER_DATETIMES


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Xi Wang <xi(dot)wang(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Fix div/mul crash and more undefined behavior
Date: 2012-11-18 23:47:58
Message-ID: 17897.1353282478@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Xi Wang <xi(dot)wang(at)gmail(dot)com> writes:
> [ patch adding a bunch of explicit INT_MIN/MAX constants ]

I was against this style of coding before, and I still am.
For one thing, it's just about certain to introduce conflicts
against system headers.

regards, tom lane


From: Xi Wang <xi(dot)wang(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Fix div/mul crash and more undefined behavior
Date: 2012-11-19 01:15:09
Message-ID: 50A9881D.3070806@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/18/12 6:47 PM, Tom Lane wrote:
> Xi Wang <xi(dot)wang(at)gmail(dot)com> writes:
>> [ patch adding a bunch of explicit INT_MIN/MAX constants ]
>
> I was against this style of coding before, and I still am.
> For one thing, it's just about certain to introduce conflicts
> against system headers.

I totally agree.

I would be happy to rewrite the integer overflow checks without
using these explicit constants, but it seems extremely tricky to
do so. One possible check without using INTn_MIN is:

if (arg1 < 0 && -arg1 < 0 && arg2 == -1) { ... }

Compared to (arg1 == INTn_MIN && arg2 == -1), the above check is
not only more confusing and difficult to understand, but it also
invokes undefined behavior (-INT_MIN overflow), which is dangerous:
many C compilers will optimize away the check. I've tried gcc,
clang, PathScale, and AMD's Open64, all of which perform such
optimizations.

Since INTn_MIN and INTn_MAX are standard macros from the C library,
can we assume that every C compiler should provide them in stdint.h?
At least this is true for gcc, clang, and Visual C++. Then we don't
have to define them and worry about possible conflicts (though I
think using #ifndef...#endif should be able to avoid conflicts).

- xi


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Xi Wang <xi(dot)wang(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Fix div/mul crash and more undefined behavior
Date: 2012-11-19 16:04:31
Message-ID: 2498.1353341071@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Xi Wang <xi(dot)wang(at)gmail(dot)com> writes:
> On 11/18/12 6:47 PM, Tom Lane wrote:
>> I was against this style of coding before, and I still am.
>> For one thing, it's just about certain to introduce conflicts
>> against system headers.

> I totally agree.

> I would be happy to rewrite the integer overflow checks without
> using these explicit constants, but it seems extremely tricky to
> do so.

I thought about this some more and realized that we can handle it
by realizing that division by -1 is the same as negation, and so
we can copy the method used in int4um. So the code would look like

if (arg2 == -1)
{
result = -arg1;
if (arg1 != 0 && SAMESIGN(result, arg1))
ereport(ERROR, ...);
PG_RETURN_INT32(result);
}

(with rather more comments than this, of course). This looks faster
than what's there now, as well as removing the need for use of
explicit INT_MIN constants.

> Compared to (arg1 == INTn_MIN && arg2 == -1), the above check is
> not only more confusing and difficult to understand, but it also
> invokes undefined behavior (-INT_MIN overflow), which is dangerous:
> many C compilers will optimize away the check.

They'd better not, else they'll break many of our overflow checks.
This is why we use -fwrapv with gcc, for example. Any other compiler
with similar optimizations needs to be invoked with a similar switch.

> Since INTn_MIN and INTn_MAX are standard macros from the C library,
> can we assume that every C compiler should provide them in stdint.h?

Not every C compiler provides stdint.h, unfortunately --- otherwise
I'd not be so resistant to depending on this.

regards, tom lane


From: Xi Wang <xi(dot)wang(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Fix div/mul crash and more undefined behavior
Date: 2012-11-19 16:27:23
Message-ID: 50AA5DEB.2090203@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/19/12 11:04 AM, Tom Lane wrote:
> I thought about this some more and realized that we can handle it
> by realizing that division by -1 is the same as negation, and so
> we can copy the method used in int4um. So the code would look like
>
> if (arg2 == -1)
> {
> result = -arg1;
> if (arg1 != 0 && SAMESIGN(result, arg1))
> ereport(ERROR, ...);
> PG_RETURN_INT32(result);
> }
>
> (with rather more comments than this, of course). This looks faster
> than what's there now, as well as removing the need for use of
> explicit INT_MIN constants.

Hah, I used exactly the same code in my initial patch. :)

See below.

> They'd better not, else they'll break many of our overflow checks.
> This is why we use -fwrapv with gcc, for example. Any other compiler
> with similar optimizations needs to be invoked with a similar switch.

Yes, that's the scary part.

Even with -fwrapv in gcc (I tried 4.6/4.7/4.8), it still optimizes away
my overflow check!

if (arg1 < 0 && -arg1 < 0) { ... }

Fortunately, it doesn't optimize way checks using SAMESIGN() for now.
Who knows what could happen in the next version..

Clang's -fwrapv works as expected.

PathScale and Open64 (and probably icc) don't support -fwrapv at all.
Not sure if they have similar options. They do such optimizations, too.

The reality is that C compilers are not friendly to postcondition
checking; they consider signed integer overflow as undefined behavior,
so they do whatever they want to do. Even workaround options like
-fwrapv are often broken, not to mention that they may not even have
those options.

That's why I am suggesting to avoid postcondition checking for signed
integer overflow.

> Not every C compiler provides stdint.h, unfortunately --- otherwise
> I'd not be so resistant to depending on this.

Sigh.. You are right.

- xi


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Xi Wang <xi(dot)wang(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Fix div/mul crash and more undefined behavior
Date: 2012-11-19 17:01:27
Message-ID: 4671.1353344487@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Xi Wang <xi(dot)wang(at)gmail(dot)com> writes:
> The reality is that C compilers are not friendly to postcondition
> checking; they consider signed integer overflow as undefined behavior,
> so they do whatever they want to do. Even workaround options like
> -fwrapv are often broken, not to mention that they may not even have
> those options.

I think it's probably past time that we stopped guessing about this
sort of thing and added some regression test cases for it. I'm
planning to add cases like this:

-- check sane handling of INT_MIN overflow cases
SELECT (-2147483648)::int4 * (-1)::int4;
SELECT (-2147483648)::int4 / (-1)::int4;
SELECT (-2147483648)::int4 % (-1)::int4;

regards, tom lane


From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Xi Wang <xi(dot)wang(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Fix div/mul crash and more undefined behavior
Date: 2012-11-19 17:17:36
Message-ID: 20121119171736.GB13889@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-11-19 11:04:31 -0500, Tom Lane wrote:
> Xi Wang <xi(dot)wang(at)gmail(dot)com> writes:
> > Since INTn_MIN and INTn_MAX are standard macros from the C library,
> > can we assume that every C compiler should provide them in stdint.h?
>
> Not every C compiler provides stdint.h, unfortunately --- otherwise
> I'd not be so resistant to depending on this.

We already have hardcoded values for INT64_MIN in numutils.c's
pg_lltoa(). Given that I think we could just bite the apple and provide
our own values if the platform doesn't provide them.

Greetings,

Andres Freund