Re: unsafe floats

Lists: pgsql-hackers
From: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: unsafe floats
Date: 2004-03-10 21:51:06
Message-ID: Pine.LNX.4.44.0403102232420.13979-100000@zigo.dhs.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

When UNSAFE_FLOATS is defined there is a check that float results are
within the min and max limits, which excludes values like 'Infinity',
'-Infinity' and 'Nan'.

Is the above something from the SQL standard or just a bug?

The input rules for float8 accepts 'Infinity' as a value, and then it just
triggers a overflow error directly after (unless UNSAFE_FLOATS is
defined).

At first I thought it was a bug, but this function that checks for
overflow wouldn't even be needed if not to stop such values. Without the
check every possible value would be accepted (on normal IEEE math). I can
see a use in not accepting Infinity and Nan, but I would rather put that
as constraints if I needed that in my database.

Any thoughts? Should I go ahead and make it accept 'Infinity' and the
rest as numbers?

--
/Dennis Björklund


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org, Neil Conway <neilc(at)samurai(dot)com>
Subject: Re: unsafe floats
Date: 2004-03-10 22:13:28
Message-ID: 13346.1078956808@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org> writes:
> When UNSAFE_FLOATS is defined there is a check that float results are
> within the min and max limits, which excludes values like 'Infinity',
> '-Infinity' and 'Nan'.

> Is the above something from the SQL standard or just a bug?

I think it was probably reasonable when the code was written (I'd guess
this goes back to very early 90s). Nowadays, IEEE float math is nearly
universal, and we would be offering better functionality if we allowed
access to Infinity and Nan by default. So I'd vote for ripping out the
range check, or at least reversing the default state of UNSAFE_FLOATS.

We might end up with two sets of regression expected files, one for
machines that do not support NaN, but that seems acceptable to me.

A variant idea is to try to get configure to detect Infinity/NaN support
and set UNSAFE_FLOATS only if it's there. But I don't know if we can do
that without a run-time check, which Peter probably will complain about...

regards, tom lane


From: Neil Conway <neilc(at)samurai(dot)com>
To: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unsafe floats
Date: 2004-03-10 22:21:14
Message-ID: 87llm8s5r9.fsf@mailbox.samurai.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org> writes:
> When UNSAFE_FLOATS is defined there is a check that float results
> are within the min and max limits, which excludes values like
> 'Infinity', '-Infinity' and 'Nan'.

No, 'NaN' is legal float4/float8/numeric input whether UNSAFE_FLOATS
is defined or not.

> At first I thought it was a bug, but this function that checks for
> overflow wouldn't even be needed if not to stop such values.

Well, CheckFloat4Val() is needed to ensure that the input fits in a
'float' (rather than just a 'double').

> Should I go ahead and make it accept 'Infinity' and the rest as
> numbers?

What number would you like 'Infinity'::float4 and 'Infinity'::float8
to produce? Is this actually useful functionality?

As for it being in the SQL standard, using Acrobat's "text search"
feature finds zero instances of "infinity" (case insensitive) in the
200x draft spec. I haven't checked any more thoroughly than that,
though.

My inclination is to get rid of UNSAFE_FLOATS entirely, and disallow
'Infinity' and '-Infinity' input to float8 (since it never worked to
begin with, and float4 doesn't accept those values either). But I'm
open to other comments.

-Neil


From: Neil Conway <neilc(at)samurai(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unsafe floats
Date: 2004-03-10 22:36:35
Message-ID: 87ad2os51o.fsf@mailbox.samurai.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Nowadays, IEEE float math is nearly universal, and we would be
> offering better functionality if we allowed access to Infinity and
> Nan by default.

This is faulty reasoning: we *do* allow NaN by default (although
you're correct that we reject Infinity in float8 input, but it seems
not by design).

> So I'd vote for ripping out the range check, or at least reversing
> the default state of UNSAFE_FLOATS.

This would surely be wrong. Defining UNSAFE_FLOATS will make
float4in() not check that its input fits into a 'float', rather than a
'double'.

> We might end up with two sets of regression expected files, one for
> machines that do not support NaN, but that seems acceptable to me.

Are there any modern machines that actually do not support NAN? There
are various places in the code that do

#ifndef NAN
#define NAN (0.0/0.0)
#endif

... and this hasn't caused any problems that I'm aware of.

-Neil


From: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unsafe floats
Date: 2004-03-10 22:37:38
Message-ID: Pine.LNX.4.44.0403102323430.13979-100000@zigo.dhs.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 10 Mar 2004, Neil Conway wrote:

> No, 'NaN' is legal float4/float8/numeric input whether UNSAFE_FLOATS
> is defined or not.

Yes, the tests are:

if (fabs(val) > FLOAT8_MAX)
if (val != 0.0 && fabs(val) < FLOAT8_MIN)

and only infinity and not NaN will trigger the overflow. I read it wrong
first.

> Well, CheckFloat4Val() is needed to ensure that the input fits in a
> 'float' (rather than just a 'double').

Sure, for Float4 (maybe working with float in C instead of double and this
check would make a difference, but I havn't really thought about that).

> What number would you like 'Infinity'::float4 and 'Infinity'::float8
> to produce? Is this actually useful functionality?

I would like them to produce the IEEE 754 number 'infinity' (usually
writte 'Inf' in other languages).

> As for it being in the SQL standard, using Acrobat's "text search"
> feature finds zero instances of "infinity" (case insensitive) in the
> 200x draft spec. I haven't checked any more thoroughly than that,
> though.

If they say that it should use IEEE 754 math, then they do say that
Infinity is a number, just like it is in C and lots of other languages
with IEEE 754 math. Being as old as it is, I would guess that the standard
doesn't really say much at all about floats.

Why should pg not do the same as most (all?) other language that use IEEE
754 math?

--
/Dennis Björklund


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unsafe floats
Date: 2004-03-10 22:47:43
Message-ID: 13606.1078958863@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Neil Conway <neilc(at)samurai(dot)com> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> So I'd vote for ripping out the range check, or at least reversing
>> the default state of UNSAFE_FLOATS.

> This would surely be wrong. Defining UNSAFE_FLOATS will make
> float4in() not check that its input fits into a 'float', rather than a
> 'double'.

Possibly the appropriate test involves using isfinite() (apparently
spelled finite() some places, but the C99 spec says isfinite). If
that returns false, take the value as good without checking range.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unsafe floats
Date: 2004-03-10 22:56:18
Message-ID: 13696.1078959378@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Neil Conway <neilc(at)samurai(dot)com> writes:
> What number would you like 'Infinity'::float4 and 'Infinity'::float8
> to produce? Is this actually useful functionality?

On an IEEE-spec machine, it's highly useful functionality. +Infinity
and -Infinity are special values.

BTW the float4out and float8out routines already cope with NANs and
infinities, so ISTM the input routines should be able to reverse that.
(At the moment you might only be able to get an infinity inside the
system as the result of certain math functions.)

regards, tom lane


From: Neil Conway <neilc(at)samurai(dot)com>
To: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unsafe floats
Date: 2004-03-11 06:03:31
Message-ID: 87ptbjq5sc.fsf@mailbox.samurai.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org> writes:
> I would like them to produce the IEEE 754 number 'infinity' (usually
> writte 'Inf' in other languages).

Fair enough. Attached is a patch that implements this. I chose to
remove UNSAFE_FLOATS: if anyone thinks that is worth keeping, speak up
now.

Example behavior:

nconway=# select 'Infinity'::float4, '-Infinity'::float8;
float4 | float8
----------+-----------
Infinity | -Infinity
(1 row)

However, this patch causes the regression tests to fail. The reason
for that is the following change in behavior:

nconway=# select '-1.2345678901234e+200'::float8 * '1e200';
?column?
-----------
-Infinity
(1 row)

Without the patch, we bail out due to overflow:

nconway=# select '-1.2345678901234e+200'::float8 * '1e200';
ERROR: type "double precision" value out of range: overflow

The problem is that CheckFloat8Val() is used in two places: in
float8in() to check that the input fits inside a float8 (which is a
little silly, since strtod() by definition returns something that fits
inside a double), and after various float8 operations (including
multiplication).

As part of the patch, I modified CheckFloat8Val() to not reject
infinite FP values. What happens in the example above is that the C
multiplication operation performed by float8mul() returns '-Infinity'
-- prior to the patch, CheckFloat8Val() rejected that as an overflow,
but with the patch it no longer does.

So, what is the correct behavior: if you multiply two values and get a
result that exceeds the range of a float8, should you get
'Infinity'/'-Infinity', or an overflow error?

(Either policy is implementable: in the former case, we'd check for an
infinite input in float8in() but outside of CheckFloat8Val(), and in
the latter case we'd just remove the overflow checking for floating
point ops.)

Comments?

-Neil

Attachment Content-Type Size
float_infinite_input-2.patch text/x-patch 5.7 KB

From: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unsafe floats
Date: 2004-03-11 06:20:00
Message-ID: Pine.LNX.4.44.0403110717390.13979-100000@zigo.dhs.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 11 Mar 2004, Neil Conway wrote:

> So, what is the correct behavior: if you multiply two values and get a
> result that exceeds the range of a float8, should you get
> 'Infinity'/'-Infinity', or an overflow error?

That's the issue and I think we should allow infinity as a result of a
float operation (like you got in the example). It's part of IEEE 754
math, so not getting Infinity as a result would break that. If someone
does not want infinity stored in a column he/she could add a constraint to
disallow it.

--
/Dennis Björklund


From: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unsafe floats
Date: 2004-03-11 06:47:50
Message-ID: Pine.LNX.4.44.0403110724240.13979-100000@zigo.dhs.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 11 Mar 2004, Neil Conway wrote:

> Fair enough. Attached is a patch that implements this. I chose to
> remove UNSAFE_FLOATS: if anyone thinks that is worth keeping, speak up
> now.

I have one question about the use of HUGE_VAL in postgresql. I got the
impression that the whole thing was designed to work on computers and
compilers where there is no infinity value, and then HUGE_VAL is defined
as the biggest number and treated as a special value.

If that is the case then using isinf() would not work (unless we have our
own). Well, maybe it's not an issue at all. Everything is IEEE 754 anyway
today.

A more important question is if we should give errors or produce Infinity
and NaN on mathematical operations. That is, should operations like
sqrt(-1.0) produce NaN or give an error.

--
/Dennis Björklund


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>, Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unsafe floats
Date: 2004-03-11 09:42:55
Message-ID: 200403111042.55294.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dennis Bjorklund wrote:
> A more important question is if we should give errors or produce
> Infinity and NaN on mathematical operations. That is, should
> operations like sqrt(-1.0) produce NaN or give an error.

This needs to be definable by the user. IEEE 754 compliant systems
should provide a way to set the behavior in case of exception
conditions such as these. Maybe the C library provides one, or we need
to catch this ourselves.


From: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Neil Conway <neilc(at)samurai(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: unsafe floats
Date: 2004-03-11 10:46:19
Message-ID: Pine.LNX.4.44.0403111144210.13979-100000@zigo.dhs.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 11 Mar 2004, Peter Eisentraut wrote:

> This needs to be definable by the user. IEEE 754 compliant systems
> should provide a way to set the behavior in case of exception
> conditions such as these. Maybe the C library provides one, or we need
> to catch this ourselves.

In C one can set a signal handler to catch floating point exceptions
(SIGFPE). Without a handler you can get NaN and Infinity as the result of
mathematical operations.

--
/Dennis Björklund


From: Neil Conway <neilc(at)samurai(dot)com>
To: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: unsafe floats
Date: 2004-03-11 21:45:02
Message-ID: 871xnzm529.fsf@mailbox.samurai.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org> writes:
> In C one can set a signal handler to catch floating point exceptions
> (SIGFPE). Without a handler you can get NaN and Infinity as the
> result of mathematical operations.

Okay, I think this would be a reasonable set of behavior:

- define a new GUC var that controls how exceptional floating
point values (NaN, inf, -inf) are handled.

- by default, we still raise an error when a floating point
operation results in NaN / inf / etc.; if the GUC var is toggled
from its default value, no error is raised. This preserves
backward compatibility with applications that expect floating
point overflow to be reported, for example.

- that also means that, by default, we should reject 'NaN',
'Infinity', and '-Infinity' as input to float4/float8: if these
values are illegal as the result of FP operations by default, it
seems only logical to disallow them as input to the FP types.

Does that sound ok?

Unfortunately, I have more important things that I need to get wrapped
up in time for 7.5, so I can't finish this now. Dennis, would you like
to implement this?

Finally, I've attached a revised patch for 'Infinity' input to float4
and float8: this avoids breaking the regression tests by only allowing
'Infinity' and '-Infinity' as input, not as a legal result of FP
operations. This is obviously incomplete, as discussed above, but it
might be a good starting point. Should I commit this?

-Neil

Attachment Content-Type Size
float_infinite_input-3.patch text/x-patch 10.2 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
Cc: Neil Conway <neilc(at)samurai(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unsafe floats
Date: 2004-03-11 22:38:54
Message-ID: 2394.1079044734@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org> writes:
> On Thu, 11 Mar 2004, Neil Conway wrote:
>> So, what is the correct behavior: if you multiply two values and get a
>> result that exceeds the range of a float8, should you get
>> 'Infinity'/'-Infinity', or an overflow error?

> That's the issue and I think we should allow infinity as a result of a
> float operation (like you got in the example). It's part of IEEE 754
> math, so not getting Infinity as a result would break that.

This would still be infinitely (ahem) better than the behavior you get
when an integer operation overflows. We return whatever the hardware
gives in such cases. Returning whatever the hardware gives for a float
overflow doesn't seem out of line, particularly if it's a well-defined
behavior.

I am somewhat concerned about the prospect of implementation-dependent
results --- machines that do not implement IEEE-style math are going to
return something other than an Infinity. However, I think that
providing access to the IEEE semantics is more useful than a (rather
vain) attempt to impose uniformity across IEEE and non-IEEE machines.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unsafe floats
Date: 2004-03-11 22:57:45
Message-ID: 2548.1079045865@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Neil Conway <neilc(at)samurai(dot)com> writes:
> Okay, I think this would be a reasonable set of behavior:

> - define a new GUC var that controls how exceptional floating
> point values (NaN, inf, -inf) are handled.

> - by default, we still raise an error when a floating point
> operation results in NaN / inf / etc.; if the GUC var is toggled
> from its default value, no error is raised. This preserves
> backward compatibility with applications that expect floating
> point overflow to be reported, for example.

That sounds okay. Also we might want to distinguish NaN from Infinity
--- I would expect most people to want zero-divide to continue to get
reported, for instance, even if they want to get Infinity for overflow.

> - that also means that, by default, we should reject 'NaN',
> 'Infinity', and '-Infinity' as input to float4/float8: if these
> values are illegal as the result of FP operations by default, it
> seems only logical to disallow them as input to the FP types.

This I disagree with. It would mean, for example, that you could not
dump and reload columns containing such data unless you remembered to
switch the variable first. If you did this then I'd insist on pg_dump
scripts setting the variable to the permissive state. In any case,
I don't see why a restriction that certain operations can't produce a
certain value should render the value illegal overall.

regards, tom lane


From: Neil Conway <neilc(at)samurai(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unsafe floats
Date: 2004-03-12 00:47:37
Message-ID: 87llm6lwly.fsf@mailbox.samurai.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> That sounds okay. Also we might want to distinguish NaN from
> Infinity --- I would expect most people to want zero-divide to
> continue to get reported, for instance, even if they want to get
> Infinity for overflow.

Yeah, good point.

> This I disagree with. It would mean, for example, that you could not
> dump and reload columns containing such data unless you remembered to
> switch the variable first.

Hmmm... on reflection, you're probably correct.

Since that removes the potential objection to the previous patch, I've
applied it to CVS.

-Neil