BUG #6217: to_char() gives incorrect output for very small float values

Lists: pgsql-bugs
From: "Chris Gernon" <kabigon(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #6217: to_char() gives incorrect output for very small float values
Date: 2011-09-20 17:17:47
Message-ID: 201109201717.p8KHHlZU087195@wwwmaster.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 6217
Logged by: Chris Gernon
Email address: kabigon(at)gmail(dot)com
PostgreSQL version: 9.1.0
Operating system: Windows XP
Description: to_char() gives incorrect output for very small float
values
Details:

The to_char() function gives incorrect output for float values whose decimal
expansion has several digits (more than somewhere around 14-15) after the
decimal point.

To reproduce:

CREATE TABLE t (
id serial,
f double precision,
CONSTRAINT t_pk PRIMARY KEY (id)
);

INSERT INTO t (f) VALUES (0.0000000000000000000000000000000563219288);

----------------------------------------

SELECT to_char(f,
'FM999990.99999999999999999999999999999999999999999999999999') FROM t WHERE
id = 1;

Expected Output:
0.0000000000000000000000000000000563219288

Actual Output:
0.

----------------------------------------

SELECT to_char(f,
'999990.99999999999999999999999999999999999999999999999999') FROM t WHERE id
= 1;

Expected Output:
0.00000000000000000000000000000005632192880000000000

Actual Output:
0.00000000000000

----------------------------------------


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Chris Gernon" <kabigon(at)gmail(dot)com>,<pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #6217: to_char() gives incorrect output for very small float values
Date: 2011-09-20 17:39:57
Message-ID: 4E78899D020000250004140B@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Chris Gernon" <kabigon(at)gmail(dot)com> wrote:

> The to_char() function gives incorrect output for float values
> whose decimal expansion has several digits (more than somewhere
> around 14-15) after the decimal point.

These are approximate data types. On what basis do you think the
values returned in your examples are wrong? The demonstrated
results don't surprise me, given that they match to about the limits
of the approximate data type involved. It also wouldn't surprise me
to see slightly different results on different architectures or
operating systems. If you want exact values, you should use a type
which supports that, like numeric.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Chris Gernon" <kabigon(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6217: to_char() gives incorrect output for very small float values
Date: 2011-09-20 17:49:28
Message-ID: 11125.1316540968@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Chris Gernon" <kabigon(at)gmail(dot)com> writes:
> CREATE TABLE t (
> id serial,
> f double precision,
> CONSTRAINT t_pk PRIMARY KEY (id)
> );

> INSERT INTO t (f) VALUES (0.0000000000000000000000000000000563219288);

> ----------------------------------------

> SELECT to_char(f,
> 'FM999990.99999999999999999999999999999999999999999999999999') FROM t WHERE
> id = 1;

> Expected Output:
> 0.0000000000000000000000000000000563219288

> Actual Output:
> 0.

My immediate reaction to that is that float8 values don't have 57 digits
of precision. If you are expecting that format string to do something
useful you should be applying it to a numeric column not a double
precision one.

It's possible that we can kluge things to make this particular case work
like you are expecting, but there are always going to be similar-looking
cases that can't work because the precision just isn't there.

(In a quick look at the code, the reason you just get "0." is that it's
rounding off after 15 digits to ensure it doesn't print garbage. Maybe
it could be a bit smarter for cases where the value is very much smaller
than 1, but it wouldn't be a simple change.)

regards, tom lane


From: Christopher Gernon <kabigon(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6217: to_char() gives incorrect output for very small float values
Date: 2011-09-20 17:49:50
Message-ID: CAKwX5KiRDRkWKwvPq5XQy43_LdF+1CR51vzGy9MmbVikmYORew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Tue, Sep 20, 2011 at 1:39 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> These are approximate data types.  On what basis do you think the
> values returned in your examples are wrong?  The demonstrated

Because PostgreSQL still has access to all the significant digits:

test1=# SELECT f FROM t WHERE id = 1;
f
-----------------
5.63219288e-032
(1 row)

Since floats are stored with a significand and an exponent, to_char()
should be able to convert 5.6e-32 to text just as easily as it can
convert 5.6e-3. For some reason, it doesn't.


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Christopher Gernon" <kabigon(at)gmail(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #6217: to_char() gives incorrect output for very small float values
Date: 2011-09-20 18:29:12
Message-ID: 4E789528020000250004141A@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Christopher Gernon <kabigon(at)gmail(dot)com> wrote:

> to_char() should be able to convert 5.6e-32 to text just as easily
> as it can convert 5.6e-3. For some reason, it doesn't.

Oh, I see your point now, and I agree with you.

We should probably at least put this on the TODO list, I think. Any
objections?

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Christopher Gernon" <kabigon(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6217: to_char() gives incorrect output for very small float values
Date: 2011-09-20 19:05:19
Message-ID: 12259.1316545519@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Christopher Gernon <kabigon(at)gmail(dot)com> wrote:
>> to_char() should be able to convert 5.6e-32 to text just as easily
>> as it can convert 5.6e-3. For some reason, it doesn't.

> Oh, I see your point now, and I agree with you.

> We should probably at least put this on the TODO list, I think. Any
> objections?

If we're gonna fix it, we should just fix it, I think. I was
considering taking a stab at it, but if someone else would like to,
that's fine too.

One other thing I notice in the same area is that the handling of NaNs
and infinities seems a bit incomplete. There's an explicit special case
for them in the EEEE-format code path, but not otherwise, and I think
that the results you get for other formats will vary depending on what
the local implementation of snprintf does. What *should* the output be,
if the input is NaN or Inf?

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Christopher Gernon" <kabigon(at)gmail(dot)com>, <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #6217: to_char() gives incorrect output for very small float values
Date: 2011-09-20 19:43:21
Message-ID: 4E78A6890200002500041430@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
>> Christopher Gernon <kabigon(at)gmail(dot)com> wrote:
>>> to_char() should be able to convert 5.6e-32 to text just as
>>> easily as it can convert 5.6e-3. For some reason, it doesn't.
>
>> Oh, I see your point now, and I agree with you.
>
>> We should probably at least put this on the TODO list, I think.
>> Any objections?
>
> If we're gonna fix it, we should just fix it, I think. I was
> considering taking a stab at it, but if someone else would like
> to, that's fine too.

I wouldn't mind doing it, but not until after the CF wraps. On the
other hand, isn't this is one of those compatibility functions?
Perhaps it would best be done by someone who has familiarity with,
and access to, a database with which we're trying to be compatible.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Christopher Gernon" <kabigon(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6217: to_char() gives incorrect output for very small float values
Date: 2011-09-20 19:52:20
Message-ID: 13033.1316548340@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> If we're gonna fix it, we should just fix it, I think. I was
>> considering taking a stab at it, but if someone else would like
>> to, that's fine too.

> I wouldn't mind doing it, but not until after the CF wraps. On the
> other hand, isn't this is one of those compatibility functions?
> Perhaps it would best be done by someone who has familiarity with,
> and access to, a database with which we're trying to be compatible.

Chris already stated that the case gives the answer he expects in
several other DBs, so I don't seem much need for further compatibility
checking on the "don't round off prematurely" angle. However, it would
be interesting to know what Oracle etc do with NaN and Infinity,
assuming they even support such numbers.

Currently what our code does for the format-with-EEEE case is to output
"#" in all digit positions. It would be plausible for that to happen
in the non-EEEE cases too, but whether that's actually what happens in
other systems is something I don't know.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Christopher Gernon" <kabigon(at)gmail(dot)com>, <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #6217: to_char() gives incorrect output for very small float values
Date: 2011-09-20 20:08:08
Message-ID: 4E78AC580200002500041436@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> it would be interesting to know what Oracle etc do with NaN and
> Infinity, assuming they even support such numbers.
>
> Currently what our code does for the format-with-EEEE case is to
> output "#" in all digit positions. It would be plausible for that
> to happen in the non-EEEE cases too, but whether that's actually
> what happens in other systems is something I don't know.

>From a quick web search, it looks like '#' filling is the typical
approach for infinity and NaN.

-Kevin


From: Greg Stark <stark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Christopher Gernon <kabigon(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6217: to_char() gives incorrect output for very small float values
Date: 2011-09-20 21:25:54
Message-ID: CAM-w4HPEucRFT7PYWKDDmeW9RmOamKmar8GzNrD=-mKQUW0faA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Tue, Sep 20, 2011 at 8:52 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> However, it would
> be interesting to know what Oracle etc do with NaN and Infinity,
> assuming they even support such numbers.

Note that it looks like NUMBER cannot store either Infinity or NaN.
They can only occur in BINARY_FLOAT and BINARY_DOUBLE. From the docs:

> If a BINARY_FLOAT or BINARY_DOUBLE value is converted to CHAR or NCHAR, and the input is either infinity or NaN (not a number), then Oracle always returns the pound signs to replace the value.

And testing shows:

SQL> select to_char(cast('NAN' as binary_float), 'FM9999.9999') from dual;

TO_CHAR(CA
----------
##########

SQL> select to_char(cast('-Inf' as binary_float), 'FM9999.9999') from dual;

TO_CHAR(CA
----------
##########

SQL> select to_char(cast('+Inf' as binary_float), 'FM9999.9999') from dual;

TO_CHAR(CA
----------
##########

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Christopher Gernon <kabigon(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6217: to_char() gives incorrect output for very small float values
Date: 2011-09-20 21:57:10
Message-ID: 28405.1316555830@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Greg Stark <stark(at)mit(dot)edu> writes:
> On Tue, Sep 20, 2011 at 8:52 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> However, it would
>> be interesting to know what Oracle etc do with NaN and Infinity,
>> assuming they even support such numbers.

> Note that it looks like NUMBER cannot store either Infinity or NaN.
> They can only occur in BINARY_FLOAT and BINARY_DOUBLE. From the docs:

>> If a BINARY_FLOAT or BINARY_DOUBLE value is converted to CHAR or NCHAR, and the input is either infinity or NaN (not a number), then Oracle always returns the pound signs to replace the value.

> And testing shows:

> SQL> select to_char(cast('NAN' as binary_float), 'FM9999.9999') from dual;

> TO_CHAR(CA
> ----------
> ##########

Hmm, interesting. They replace the whole field with '#', not just the
digit positions? Because that's not what is happening in our code at
the moment, for the one case where we consider this at all:

regression=# select to_char('nan'::float8, '9999.9999EEEE');
to_char
----------------
####.########
(1 row)

The EEEE path seems rather broken in some other ways as well:

regression=# select to_char('43.5'::float8, '9999.9999EEEE');
to_char
-------------
4.3500e+01
(1 row)

Since I did not say FM, why is it suppressing leading spaces here?

I'm starting to think that that code needs a wholesale rewrite
(not but what that's true of just about every part of formatting.c).

regards, tom lane