Re: About numeric division again

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: About numeric division again
Date: 2008-04-04 01:04:23
Message-ID: 29619.1207271063@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> The code that's currently in there sometimes has to propagate rounding
>> to the left, meaning that you can never be certain whether all of the
>> digits you have so far are good, and that means that it can sometimes
>> generate an incorrect truncated output. This leads to the bugs cited
>> in the above message.

> The case I looked into could be traced specifically to the fact that it
> calculates an intermediate value which is the reciprocal of the divisor and
> multiplies by that.

Perhaps, but no matter how you slice it, that algorithm sometimes needs
to go back and "fix" digits it previously emitted. That's okay for
approximate answers and not okay for exact ones, because you never
know if you're done fixing the last digit you need to return.

I fooled around with the idea of using the existing fast division code
in the sqrt/exp/log/power functions, and got these timings for the
numeric_big regression test:

Xeon HPPA

CVS HEAD 5.446s 2m10.97s
Knuth division only 11.574s 3m9.29s
old code for trans. fns 5.477s 2m15.97s

The HPPA is an old machine with excruciatingly slow integer division,
so it's probably about the worst case for the Knuth-based code.

Now you can certainly object that numeric_big might not be
representative of real-world applications, but these results seem to
me to justify implementing both functions. wc says that numeric.c is

5457 15914 122662 CVS HEAD
5466 16050 123401 Knuth only
5732 17202 130764 both

A couple hundred more lines seems like an acceptable price.

Also, I remembered the reason for proposing an explicit "integer
division" function: you can't always get the right answer from
trunc(x/y). The / operator will round its output at some number of
fractional digits chosen by itself, and it's possible that it rounds
.999... up to the next integer, whereupon your trunc() result is simply
wrong. (Indeed this is just the same problem at the SQL level as mod
was facing at the C level in one of the bugs that started this
discussion.) So if you want to do any high-precision integer math,
this is really not a negotiable feature.

So IMHO the only question left is what to call it at the SQL level.
A few options:

Expose it as a function:

div(x, y)
quotient(x, y)

Expose it as an operator:

x // y

There are probably better ideas I haven't thought of. Comments?

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Sam Mason 2008-04-04 01:20:08 Re: COPY Transform support
Previous Message Sam Mason 2008-04-04 01:01:57 Re: [GENERAL] SHA1 on postgres 8.3