Re: [HACKERS] NUMERIC needs OID's

Lists: pgsql-hackers
From: jwieck(at)debis(dot)com (Jan Wieck)
To: pgsql-hackers(at)postgreSQL(dot)org (PostgreSQL HACKERS)
Subject: NUMERIC needs OID's
Date: 1998-12-30 02:24:14
Message-ID: m0zvBJ1-000EBPC@orion.SAPserv.Hamburg.dsh.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi all,

the exact NUMERIC datatype materializes more and more. What I
got so far are the four arithmetic base operators, all six
comparision operators and these functions:

ABS(n)
CEIL(n)
EXP(n)
FLOOR(n)
LN(n)
MOD(m,n)
ROUND(n[,scale])
SIGN(n)
SQRT(n)
TRUNC(n[,scale])

The trigonometric ones I left out for now, but since SQRT(),
EXP() and LN() work, it wouldn't be that hard to do them
later (the former produce the same results as bc(1) does -
shortly tested up to 400 digits after the decimal point).

The speed of the complex functions is IMHO acceptable. For
example EXP() is 25% better than bc(1) on small numbers but
needs up to 3 times on very big ones (999). Who ever needs
EXP(999) or more? The result is about 7.2e433! SQRT() is a
bit slow - so be it. Postgres shouldn't become a substitute
for arbitrary precision calculators.

So I think it's time now to move the stuff into the backend.
Therefor I first need a bunch of OID's (about 30 C functions,
20 SQL functions and 10 operators for now). Should I fill up
all the holes?

For the next step then I need some help in the parser. I
think it would be good to have anything that looks like a
double/float and any integer that doesn't fit into an int4
parsed into a NUMERIC, and if it really should be one of the
others type-coerced back later. This is the only way to get
rid of the single quotes for literal NUMERIC values without
the danger of loosing digits (due to double's limited
precision). But I'm not sure about all the things that might
break if doing so.

Rereading this mail I see that I forgot the aggregates. Just
some more OID's that I need :-).

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck(at)debis(dot)com (Jan Wieck) #


From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: jwieck(at)debis(dot)com
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] NUMERIC needs OID's
Date: 1998-12-30 04:52:31
Message-ID: 199812300452.XAA07861@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Hi all,
>
> the exact NUMERIC datatype materializes more and more. What I
> got so far are the four arithmetic base operators, all six
> comparision operators and these functions:
>
> ABS(n)
> CEIL(n)
> EXP(n)
> FLOOR(n)
> LN(n)
> MOD(m,n)
> ROUND(n[,scale])
> SIGN(n)
> SQRT(n)
> TRUNC(n[,scale])
>
> The trigonometric ones I left out for now, but since SQRT(),
> EXP() and LN() work, it wouldn't be that hard to do them
> later (the former produce the same results as bc(1) does -
> shortly tested up to 400 digits after the decimal point).
>
> The speed of the complex functions is IMHO acceptable. For
> example EXP() is 25% better than bc(1) on small numbers but
> needs up to 3 times on very big ones (999). Who ever needs
> EXP(999) or more? The result is about 7.2e433! SQRT() is a
> bit slow - so be it. Postgres shouldn't become a substitute
> for arbitrary precision calculators.
>
> So I think it's time now to move the stuff into the backend.
> Therefor I first need a bunch of OID's (about 30 C functions,
> 20 SQL functions and 10 operators for now). Should I fill up
> all the holes?

No, I don't recommend it. I recommend getting a range of oid's. Oids
are confusing enough, without trying to collect them scattered all over
a range of values.

The problem is that we don't have a range of 30 left anymore, and the
needs of future development are surely going to eat up the rest. One
idea is to remove some of the conversion functions we have defined that
are rarely used. We don't NEED them with Thomas's conversion stuff.
Thomas says native conversion is faster, but if we find some that we
almost never use, we could rip them out and use those.

However, my recommendation is that we have to start thinking about
increasing the maximum allowable system oid.

1 - 10
100 - 101
754
842
949
1288 - 1295
1597 - 1599
1608 - 1610
1619 - 1639
1644 -

The max system oid is stored in transam.h as:

/* ----------
* note: we reserve the first 16384 object ids for internal use.
* oid's less than this appear in the .bki files. the choice of
* 16384 is completely arbitrary.
* ----------
*/
#define BootstrapObjectIdData 16384

This is 2^14.

We can increase this to 32k without any noticable problem, as far as I
know. That will give us a nice range of availble oids, and allow
renumbering if people want to clean up some of current oid mess.

The only problem is that loading a pg_dump -o is going to probably cause
rows with oids in the 16k-32k range to duplicate those in the system
tables. Is that a problem? I am not sure. As far as I know, there is
no reason oid's have to be unique, especially if they are in different
tables. contrib/findoidjoins will get messed up by this, but I am not
sure if that is a serious problem.

I can't figure out another way around it. We could expand by going very
high, near 2 billion, assuming no one is up there yet, but the code will
get very messy doing that, and I don't recommend it.

I think going to 2^15 is going to become necessary someday. The
question is, do we do it for 6.5, and if so, how will the duplicate oids
affect our users?

A new system would start assigning rows with oids > 2^15, so only system
table oids installed via pg_dump -o or COPY WITH OIDS FROM would be
in the range 2^14-2^15.

pg_dump uses the max system oid to determine of a function is a user
function, but because it gets the max oid from the template1 table oid,
this should be portable across the two oid systems.

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: jwieck(at)debis(dot)com, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] NUMERIC needs OID's
Date: 1998-12-30 06:23:45
Message-ID: 3689C6F1.E5739D7A@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Neat stuff!

> > So I think it's time now to move the stuff into the backend.
> > Therefor I first need a bunch of OID's (about 30 C functions,
> > 20 SQL functions and 10 operators for now). Should I fill up
> > all the holes?
> No, I don't recommend it. I recommend getting a range of oid's. Oids
> are confusing enough, without trying to collect them scattered all
> over a range of values.
> The problem is that we don't have a range of 30 left anymore, and the
> needs of future development are surely going to eat up the rest. One
> idea is to remove some of the conversion functions we have defined
> that are rarely used. We don't NEED them with Thomas's conversion
> stuff. Thomas says native conversion is faster, but if we find some
> that we almost never use, we could rip them out and use those.

That won't get us very many OIDs, so doesn't solve the problem in the
long run.

> However, my recommendation is that we have to start thinking about
> increasing the maximum allowable system oid.
> The max system oid is stored in transam.h as:
> #define BootstrapObjectIdData 16384

Jan is asking for 20-50 OIDs. That is not enough to significantly affect
the size of the available "reserved OID" space. I'd recommend taking a
block of OIDs from somewhere just above the largest current pre-assigned
OID.

> I think going to 2^15 is going to become necessary someday. The
> question is, do we do it for 6.5, and if so, how will the duplicate
> oids affect our users?

This seems like a good subject for v7.0, when users might expect a more
troublesome transition. Anyway, by then we might have a different
solution.

- Tom


From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: Jan Wieck <jwieck(at)debis(dot)com>
Cc: PostgreSQL HACKERS <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [HACKERS] NUMERIC needs OID's
Date: 1998-12-30 14:55:44
Message-ID: 368A3EF0.8EF42938@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> For the next step then I need some help in the parser. I
> think it would be good to have anything that looks like a
> double/float and any integer that doesn't fit into an int4
> parsed into a NUMERIC, and if it really should be one of the
> others type-coerced back later. This is the only way to get
> rid of the single quotes for literal NUMERIC values without
> the danger of loosing digits (due to double's limited
> precision). But I'm not sure about all the things that might
> break if doing so.

I can help with this. I would suggest starting conservatively,
converting to NUMERIC only when one of the other existing types does not
work. Later, we can more tightly couple the numeric types to make this
more transparent.

When you have the numeric stuff ready to go, commit it to the source
tree and we can start working on the parsing issues.

- Tom


From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: lockhart(at)alumni(dot)caltech(dot)edu (Thomas G(dot) Lockhart)
Cc: jwieck(at)debis(dot)com, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] NUMERIC needs OID's
Date: 1998-12-30 16:53:08
Message-ID: 199812301653.LAA18333@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > The problem is that we don't have a range of 30 left anymore, and the
> > needs of future development are surely going to eat up the rest. One
> > idea is to remove some of the conversion functions we have defined
> > that are rarely used. We don't NEED them with Thomas's conversion
> > stuff. Thomas says native conversion is faster, but if we find some
> > that we almost never use, we could rip them out and use those.
>
> That won't get us very many OIDs, so doesn't solve the problem in the
> long run.
>
> > However, my recommendation is that we have to start thinking about
> > increasing the maximum allowable system oid.
> > The max system oid is stored in transam.h as:
> > #define BootstrapObjectIdData 16384

I am a dope. The first user oid is 16k, and unused_oids says:

$ unused_oids
1 - 10
100 - 101
754
842
949
1288 - 1295
1597 - 1599
1608 - 1610
1619 - 1639
1644 -

There is nothing magical about 1644. We have always been packing the
oids into that range, while we have never used oids over 1644. Just
start at 1650 and head toward 16384. We have roughly 16200 unused
system oids left.

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: jwieck(at)debis(dot)com (Jan Wieck)
To: maillist(at)candle(dot)pha(dot)pa(dot)us (Bruce Momjian)
Cc: lockhart(at)alumni(dot)caltech(dot)edu, jwieck(at)debis(dot)com, pgsql-hackers(at)postgreSQL(dot)org
Subject: NUMERIC stuff committed
Date: 1998-12-30 20:29:13
Message-ID: m0zvSF0-000EBPC@orion.SAPserv.Hamburg.dsh.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian wrote:

> [...]
> oids into that range, while we have never used oids over 1644. Just
> start at 1650 and head toward 16384. We have roughly 16200 unused
> system oids left.

I've just committed the NUMERIC stuff to the CURRENT tree.
Used OID's 1700-1767 up to now (some more will follow when
implementing the trigonometric functions).

Well folks - it's there now and can be stressed! Don't forget
to initdb.

I'll add a first precision check using some of the complex
math functions with scale 1000 to the regression soon.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck(at)debis(dot)com (Jan Wieck) #