Re: [RFC] Unsigned integer support.

Lists: pgsql-hackers
From: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: [RFC] Unsigned integer support.
Date: 2008-07-25 04:20:07
Message-ID: e739902b0807242120v2b4e5b14w80f3d8b9ecd44dea@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello hackers,

I know the development community is in the middle of the July 2008
commit-fest, so I apologize if this design proposals are in
appropriate at this time.

I am looking to take advantage of PostgreSQL extensible type system
and implement unsigned integer support. The data I am dealing with
contains
many unsigned data types and I am planning on using this type to
reduce storage consumption. I am initially looking to add support for
16-bit and
32-bit unsigned integers with the potential to add 8-bit and 64-bit
unsigned integers if needed or desired by the community.

Searching through the list archives, I found two objections raised in the past:

1. Implicit casts between different data types.

I am hoping the removal of many of the implicit casts in
PostgreSQL 8.3 will simplify this task to where this objection can be
removed.

My plan (without much experimentation) is to have maybe a handful
of casts (something like):
* uint4 -> integer
* integer -> uint4
* bigint -> uint4
* integer -> uint2
* uint2 -> smallint

and then provide operators to provide a reasonable set of
functionality. My initial thought for this functionality is to
provide default operators on any
type that is not implicitly casted on the psql command-line.

As an example, I am planning for the following SQL statements to
work correctly:

3000000000::uint4 + 10 and
3000000000::uint4 > 10

My understanding is the SQL standard does not provide support for
unsigned integers, so I am planning on making all casts from unsigned
integers
to other data types explicit. Is this acceptable to the community?

Another question for the community is should we allow the following cast?
-1::uint4

Even though this is acceptable c-code, I am leaning towards
throwing an out-of-range error when this occurs.

Are there some areas I am missing or should investigate further
before working on this project?

2. There is not much demand for unsigned integer types.

Not much I can do about that :) I am willing to post my work as
a PgFoundry project.

PgFoundry already has an uint project:
http://pgfoundry.org/projects/uint/

Unfortunately this project seems to have not gone anywhere. Last
activity was late 2006 and there are not any files checked into the
SCM repository.
Is it acceptable to hijack this PgFoundry project? Or should I
start a new project (assuming there is any interest in publishing this
work).

Although I am not targeting inclusion for this type in the core
PostgreSQL code, I would like to post code for review and receive
feedback from the
community on this work. As I understand this RFC is the first step in
the process :) Once I have some code ready for review, is it
acceptable to use the
commit-fest wiki for this project?

Thanks much for your time!

- Ryan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 05:10:51
Message-ID: 19734.1216962651@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Ryan Bradetich" <rbradetich(at)gmail(dot)com> writes:
> I am looking to take advantage of PostgreSQL extensible type system
> and implement unsigned integer support.

This has been proposed before, and foundered before on the question
of implicit coercions. If you're willing to make all coercions *to*
unsigned types be explicit (or at most assignment), then I think
it can be made to work without breaking anything. But usually the
folk who ask for this feature are hoping that bare integer literals
like "42" will get interpreted as unsigned when they want them to be.
The problem with that wish is illustrated by

select 1500000000 + 1500000000;

These literals might be either int4 or uint4, therefore this command
might yield either an integer-overflow error or 3000000000::uint4.
That's not a distinction you can fuzz over --- it's got to be one
or the other, and backwards compatibility says it'd better be the
first.

> I am hoping the removal of many of the implicit casts in
> PostgreSQL 8.3 will simplify this task to where this objection can be
> removed.

The implicit casts we removed were cross-type-category cases.
If you hope for unsigned types to be considered part of the numeric
category, there's no guidance for you there. In fact, the real nub
of the problem is what type shall be initially assigned to an
integer-looking literal, and how will you get things to behave sanely
if that initial choice wasn't what was desired. We still have some
issues around the fact that "42" isn't considered a smallint. Throwing
in another possible meaning isn't going to help.

> My understanding is the SQL standard does not provide support for
> unsigned integers, so I am planning on making all casts from unsigned
> integers to other data types explicit.

It's really the other direction that would be contentious ...

regards, tom lane


From: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 05:46:56
Message-ID: e739902b0807242246r5b8a18eahd73d16a9564767ba@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello Tom,

On Thu, Jul 24, 2008 at 10:10 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Ryan Bradetich" <rbradetich(at)gmail(dot)com> writes:
>> I am looking to take advantage of PostgreSQL extensible type system
>> and implement unsigned integer support.
>
> This has been proposed before, and foundered before on the question
> of implicit coercions. If you're willing to make all coercions *to*
> unsigned types be explicit (or at most assignment), then I think
> it can be made to work without breaking anything. But usually the
> folk who ask for this feature are hoping that bare integer literals
> like "42" will get interpreted as unsigned when they want them to be.
> The problem with that wish is illustrated by
>
> select 1500000000 + 1500000000;
>
> These literals might be either int4 or uint4, therefore this command
> might yield either an integer-overflow error or 3000000000::uint4.
> That's not a distinction you can fuzz over --- it's got to be one
> or the other, and backwards compatibility says it'd better be the
> first.

I am in agreement with you on this. Since SQL does not specify
unsigned types, I was assuming only explicit and assignment casts.
I should have probably mentioned that in the RFC. Thanks for
pointing this out.

My main goal for this type is the reduced storage space. I am fine
with people needing to cast to the unsigned types to benefit from the
reduced storage space.

My plans for the example above would be:

1. SELECT 1500000000 + 1500000000 --> Throws overflow error.
2. SELECT 1500000000::uint4 + 1500000000 --> Returns 3000000000::uint4.

>
>> I am hoping the removal of many of the implicit casts in
>> PostgreSQL 8.3 will simplify this task to where this objection can be
>> removed.
>
> The implicit casts we removed were cross-type-category cases.
> If you hope for unsigned types to be considered part of the numeric
> category, there's no guidance for you there. In fact, the real nub
> of the problem is what type shall be initially assigned to an
> integer-looking literal, and how will you get things to behave sanely
> if that initial choice wasn't what was desired. We still have some
> issues around the fact that "42" isn't considered a smallint. Throwing
> in another possible meaning isn't going to help.
>
>> My understanding is the SQL standard does not provide support for
>> unsigned integers, so I am planning on making all casts from unsigned
>> integers to other data types explicit.
>
> It's really the other direction that would be contentious ...
>
> regards, tom lane

Thanks for your comments! I have already started to play around a bit with
the types and will hopefully have some code ready for review / feedback soon.

- Ryan


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 10:57:34
Message-ID: 87sktykz8h.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Ryan Bradetich" <rbradetich(at)gmail(dot)com> writes:

> My plans for the example above would be:
>
> 1. SELECT 1500000000 + 1500000000 --> Throws overflow error.
> 2. SELECT 1500000000::uint4 + 1500000000 --> Returns 3000000000::uint4.

I think that wouldn't actually work. Postgres's parser immediately assigns a
type to the bare unquoted integral constant so it would end up with a int4
type. Then when it has to pick an operator for uint4+int4 it wouldn't be able
to cast the int4 to uint4 because there would be no implicit cast.

You could make it work by having a uint4+int4 operator which returns uint4 but
then you're going to need a *lot* of operators....

One other idea that's been mentioned before is treating integral constants
like 150000 as type "unknown" like the quoted '150000' constant is. That way
the parser would see uint4+unknown and could pick the uint4 operator. But that
would be a pretty massive semantics change.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 12:14:45
Message-ID: 200807251514.46474.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Friday, 25. July 2008 schrieb Ryan Bradetich:
> PgFoundry already has an uint project:
>         http://pgfoundry.org/projects/uint/
>
>     Unfortunately this project seems to have not gone anywhere.  Last
> activity was late 2006 and there are not any files checked into the
> SCM repository.
>     Is it acceptable to hijack this PgFoundry project?  Or should I
> start a new project (assuming there is any interest in publishing this
> work).

Please hijack the project and develop your code there. Of course you can
always ask for advice here.


From: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>
To: "Gregory Stark" <stark(at)enterprisedb(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 15:23:57
Message-ID: e739902b0807250823o5562d137tbd5c52301cec806d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello Greg,

On Fri, Jul 25, 2008 at 3:57 AM, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
> "Ryan Bradetich" <rbradetich(at)gmail(dot)com> writes:
>
>> My plans for the example above would be:
>>
>> 1. SELECT 1500000000 + 1500000000 --> Throws overflow error.
>> 2. SELECT 1500000000::uint4 + 1500000000 --> Returns 3000000000::uint4.
>
> I think that wouldn't actually work. Postgres's parser immediately assigns a
> type to the bare unquoted integral constant so it would end up with a int4
> type. Then when it has to pick an operator for uint4+int4 it wouldn't be able
> to cast the int4 to uint4 because there would be no implicit cast.
>
> You could make it work by having a uint4+int4 operator which returns uint4 but
> then you're going to need a *lot* of operators....

This was my plan. I performed some testing last night to verify that
bare literals
are considered plain integers and would not be implicitly casted to a
different type
(i.e. smallint or bigint). I am seeing three operators for most operations:

1. uint4 -> uint4 = uint4
2. int4 -> uint4 = uint4
3. uint4 -> int4 = uint4

Is there something I need to watch out for when adding this number of
operators (i.e.
performance impact, etc)? Some tests I should be running to measure the impact
of adding these operators?

> One other idea that's been mentioned before is treating integral constants
> like 150000 as type "unknown" like the quoted '150000' constant is. That way
> the parser would see uint4+unknown and could pick the uint4 operator. But that
> would be a pretty massive semantics change.

This would require changes to the core PostgreSQL code correct? My
goal for this
type was to have it as an external project on PgFoundry since there
does not appear
to be much demand for it and unsigned types are not specified in the
SQL standard.
If the community decides this support would be better in core
PostgreSQL code, then
I am willing to help with that work, but I will need a significant
amount of guidance :)

With my limited knowledge, the best (and easiest) path seems to take
advantage of
the extensible type system in PostgreSQL and support unsigned integers as a
PgFoundry project.

Thanks for your review and comments!

- Ryan

> --
> Gregory Stark
> EnterpriseDB http://www.enterprisedb.com
> Ask me about EnterpriseDB's 24x7 Postgres support!


From: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 15:25:48
Message-ID: e739902b0807250825n3c8cf05fu22f42539dd9dba63@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello Peter,

On Fri, Jul 25, 2008 at 5:14 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> Am Friday, 25. July 2008 schrieb Ryan Bradetich:
>> PgFoundry already has an uint project:
>> http://pgfoundry.org/projects/uint/
>>
>> Unfortunately this project seems to have not gone anywhere. Last
>> activity was late 2006 and there are not any files checked into the
>> SCM repository.
>> Is it acceptable to hijack this PgFoundry project? Or should I
>> start a new project (assuming there is any interest in publishing this
>> work).
>
> Please hijack the project and develop your code there. Of course you can
> always ask for advice here.

I will work on getting the PgFoundry project setup.

Thanks!

- Ryan


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Ryan Bradetich <rbradetich(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 15:42:53
Message-ID: 20080725154253.GM9891@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark escribió:

> One other idea that's been mentioned before is treating integral constants
> like 150000 as type "unknown" like the quoted '150000' constant is. That way
> the parser would see uint4+unknown and could pick the uint4 operator. But that
> would be a pretty massive semantics change.

Hmm, if we do that, how would the system resolve something like this?

select 1000 + 1000

There would be no clue as to what + operator to pick, since both
operands are unknown. This is in fact what happens today with

alvherre=# select '100' + '100';
ERROR: operator is not unique: unknown + unknown at character 14
HINT: Could not choose a best candidate operator. You might need to add explicit type casts.
STATEMENT: select '100' + '100';

I think this is a nonstarter.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>
Cc: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 16:28:05
Message-ID: 87y73qgc8a.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Alvaro Herrera" <alvherre(at)commandprompt(dot)com> writes:

> Hmm, if we do that, how would the system resolve something like this?
>
> select 1000 + 1000

Well we have the same problem with 'foo' || 'bar'. The question I think is
whether the solution there scales to having two different fallback types.

> There would be no clue as to what + operator to pick, since both
> operands are unknown. This is in fact what happens today with
>
> alvherre=# select '100' + '100';
> ERROR: operator is not unique: unknown + unknown at character 14
> HINT: Could not choose a best candidate operator. You might need to add explicit type casts.
> STATEMENT: select '100' + '100';

Perhaps we could kill two birds with one stone...

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Ryan Bradetich <rbradetich(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 16:44:08
Message-ID: 20080725164408.GN9891@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark escribió:
> "Alvaro Herrera" <alvherre(at)commandprompt(dot)com> writes:
>
> > Hmm, if we do that, how would the system resolve something like this?
> >
> > select 1000 + 1000
>
> Well we have the same problem with 'foo' || 'bar'. The question I think is
> whether the solution there scales to having two different fallback types.

Hmm, right. But you need more than two: consider

alvherre=# select 0.42 + 1;
?column?
----------
1.42
(1 ligne)

However, it would be neat if this behaved the same as

alvherre=# select '0.42' + 1;
ERROR: invalid input syntax for integer: "0.42"
STATEMENT: select '0.42' + 1;

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Gregory Stark" <stark(at)enterprisedb(dot)com>
Cc: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 17:20:37
Message-ID: 4889C515.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>> Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:

> consider
>
> alvherre=# select 0.42 + 1;
> ?column?
> ----------
> 1.42
> (1 ligne)
>
> However, it would be neat if this behaved the same as
>
> alvherre=# select '0.42' + 1;
> ERROR: invalid input syntax for integer: "0.42"
> STATEMENT: select '0.42' + 1;

I wouldn't want the former to fail.

I also wouldn't like these to fail:

select 5000000000 + 1;
select 'abc'::text || 'def'::varchar(3);

-Kevin


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, Ryan Bradetich <rbradetich(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 17:24:25
Message-ID: 20080725172425.GO9891@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Kevin Grittner escribió:
> >>> Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:
>
> > consider
> >
> > alvherre=# select 0.42 + 1;
> > ?column?
> > ----------
> > 1.42
> > (1 ligne)
> >
> > However, it would be neat if this behaved the same as
> >
> > alvherre=# select '0.42' + 1;
> > ERROR: invalid input syntax for integer: "0.42"
> > STATEMENT: select '0.42' + 1;
>
> I wouldn't want the former to fail.

Sorry, I was unclear. What I meant was that both 0.42 + 1 and
'0.42' + 1 should be treated the same, and they should both produce a
numeric output.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 19:32:06
Message-ID: 29540.1217014326@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Ryan Bradetich" <rbradetich(at)gmail(dot)com> writes:
> On Fri, Jul 25, 2008 at 3:57 AM, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
>> "Ryan Bradetich" <rbradetich(at)gmail(dot)com> writes:
>>> My plans for the example above would be:
>>>
>>> 1. SELECT 1500000000 + 1500000000 --> Throws overflow error.
>>> 2. SELECT 1500000000::uint4 + 1500000000 --> Returns 3000000000::uint4.
>>
>> You could make it work by having a uint4+int4 operator which returns uint4 but
>> then you're going to need a *lot* of operators....

> This was my plan.

Like he says, it's a *lot* of operators, and the point doesn't seem
entirely clear to me. You'll still have overflow cases, they'll just be
in different places.

Consider the idea of not having any uint4-specific arithmetic operators,
but instead providing the following:

* assignment casts from int4 and int8 to uint4
(these throw error if out of range, of course)
* implicit cast from uint4 to int8 (can never fail)

The effect of providing the latter cast would be that any arithmetic
involving a uint4 column would automatically be done in int8. Which
would make it a shade slower than a native implementation, but probably
not enough slower to be a problem --- and you'd avoid having to write
dozens of operators and underlying support functions. Storing into the
uint4 column would work fine with no extra notation because of the
assignment casts.

Moreover, you'd avoid cluttering the system with a pile of cross-type
operators, which we have recently realized are not a good thing, because
they increase the likelihood of "ambiguous operator" problems --- see
http://archives.postgresql.org/pgsql-hackers/2008-06/msg00750.php

For uint8 you'd have to promote to numeric to guarantee no failure
in the implicit cast; which is going to be a rather bigger performance
hit, but I don't really see uint8 as being a type with huge demand.

Now you probably *will* want cross-type comparison operators, if you
are going to support indexing of unsigned columns, so that something
like
uint4col > 42
can be indexed without any casting. But limiting yourself to the six
basic comparison operators certainly makes it a much less bulky project.

regards, tom lane


From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Ryan Bradetich" <rbradetich(at)gmail(dot)com>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 20:06:49
Message-ID: D425483C2C5C9F49B5B7A41F8944154701000FAB@postal.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-
> owner(at)postgresql(dot)org] On Behalf Of Tom Lane
> Sent: Friday, July 25, 2008 12:32 PM
> To: Ryan Bradetich
> Cc: Gregory Stark; pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] [RFC] Unsigned integer support.
>
> "Ryan Bradetich" <rbradetich(at)gmail(dot)com> writes:
> > On Fri, Jul 25, 2008 at 3:57 AM, Gregory Stark
> <stark(at)enterprisedb(dot)com> wrote:
> >> "Ryan Bradetich" <rbradetich(at)gmail(dot)com> writes:
> >>> My plans for the example above would be:
> >>>
> >>> 1. SELECT 1500000000 + 1500000000 --> Throws overflow error.
> >>> 2. SELECT 1500000000::uint4 + 1500000000 --> Returns
> 3000000000::uint4.
> >>
> >> You could make it work by having a uint4+int4 operator which
returns
> uint4 but
> >> then you're going to need a *lot* of operators....
>
> > This was my plan.
>
> Like he says, it's a *lot* of operators, and the point doesn't seem
> entirely clear to me. You'll still have overflow cases, they'll just
> be
> in different places.
>
> Consider the idea of not having any uint4-specific arithmetic
> operators,
> but instead providing the following:
>
> * assignment casts from int4 and int8 to uint4
> (these throw error if out of range, of course)
> * implicit cast from uint4 to int8 (can never fail)
>
> The effect of providing the latter cast would be that any arithmetic
> involving a uint4 column would automatically be done in int8. Which
> would make it a shade slower than a native implementation, but
probably
> not enough slower to be a problem --- and you'd avoid having to write
> dozens of operators and underlying support functions. Storing into
the
> uint4 column would work fine with no extra notation because of the
> assignment casts.
>
> Moreover, you'd avoid cluttering the system with a pile of cross-type
> operators, which we have recently realized are not a good thing,
> because
> they increase the likelihood of "ambiguous operator" problems --- see
> http://archives.postgresql.org/pgsql-hackers/2008-06/msg00750.php
>
> For uint8 you'd have to promote to numeric to guarantee no failure
> in the implicit cast; which is going to be a rather bigger performance
> hit, but I don't really see uint8 as being a type with huge demand.
>
> Now you probably *will* want cross-type comparison operators, if you
> are going to support indexing of unsigned columns, so that something
> like
> uint4col > 42
> can be indexed without any casting. But limiting yourself to the six
> basic comparison operators certainly makes it a much less bulky
> project.

At the cost of one bit of storage, you have compatible types using
CREATE DOMAIN:

CREATE DOMAIN name [ AS ] data_type
[ DEFAULT expression ]
[ constraint [ ... ] ]

where constraint is:

[ CONSTRAINT constraint_name ]
{ NOT NULL | NULL | CHECK (expression) }

More specifically:

CREATE DOMAIN usmallint AS SMALLINT CHECK(VALUE > 0);
CREATE DOMAIN uinteger AS INTEGER CHECK(VALUE > 0);
CREATE DOMAIN ubigint AS BIGINT CHECK(VALUE > 0);
CREATE DOMAIN unumeric AS NUMERIC CHECK(VALUE > 0);

Seems like a heck of a lot less work to me. Not to mention very easy to
use.

C:\Program Files (x86)\PostgreSQL\8.3\bin>psql -h localhost -U postgres
domaintest
Password for user postgres:
Welcome to psql 8.3.3, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit

Warning: Console code page (437) differs from Windows code page (1252)
8-bit characters might not work correctly. See psql reference
page "Notes for Windows users" for details.

domaintest=# CREATE DOMAIN usmallint AS SMALLINT CHECK(VALUE > 0);
CREATE DOMAIN
domaintest=# CREATE DOMAIN uinteger AS INTEGER CHECK(VALUE > 0);
CREATE DOMAIN
domaintest=# CREATE DOMAIN ubigint AS BIGINT CHECK(VALUE > 0);
CREATE DOMAIN
domaintest=# CREATE DOMAIN unumeric AS NUMERIC CHECK(VALUE > 0);
CREATE DOMAIN
domaintest=#
domaintest=# create table integer_types (
domaintest(# usCol usmallint,
domaintest(# sCol smallint,
domaintest(# uiCol uinteger,
domaintest(# iCol integer,
domaintest(# ubCol ubigint,
domaintest(# bCol bigint,
domaintest(# unCol unumeric,
domaintest(# nCol numeric
domaintest(# );
CREATE TABLE
domaintest=# create index i1 on integer_types(usCol);
CREATE INDEX
domaintest=# create index i2 on integer_types(sCol);
CREATE INDEX
domaintest=# create index i3 on integer_types(uiCol);
CREATE INDEX
domaintest=# create index i4 on integer_types(iCol);
CREATE INDEX
domaintest=# create index i5 on integer_types(ubCol);
CREATE INDEX
domaintest=# create index i6 on integer_types(bCol);
CREATE INDEX
domaintest=# create index i7 on integer_types(unCol);
CREATE INDEX
domaintest=# create index i8 on integer_types(nCol);
CREATE INDEX
domaintest=# insert into integer_types values(1,1,1,1,1,1,1,1);
INSERT 0 1
domaintest=# select * from integer_types;
uscol | scol | uicol | icol | ubcol | bcol | uncol | ncol
-------+------+-------+------+-------+------+-------+------
1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
(1 row)

domaintest=# insert into integer_types (usCol) values (-1);
ERROR: value for domain usmallint violates check constraint
"usmallint_check"
domaintest=#


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Dann Corbit <DCorbit(at)connx(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ryan Bradetich <rbradetich(at)gmail(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 20:11:17
Message-ID: 488A3365.1020208@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dann Corbit wrote:
>
> CREATE DOMAIN usmallint AS SMALLINT CHECK(VALUE > 0);
> CREATE DOMAIN uinteger AS INTEGER CHECK(VALUE > 0);
> CREATE DOMAIN ubigint AS BIGINT CHECK(VALUE > 0);
> CREATE DOMAIN unumeric AS NUMERIC CHECK(VALUE > 0);
>
>
>

s/>/>=/g

cheers

andrew


From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Ryan Bradetich" <rbradetich(at)gmail(dot)com>, "Gregory Stark" <stark(at)enterprisedb(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 20:14:09
Message-ID: D425483C2C5C9F49B5B7A41F8944154701000FAC@postal.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> -----Original Message-----
> From: Andrew Dunstan [mailto:andrew(at)dunslane(dot)net]
> Sent: Friday, July 25, 2008 1:11 PM
> To: Dann Corbit
> Cc: Tom Lane; Ryan Bradetich; Gregory Stark; pgsql-
> hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] [RFC] Unsigned integer support.
>
>
>
> Dann Corbit wrote:
> >
> > CREATE DOMAIN usmallint AS SMALLINT CHECK(VALUE > 0);
> > CREATE DOMAIN uinteger AS INTEGER CHECK(VALUE > 0);
> > CREATE DOMAIN ubigint AS BIGINT CHECK(VALUE > 0);
> > CREATE DOMAIN unumeric AS NUMERIC CHECK(VALUE > 0);
> >
> >
> >
>
> s/>/>=/g

I turned off the default "option" to remove extra line breaks.
Future posts should not be quite as even and bletcherous.
God willing, and the crick don't rise.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Dann Corbit <DCorbit(at)connx(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ryan Bradetich <rbradetich(at)gmail(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 20:28:10
Message-ID: 488A375A.1080405@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dann Corbit wrote:
>> -----Original Message-----
>> From: Andrew Dunstan [mailto:andrew(at)dunslane(dot)net]
>> Sent: Friday, July 25, 2008 1:11 PM
>> To: Dann Corbit
>> Cc: Tom Lane; Ryan Bradetich; Gregory Stark; pgsql-
>> hackers(at)postgresql(dot)org
>> Subject: Re: [HACKERS] [RFC] Unsigned integer support.
>>
>>
>>
>> Dann Corbit wrote:
>>
>>> CREATE DOMAIN usmallint AS SMALLINT CHECK(VALUE > 0);
>>> CREATE DOMAIN uinteger AS INTEGER CHECK(VALUE > 0);
>>> CREATE DOMAIN ubigint AS BIGINT CHECK(VALUE > 0);
>>> CREATE DOMAIN unumeric AS NUMERIC CHECK(VALUE > 0);
>>>
>>>
>>>
>>>
>> s/>/>=/g
>>
>
> I turned off the default "option" to remove extra line breaks.
> Future posts should not be quite as even and bletcherous.
> God willing, and the crick don't rise.
>
>

I suspect you're missing my point, namely that 0 should be an allowed
value for unsigned types.

cheers

andrew


From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Ryan Bradetich" <rbradetich(at)gmail(dot)com>, "Gregory Stark" <stark(at)enterprisedb(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-25 20:29:32
Message-ID: D425483C2C5C9F49B5B7A41F8944154701000FAE@postal.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> -----Original Message-----
> From: Andrew Dunstan [mailto:andrew(at)dunslane(dot)net]
> Sent: Friday, July 25, 2008 1:28 PM
> To: Dann Corbit
> Cc: Tom Lane; Ryan Bradetich; Gregory Stark; pgsql-
> hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] [RFC] Unsigned integer support.
>
>
>
> Dann Corbit wrote:
> >> -----Original Message-----
> >> From: Andrew Dunstan [mailto:andrew(at)dunslane(dot)net]
> >> Sent: Friday, July 25, 2008 1:11 PM
> >> To: Dann Corbit
> >> Cc: Tom Lane; Ryan Bradetich; Gregory Stark; pgsql-
> >> hackers(at)postgresql(dot)org
> >> Subject: Re: [HACKERS] [RFC] Unsigned integer support.
> >>
> >>
> >>
> >> Dann Corbit wrote:
> >>
> >>> CREATE DOMAIN usmallint AS SMALLINT CHECK(VALUE > 0);
> >>> CREATE DOMAIN uinteger AS INTEGER CHECK(VALUE > 0);
> >>> CREATE DOMAIN ubigint AS BIGINT CHECK(VALUE > 0);
> >>> CREATE DOMAIN unumeric AS NUMERIC CHECK(VALUE > 0);
> >>>
> >>>
> >>>
> >>>
> >> s/>/>=/g
> >>
> >
> > I turned off the default "option" to remove extra line breaks.
> > Future posts should not be quite as even and bletcherous.
> > God willing, and the crick don't rise.
> >
> >
>
> I suspect you're missing my point, namely that 0 should be an allowed
> value for unsigned types.

Quite right. The domains I created were really the 'natural numbers'
rather than unsigned types.


From: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-26 03:41:57
Message-ID: e739902b0807252041k5c1def40va54f5196a8207076@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,

On Fri, Jul 25, 2008 at 12:32 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Consider the idea of not having any uint4-specific arithmetic operators,
> but instead providing the following:
>
> * assignment casts from int4 and int8 to uint4
> (these throw error if out of range, of course)
> * implicit cast from uint4 to int8 (can never fail)
>
> The effect of providing the latter cast would be that any arithmetic
> involving a uint4 column would automatically be done in int8. Which
> would make it a shade slower than a native implementation, but probably
> not enough slower to be a problem --- and you'd avoid having to write
> dozens of operators and underlying support functions. Storing into the
> uint4 column would work fine with no extra notation because of the
> assignment casts.

This is an interesting idea that I will test out tonight. I did have
the following
concern looking through src/backend/utils/adt/int8.c: There is code that is
optionally compiled based on the INT64_IS_BUSTED pre-processor define.
Is this pre-processor define something I should worry about for portability
with this plan?

After I get uint types implemented, for fun I might try some benchmarks
to see if I can detect the int8 overhead on a 32-bit system.

> Moreover, you'd avoid cluttering the system with a pile of cross-type
> operators, which we have recently realized are not a good thing, because
> they increase the likelihood of "ambiguous operator" problems --- see
> http://archives.postgresql.org/pgsql-hackers/2008-06/msg00750.php

Good to know. Thanks for the link.

> For uint8 you'd have to promote to numeric to guarantee no failure
> in the implicit cast; which is going to be a rather bigger performance
> hit, but I don't really see uint8 as being a type with huge demand.

Hopefully I will not need the uint8 type. Right now for a project I am
looking at I need the uint2 and uint4 types. uint8 support can come
later if it is needed or requested.

> Now you probably *will* want cross-type comparison operators, if you
> are going to support indexing of unsigned columns, so that something
> like
> uint4col > 42
> can be indexed without any casting. But limiting yourself to the six
> basic comparison operators certainly makes it a much less bulky project.

This sounds excellent! Hopefully by using these operators I will be able to
avoid most of the casting to int8 for my use, while still providing the
complete functionality for this type.

Thanks again for your review and feedback!

- Ryan


From: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>
To: "Dann Corbit" <DCorbit(at)connx(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Gregory Stark" <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-26 03:50:02
Message-ID: e739902b0807252050k49a0635dj5ced5ba1fd45e657@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello Dann,

On Fri, Jul 25, 2008 at 1:06 PM, Dann Corbit <DCorbit(at)connx(dot)com> wrote:

> At the cost of one bit of storage, you have compatible types using

Thanks for your review and feedback! Unfortunately, I do need the full range
of the unsigned types for the project I am looking at. The reason I started
working on these types is because it seemed wasteful to use the next size
larger signed integer for the storage type of the unsigned integer.

Thanks for the suggestion!

- Ryan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Ryan Bradetich" <rbradetich(at)gmail(dot)com>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-26 04:03:28
Message-ID: 7782.1217045008@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Ryan Bradetich" <rbradetich(at)gmail(dot)com> writes:
> ... I did have the following
> concern looking through src/backend/utils/adt/int8.c: There is code that is
> optionally compiled based on the INT64_IS_BUSTED pre-processor define.
> Is this pre-processor define something I should worry about for portability
> with this plan?

I wouldn't worry, really ;-). Five or more years ago, it seemed
important for PG to work on machines without functional int64 support,
but there is little if any evidence that anyone is using current PG
releases on such platforms. I might well be the last active PG hacker
who gives a damn about that case at all, and even I long ago stopped
expecting anything beyond core functionality to work on such a machine.
Since your proposed unsigned types certainly aren't core functionality,
I see no reason that they should need to work on INT64_IS_BUSTED
platforms.

> After I get uint types implemented, for fun I might try some benchmarks
> to see if I can detect the int8 overhead on a 32-bit system.

Right, you need to check that before drinking the kool-aid ...

regards, tom lane


From: Kenneth Marshall <ktm(at)rice(dot)edu>
To: Ryan Bradetich <rbradetich(at)gmail(dot)com>
Cc: Dann Corbit <DCorbit(at)connx(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-26 16:06:21
Message-ID: 20080726160621.GA25131@it.is.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Ryan,

I agree, I have had applications use uint types to avoid using
a larger data type. I have actually had to patch an application
developed for MySQL uint8 to signed int8 on PostgreSQL. In that
case, the only operations that were performed where assignment
and lookup. If we need to use the numeric type for calculations,
that would be okay as long as the data is actually stored as
8-bytes, not numeric. It would certainly increase the ease of
moving an application from MySQL to PostgreSQL.

Cheers,
Ken

On Fri, Jul 25, 2008 at 08:50:02PM -0700, Ryan Bradetich wrote:
> Hello Dann,
>
> On Fri, Jul 25, 2008 at 1:06 PM, Dann Corbit <DCorbit(at)connx(dot)com> wrote:
>
> > At the cost of one bit of storage, you have compatible types using
>
> Thanks for your review and feedback! Unfortunately, I do need the full range
> of the unsigned types for the project I am looking at. The reason I started
> working on these types is because it seemed wasteful to use the next size
> larger signed integer for the storage type of the unsigned integer.
>
> Thanks for the suggestion!
>
> - Ryan
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Decibel! <decibel(at)decibel(dot)org>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, Ryan Bradetich <rbradetich(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-28 21:29:48
Message-ID: 2D6769F9-2315-4C49-A076-961C20518D49@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jul 25, 2008, at 11:44 AM, Alvaro Herrera wrote:
> However, it would be neat if this behaved the same as
>
> alvherre=# select '0.42' + 1;
> ERROR: invalid input syntax for integer: "0.42"
> STATEMENT: select '0.42' + 1;

Do we really want to be making it easier for people to wrap numbers
in quotes?
--
Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Decibel!" <decibel(at)decibel(dot)org>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Ryan Bradetich" <rbradetich(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-28 22:54:28
Message-ID: 87fxptsjq3.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Decibel!" <decibel(at)decibel(dot)org> writes:

> On Jul 25, 2008, at 11:44 AM, Alvaro Herrera wrote:
>> However, it would be neat if this behaved the same as
>>
>> alvherre=# select '0.42' + 1;
>> ERROR: invalid input syntax for integer: "0.42"
>> STATEMENT: select '0.42' + 1;
>
>
> Do we really want to be making it easier for people to wrap numbers in quotes?

Currently wrapping numbers in quotes is really the way Postgres expects to get
them. Quoted constants in Postgres are just "unknown" type which are parsed
according to context. Unquoted constants in Postgres are assigned a type by
the parser based on what they look like and then have to be cast to something
else if that turns out to be the wrong data type.

I think people from other languages expect quoted constants to be strings and
they often go out of their way to avoid them (causing themselves headaches
when they then need to deal with casting rules).

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Decibel!" <decibel(at)decibel(dot)org>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Ryan Bradetich" <rbradetich(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC] Unsigned integer support.
Date: 2008-07-28 23:13:20
Message-ID: 338.1217286800@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> "Decibel!" <decibel(at)decibel(dot)org> writes:
>> Do we really want to be making it easier for people to wrap numbers in quotes?

> Currently wrapping numbers in quotes is really the way Postgres expects to get
> them.

Really?

regression=# select '2' + '2';
ERROR: operator is not unique: unknown + unknown
LINE 1: select '2' + '2';
^
HINT: Could not choose a best candidate operator. You might need to add explicit type casts.

We've worked really hard to get to a point where interpretation of
literals mostly behaves unsurprisingly. I think most people would
find it surprising to think that quoting a number is the preferred
way to represent it.

regards, tom lane