Re: WIP: RangeTypes

Lists: pgsql-hackers
From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: WIP: RangeTypes
Date: 2011-01-11 09:16:47
Message-ID: 1294737407.18031.3651.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Ok, I have made some progress. This is still a proof-of-concept patch,
but the important pieces are working together.

Synopsis:

CREATE TYPE numrange AS RANGE (SUBTYPE=numeric,
SUBTYPE_CMP=numeric_cmp);

SELECT range_eq('[1,2.2)'::numrange,'[1,2.2]');
SELECT range_lbound('(3.7,9]'::numrange);
SELECT range(6.7);
SELECT '-'::numrange; -- empty
SELECT '[1, NULL]'::numrange; -- ] will become )
SELECT '(INF, 3)'::numrange;

I haven't completed many of the other generic functions, because I'd
like to make sure I'm on the right track first. The important thing
about the functions above is that they show ANYRANGE working in
conjunction with ANYELEMENT in various combinations, which was a
significant part of this patch.

Here are the open items:

1. Generic functions -- most of which are fairly obvious. However, I
want to make sure I'm on the right track first.

2. GiST -- I'll need a mechanism to implement the "penalty" function,
and perhaps I'll also need additional support for the picksplit
function. For the "penalty" function, I think I'll need to require a
function to convert the subtype into a float, and I can use that to find
a distance (which can be the penalty). That should also satisfy anything
that picksplit might need.

3. Typmod -- There is still one annoyance about typmod remaining. I need
to treat it like an array in find_typmod_coercion_function(), and then
create a coercion expression. Is it worth it? Would typmod on a range be
confusing, or should I just finish this item up?

4. Docs

5. Tests

6. pg_dump -- should be pretty easy; I just want to settle some of the
other stuff first.

7. Right now the parse function is quite dumb. Is there some example
code I should follow to make sure I get this right?

8. In order to properly support the various combinations of ANYRANGE and
ANYELEMENT in a function definition (which are all important), we need
to be able to determine the range type given a subtype. That means that
each subtype can only have one associated range, which sounds somewhat
limiting, but it can be worked around by using domains. I don't think
this is a major limitation. Comments?

9. Representation -- right now I store the OID of the range type in the
range itself, much like arrays, in order to call the find the functions
to operate on the subtype. Robert has some justifiable concerns about
that 4-byte overhead. Possible ideas:

* Forget about ANYRANGE altogether, and generate new catalog entries
for the generic functions for each new range type defined. I don't
particularly like this approach because it makes it very difficult to
define new generic functions.

* Somehow fix the type system so that we know the specific types of
arguments in all situations. I don't know if this is feasible.

* Store a 8- or 16-bit unique number in pg_range, and store that
number in the representation. That would be pretty ugly, and limit the
total possible range types defined at once, but it saves a couple bytes
per value.

* Try to somehow mimic what records do. Records use a global array and
use the typmod as an index into that array. It looks like a hack to me,
but might be worth borrowing anyway.

Also related to representation:

* Right now I always align the subtypes within the range according to
typalign. I could avoid that by packing the bytes tightly, and then
copying them around later. Suggestions? And what should the overall
alignment of the range type be?

* If it's a fixed-length type, we can save the varlena header byte on
the overall range; but we lose the ability to save space when one of the
boundaries of the range is missing (NULL or INF), and it would
complicate the code a little. Thoughts?

Regards,
Jeff Davis

Attachment Content-Type Size
rangetypes-20110110.patch.gz application/x-gzip 15.5 KB

From: David Fetter <david(at)fetter(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-11 19:13:34
Message-ID: 20110111191334.GB11603@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 11, 2011 at 01:16:47AM -0800, Jeff Davis wrote:
> Ok, I have made some progress. This is still a proof-of-concept patch,
> but the important pieces are working together.
>
> Synopsis:
>
> CREATE TYPE numrange AS RANGE (SUBTYPE=numeric,
> SUBTYPE_CMP=numeric_cmp);
>
> SELECT range_eq('[1,2.2)'::numrange,'[1,2.2]');
> SELECT range_lbound('(3.7,9]'::numrange);
> SELECT range(6.7);
> SELECT '-'::numrange; -- empty
> SELECT '[1, NULL]'::numrange; -- ] will become )
> SELECT '(INF, 3)'::numrange;
>
> I haven't completed many of the other generic functions, because I'd
> like to make sure I'm on the right track first. The important thing
> about the functions above is that they show ANYRANGE working in
> conjunction with ANYELEMENT in various combinations, which was a
> significant part of this patch.
>
> Here are the open items:
>
> 1. Generic functions -- most of which are fairly obvious. However, I
> want to make sure I'm on the right track first.
>
> 2. GiST -- I'll need a mechanism to implement the "penalty" function,
> and perhaps I'll also need additional support for the picksplit
> function. For the "penalty" function, I think I'll need to require a
> function to convert the subtype into a float, and I can use that to find
> a distance (which can be the penalty). That should also satisfy anything
> that picksplit might need.
>
> 3. Typmod -- There is still one annoyance about typmod remaining. I need
> to treat it like an array in find_typmod_coercion_function(), and then
> create a coercion expression. Is it worth it? Would typmod on a range be
> confusing, or should I just finish this item up?

Probably not worth it for the first round.

> 4. Docs

Happy to help evenings this week :)

> 5. Tests

Same. What do you have so far?

> 6. pg_dump -- should be pretty easy; I just want to settle some of the
> other stuff first.
>
> 7. Right now the parse function is quite dumb. Is there some example
> code I should follow to make sure I get this right?

KISS is a fine principle. Do you really need it smart on the first
round? :)

> 8. In order to properly support the various combinations of ANYRANGE and
> ANYELEMENT in a function definition (which are all important), we need
> to be able to determine the range type given a subtype. That means that
> each subtype can only have one associated range, which sounds somewhat
> limiting, but it can be worked around by using domains. I don't think
> this is a major limitation. Comments?

As we get a more nuanced type system, this is one of the things that
will need to get reworked, so I'd say it's better not to put too much
effort into things that a refactor of the type system
<http://wiki.postgresql.org/wiki/Refactor_Type_System> would make much
better, at least right now.

> Also related to representation:
>
> * Right now I always align the subtypes within the range according to
> typalign. I could avoid that by packing the bytes tightly, and then
> copying them around later. Suggestions? And what should the overall
> alignment of the range type be?

For the first cut, the simplest possible.

> * If it's a fixed-length type, we can save the varlena header byte on
> the overall range; but we lose the ability to save space when one of the
> boundaries of the range is missing (NULL or INF), and it would
> complicate the code a little. Thoughts?

Probably not worth complicating the code at this stage. KISS again :)

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-12 17:19:26
Message-ID: 1294852766.18031.3682.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2011-01-11 at 11:13 -0800, David Fetter wrote:
> > 3. Typmod -- There is still one annoyance about typmod remaining. I need
> > to treat it like an array in find_typmod_coercion_function(), and then
> > create a coercion expression. Is it worth it? Would typmod on a range be
> > confusing, or should I just finish this item up?
>
> Probably not worth it for the first round.

OK, I'll block typmods for range types for now.

> > 4. Docs
>
> Happy to help evenings this week :)
>
> > 5. Tests
>
> Same. What do you have so far?

Great!

I think the best tests would be around the ANYRANGE type mechanism to
see if anything seems wrong or limiting. Particularly, its interaction
with ANYELEMENT.

> > 7. Right now the parse function is quite dumb. Is there some example
> > code I should follow to make sure I get this right?
>
> KISS is a fine principle. Do you really need it smart on the first
> round? :)

Well, it needs to be correct ;)

Specifically, I think there will be a problem if there is a multibyte
character following a backslash. There may be other problems, as well. I
could probably get these fixed, but it might be better to follow
patterns in other code. I'll look into it.

> > 8. In order to properly support the various combinations of ANYRANGE and
> > ANYELEMENT in a function definition (which are all important), we need
> > to be able to determine the range type given a subtype. That means that
> > each subtype can only have one associated range, which sounds somewhat
> > limiting, but it can be worked around by using domains. I don't think
> > this is a major limitation. Comments?
>
> As we get a more nuanced type system, this is one of the things that
> will need to get reworked, so I'd say it's better not to put too much
> effort into things that a refactor of the type system
> <http://wiki.postgresql.org/wiki/Refactor_Type_System> would make much
> better, at least right now.

Sounds good. I don't think this is an actual problem, so I'll consider
this a non-issue unless someone else has a comment.

> > Also related to representation:
> >
> > * Right now I always align the subtypes within the range according to
> > typalign. I could avoid that by packing the bytes tightly, and then
> > copying them around later. Suggestions? And what should the overall
> > alignment of the range type be?
>
> For the first cut, the simplest possible.

OK. It's already about as simple as it can get, but might be fairly
wasteful.

> > * If it's a fixed-length type, we can save the varlena header byte on
> > the overall range; but we lose the ability to save space when one of the
> > boundaries of the range is missing (NULL or INF), and it would
> > complicate the code a little. Thoughts?
>
> Probably not worth complicating the code at this stage. KISS again :)

OK.

Regards,
Jeff Davis


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-14 08:27:04
Message-ID: 1294993624.18031.3696.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Updated patch.

Summary of changes:

* More generic functions

* pg_dump support

* remove typmod support until it can be done correctly

* added some tests

There is still quite a bit left, including (numbers match up with
previous TODO list):

1. Generic functions -- still more work to do here. Handling the
combination of continuous range semantics with NULLs requires quite a
lot of special cases, because it's hard to share code among functions.
Even something as simple as "equals" is not as trivial as it sounds.
Perhaps I'm missing some cleaner abstractions, or perhaps I'm
over-thinking the null semantics.

3. perhaps fix typmod

4. documentation

5. more tests

7. better parser

Regards,
Jeff Davis

Attachment Content-Type Size
rangetypes-20110114.patch.gz application/x-gzip 18.7 KB

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-17 21:09:26
Message-ID: 1295298566.11513.27.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

When defining generic range functions, there is quite a bit of extra
complexity needed to handle special cases.

The special cases are due to:
* empty ranges
* ranges with infinite boundaries
* ranges with NULL boundaries
* ranges with exclusive bounds (e.g. "(" or ")").

Infinite bounds, and exclusive bounds can both be handled somewhat
reasonably, and the complexity can be somewhat hidden. Empty ranges are
a special case, but can be handled at the top of the generic function in
a straightforward way.

NULL bounds, however, have been causing me a little frustration. A
reasonable interpretation of boolean operators that operate on ranges
might be: "true or false if we can prove it from only the inputs; else
NULL". This gets a little interesting because a NULL value as a range
boundary isn't 100% unknown: it's known to be on one side of the other
bound (assuming that the other side is known). This is similar to how
AND and OR behave for NULL. For instance, take the simple definition of
"contains":

r1.a <= r2.a AND r1.b >= r2.b

(where "a" is the lower bound and "b" is the upper)

Consider r1: [NULL, 10], r2: [20, NULL]. Contains should return "false"
according to our rule above, because no matter what the values of r1.a
and r2.b, the ranges can't possibly overlap.

So, now, more complexity needs to be added. We can be more redundant and
do:

r1.a <= r2.a AND r1.b <= r2 AND r1.a <= r2.b AND r1.b >= r2.a

That seems a little error-prone and harder to understand.

Then, when we have functions that operate on ranges and return ranges,
we're not dealing with 3VL exactly, but some other intuition about what
NULL should do. The semantics get a lot more complicated and hard to
reason about. For instance, what about:
(NULL, 5) INTERSECT (3, NULL)
Should that evaluate to NULL, (NULL, NULL), or throw an error? What
about:
(NULL, 5) MINUS (NULL, 7)
(NULL, 5) MINUS (3, NULL)

I feel like I'm making this too complicated. Should I just scope out
NULL range boundaries for the first cut, and leave room in the
representation so that it can be added when there is a more thorough
proposal for NULL range boundaries?

Regards,
Jeff Davis


From: David Fetter <david(at)fetter(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-17 21:23:31
Message-ID: 20110117212331.GE18561@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 17, 2011 at 01:09:26PM -0800, Jeff Davis wrote:
> When defining generic range functions, there is quite a bit of extra
> complexity needed to handle special cases.
>
> The special cases are due to:
> * empty ranges
> * ranges with infinite boundaries
> * ranges with NULL boundaries
> * ranges with exclusive bounds (e.g. "(" or ")").
>
> Infinite bounds, and exclusive bounds can both be handled somewhat
> reasonably, and the complexity can be somewhat hidden. Empty ranges
> are a special case, but can be handled at the top of the generic
> function in a straightforward way.
>
> NULL bounds, however, have been causing me a little frustration.
> [Explanation and illustrations].

In that case, let's leave them out for this cut.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-17 23:59:14
Message-ID: 24200.1295308754@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> I feel like I'm making this too complicated. Should I just scope out
> NULL range boundaries for the first cut, and leave room in the
> representation so that it can be added when there is a more thorough
> proposal for NULL range boundaries?

+1. I'm far from convinced that a null boundary is sane at all.
If you don't know the value, how do you know it's greater/less than the
other bound?

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-18 00:09:12
Message-ID: 4D34DA28.4000703@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/17/11 1:09 PM, Jeff Davis wrote:
> I feel like I'm making this too complicated. Should I just scope out
> NULL range boundaries for the first cut, and leave room in the
> representation so that it can be added when there is a more thorough
> proposal for NULL range boundaries?

Well, NULL range boundaries aren't usable with Temporal, and yet I wrote
a whole scheduling application around it. So I think it's OK to have
them as a TODO and raise an error for now. Heck, we had arrays which
didn't accept NULLs for years.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-20 09:29:06
Message-ID: 1295515746.11513.100.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

New patch. I added a lot of generic range functions, and a lot of
operators.

There is still more work to do, this is just an updated patch. The
latest can be seen on the git repository, as well:

http://git.postgresql.org/gitweb?p=users/jdavis/postgres.git;a=log;h=refs/heads/rangetypes

Regards,
Jeff Davis

Attachment Content-Type Size
rangetypes-20110119.gz application/x-gzip 21.2 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-21 17:31:43
Message-ID: AANLkTi=Uf25iex0AQc0Sfb3WHzbE13KL0p21js6+zkyi@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 20, 2011 at 4:29 AM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> New patch. I added a lot of generic range functions, and a lot of
> operators.
>
> There is still more work to do, this is just an updated patch. The
> latest can be seen on the git repository, as well:

So is this 9.2 material at this point?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-21 19:30:42
Message-ID: 1295638242.4680.14.camel@jdavis-ux.asterdata.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2011-01-21 at 12:31 -0500, Robert Haas wrote:
> On Thu, Jan 20, 2011 at 4:29 AM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> > New patch. I added a lot of generic range functions, and a lot of
> > operators.
> >
> > There is still more work to do, this is just an updated patch. The
> > latest can be seen on the git repository, as well:
>
> So is this 9.2 material at this point?

Regardless of whether it's eligible to be in 9.1, I plan to keep working
on it.

I would appreciate some overall feedback during this commitfest. Much of
the code is there, so it would be helpful if we could settle issues like
representation, functionality, interface, catalog, API, grammar, and
naming. Otherwise, those issues will just be a reason to bounce it from
commitfest-next, as well.

Regards,
Jeff Davis


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-22 04:28:18
Message-ID: AANLkTin7wff8Z5US70TM2fprk=1o_kRVd1WmxpKd1beh@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 21, 2011 at 2:30 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Fri, 2011-01-21 at 12:31 -0500, Robert Haas wrote:
>> On Thu, Jan 20, 2011 at 4:29 AM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>> > New patch. I added a lot of generic range functions, and a lot of
>> > operators.
>> >
>> > There is still more work to do, this is just an updated patch. The
>> > latest can be seen on the git repository, as well:
>>
>> So is this 9.2 material at this point?
>
> Regardless of whether it's eligible to be in 9.1, I plan to keep working
> on it.
>
> I would appreciate some overall feedback during this commitfest. Much of
> the code is there, so it would be helpful if we could settle issues like
> representation, functionality, interface, catalog, API, grammar, and
> naming. Otherwise, those issues will just be a reason to bounce it from
> commitfest-next, as well.

Agreed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-28 07:45:30
Message-ID: 1296200730.11513.467.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Updated patch.

Changes:

* Documentation for operators/functions
* a comprehensive set of operators and functions
* BTree opclass
* Hash opclass
* built-in range types:
- PERIOD (timestamp)
- PERIODTZ (timestamptz)
- DATERANGE (date)
- INTRANGE (int4)
- NUMRANGE (numeric)
* added subtype float function to the API, which will be useful for
GiST
* created canonical functions for intrange and daterange, so that:
'[1,5]'::intrange = '[1,6)'::intrange
* added length() function, written in SQL as:
select upper($1) - lower($1)
which uses polymorphic "-" operator to avoid the need to
give the subtype subtract function and return type to the generic
API

Open items:

* More documentation work
* Settle any representation/alignment concerns
* Should the new length() function be marked as immutable, stable,
or volatile? It uses the polymorphic "-" operator, and I suppose
someone could define a non-immutable version of that before calling
length(). Then again, it is likely to be inlined anyway, right?
* GiST
- docs
- catalog work
- implementation
* typmod support (optional)

This is nearing completion. GiST is by far the most amount of effort
remaining that I'm aware of. Comments about the API, naming,
representation, interface, funcationality, grammar, etc. are welcome.

Regards,
Jeff Davis

Attachment Content-Type Size
rangetypes-20110127.patch.gz application/x-gzip 33.8 KB

From: David Fetter <david(at)fetter(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-28 17:17:16
Message-ID: 20110128171716.GC16469@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 27, 2011 at 11:45:30PM -0800, Jeff Davis wrote:
> Updated patch.
>
> Changes:
>
> * Documentation for operators/functions
> * a comprehensive set of operators and functions
> * BTree opclass

Yay!

> * Hash opclass
> * built-in range types:
> - PERIOD (timestamp)
> - PERIODTZ (timestamptz)

For consistency, and in order not to continue our atrocious naming
tradition, I'd like to propose that the above be named timestamprange
(tsrange for short) and timestamptzrange (tstzrange for short).

> - DATERANGE (date)

Yay!

> - INTRANGE (int4)

int4range/intrange and the missing bigintrange/int8range

> - NUMRANGE (numeric)

numericrange/numrange.

Should there also be a timerange and a timetzrange?

> * added subtype float function to the API, which will be useful for
> GiST

w00t!

> * created canonical functions for intrange and daterange, so that:
> '[1,5]'::intrange = '[1,6)'::intrange

Excellent!

> * added length() function, written in SQL as:
> select upper($1) - lower($1)
> which uses polymorphic "-" operator to avoid the need to
> give the subtype subtract function and return type to the generic
> API
>
> Open items:
>
> * More documentation work
> * Settle any representation/alignment concerns
> * Should the new length() function be marked as immutable, stable,
> or volatile? It uses the polymorphic "-" operator, and I suppose
> someone could define a non-immutable version of that before calling
> length(). Then again, it is likely to be inlined anyway, right?
> * GiST
> - docs
> - catalog work
> - implementation
> * typmod support (optional)
>
> This is nearing completion. GiST is by far the most amount of effort
> remaining that I'm aware of. Comments about the API, naming,
> representation, interface, funcationality, grammar, etc. are welcome.
>
> Regards,
> Jeff Davis

I'd offer to help, but personal matters press this weekend :)

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-28 17:48:52
Message-ID: 1296236932.11513.496.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2011-01-28 at 09:17 -0800, David Fetter wrote:
> For consistency, and in order not to continue our atrocious naming
> tradition, I'd like to propose that the above be named timestamprange
> (tsrange for short) and timestamptzrange (tstzrange for short).

No real objection, but I'd like to see if someone else will second it.

Also, I don't think aliases are very easy to define. They appear to all
be special cases in the backend code, without catalog support. Should I
use domains? If not, I think we'll have to stick to one name.

> > - INTRANGE (int4)
>
> int4range/intrange and the missing bigintrange/int8range

I thought about adding int8range, and the first time around that's what
I tried. But then I realized that the literal "4" is interpreted as an
int4, meaning that "range(1,10)" would be interpreted as int4range, so
int8range was slightly annoying to use because you have to cast the
literals.

Also, the storage is not particularly efficient right now anyway, so if
you need int8range, you could probably use numrange instead.

I don't mind either way. If you think someone will use it, I'll add it.

> Should there also be a timerange and a timetzrange?

I thought about it, and I realized that I've never seen the "time" type
used. Again, I'll add it if someone will use it.

Keep in mind that it's fairly easy for people to add their own range
types. The most difficult part is defining the "canonical" function if
it is applicable, and the "subtype_float" function which is necessary
for GiST.

Regards,
Jeff Davis


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-28 18:29:03
Message-ID: 22773.1296239343@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> On Fri, 2011-01-28 at 09:17 -0800, David Fetter wrote:
>> For consistency, and in order not to continue our atrocious naming
>> tradition, I'd like to propose that the above be named timestamprange
>> (tsrange for short) and timestamptzrange (tstzrange for short).

> No real objection, but I'd like to see if someone else will second it.

> Also, I don't think aliases are very easy to define.

They are not, and should be avoided. I don't think we have *any*
typename aliases except for cases required by SQL standard.

>> Should there also be a timerange and a timetzrange?

> I thought about it, and I realized that I've never seen the "time" type
> used. Again, I'll add it if someone will use it.

I have no idea what the semantics of timetzrange would be. Even
timerange would be a bit funny --- is 11PM before or after 1AM?

regards, tom lane


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-28 18:41:35
Message-ID: A217BFF8-8E8B-4056-954E-15FE5D7D2985@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jan 28, 2011, at 9:48 AM, Jeff Davis wrote:

> On Fri, 2011-01-28 at 09:17 -0800, David Fetter wrote:
>> For consistency, and in order not to continue our atrocious naming
>> tradition, I'd like to propose that the above be named timestamprange
>> (tsrange for short) and timestamptzrange (tstzrange for short).
>
> No real objection, but I'd like to see if someone else will second it.

+1 in principal. I think we should try to avoid the user of the term "period" if possible, and I see definite benefits to a simple model of $typename . 'range';

> Keep in mind that it's fairly easy for people to add their own range
> types. The most difficult part is defining the "canonical" function if
> it is applicable, and the "subtype_float" function which is necessary
> for GiST.

Is there GIN support? GIN seems to be the preferred index type for this sort of thing, no?

Best,

David


From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-28 19:15:04
Message-ID: 87oc70n8lj.fsf@cbbrowne.afilias-int.info
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

pgsql(at)j-davis(dot)com (Jeff Davis) writes:

> On Fri, 2011-01-28 at 09:17 -0800, David Fetter wrote:
>> For consistency, and in order not to continue our atrocious naming
>> tradition, I'd like to propose that the above be named timestamprange
>> (tsrange for short) and timestamptzrange (tstzrange for short).
>
> No real objection, but I'd like to see if someone else will second it.
>
> Also, I don't think aliases are very easy to define. They appear to all
> be special cases in the backend code, without catalog support. Should I
> use domains? If not, I think we'll have to stick to one name.

Somehow, rangets, rangetstz seem better to me, but that's not a deep
issue. I'm not certain of the basis for *truly* preferring an ordering
of the components (ts/timestamp, tz, range). As long as it's rational,
and not too terribly inconsistent with other prefix/suffix handlings,
I'm fine with it.

Mind you, timestamptzrange seems a mite *long* to me.

>> > - INTRANGE (int4)
>>
>> int4range/intrange and the missing bigintrange/int8range
>
> I thought about adding int8range, and the first time around that's what
> I tried. But then I realized that the literal "4" is interpreted as an
> int4, meaning that "range(1,10)" would be interpreted as int4range, so
> int8range was slightly annoying to use because you have to cast the
> literals.
>
> Also, the storage is not particularly efficient right now anyway, so if
> you need int8range, you could probably use numrange instead.
>
> I don't mind either way. If you think someone will use it, I'll add it.

Making sure it's consistent with int4, int8, bigint sure seems like a
good idea.

>> Should there also be a timerange and a timetzrange?
>
> I thought about it, and I realized that I've never seen the "time" type
> used. Again, I'll add it if someone will use it.
>
> Keep in mind that it's fairly easy for people to add their own range
> types. The most difficult part is defining the "canonical" function if
> it is applicable, and the "subtype_float" function which is necessary
> for GiST.

I don't see much use for "time"; it is *so* likely that you'll need date
overlaps that it's difficult for it to be useful without making it
extremely magical (e.g. - stowing a lot of logic inside that adds in
date information behind the scenes).

FYI, it's compiling and testing fine for me. This one strikes me as an
exciting change, once GIST is in place. Well, actually, even without it :-).

postgres(at)localhost-> insert into foo (dr) values ('[2010-01-01,2011-12-31)');
INSERT 0 1
postgres(at)localhost-> select * from foo;
id | dr
----+----------------------------
1 | [ 2010-01-01, 2011-12-31 )
(1 row)
--
let name="cbbrowne" and tld="gmail.com" in String.concat "@" [name;tld];;
http://linuxfinances.info/info/rdbms.html
If vegetarians eat vegetables, what do humanitarians eat?


From: Thom Brown <thom(at)linux(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-28 20:28:55
Message-ID: AANLkTimxBpR2CT8uqxyV0oJ6DOmVvi4EkfMruyjk89e5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28 January 2011 07:45, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> Updated patch.
>
> Changes:
>
>  * Documentation for operators/functions
>  * a comprehensive set of operators and functions
>  * BTree opclass
>  * Hash opclass
>  * built-in range types:
>    - PERIOD (timestamp)
>    - PERIODTZ (timestamptz)
>    - DATERANGE (date)
>    - INTRANGE (int4)
>    - NUMRANGE (numeric)
>  * added subtype float function to the API, which will be useful for
>    GiST
>  * created canonical functions for intrange and daterange, so that:
>      '[1,5]'::intrange = '[1,6)'::intrange
>  * added length() function, written in SQL as:
>      select upper($1) - lower($1)
>    which uses polymorphic "-" operator to avoid the need to
>    give the subtype subtract function and return type to the generic
>    API
>
> Open items:
>
>  * More documentation work
>  * Settle any representation/alignment concerns
>  * Should the new length() function be marked as immutable, stable,
>    or volatile? It uses the polymorphic "-" operator, and I suppose
>    someone could define a non-immutable version of that before calling
>    length(). Then again, it is likely to be inlined anyway, right?
>  * GiST
>    - docs
>    - catalog work
>    - implementation
>  * typmod support (optional)
>
> This is nearing completion. GiST is by far the most amount of effort
> remaining that I'm aware of. Comments about the API, naming,
> representation, interface, funcationality, grammar, etc. are welcome.
>
> Regards,
>        Jeff Davis

Very nice work Jeff!

This is not very graceful:

postgres=# CREATE TYPE numrange AS RANGE (SUBTYPE=numeric,
SUBTYPE_CMP=numeric_cmp);
ERROR: duplicate key value violates unique constraint
"pg_range_rgnsubtype_index"
DETAIL: Key (rngsubtype)=(1700) already exists.

Also, if I try the same, but with a different name for the type, I get
the same error. Why does that restriction exist? Can't you have
types which happen to use the exact same subtype?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


From: Thom Brown <thom(at)linux(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-28 21:52:29
Message-ID: AANLkTi=SFcJS4e30b6UPfv_J6qKeUeLXfBcZ8rSXO_dK@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28 January 2011 20:28, Thom Brown <thom(at)linux(dot)com> wrote:
> On 28 January 2011 07:45, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>> Updated patch.
>>
>> Changes:
>>
>>  * Documentation for operators/functions
>>  * a comprehensive set of operators and functions
>>  * BTree opclass
>>  * Hash opclass
>>  * built-in range types:
>>    - PERIOD (timestamp)
>>    - PERIODTZ (timestamptz)
>>    - DATERANGE (date)
>>    - INTRANGE (int4)
>>    - NUMRANGE (numeric)
>>  * added subtype float function to the API, which will be useful for
>>    GiST
>>  * created canonical functions for intrange and daterange, so that:
>>      '[1,5]'::intrange = '[1,6)'::intrange
>>  * added length() function, written in SQL as:
>>      select upper($1) - lower($1)
>>    which uses polymorphic "-" operator to avoid the need to
>>    give the subtype subtract function and return type to the generic
>>    API
>>
>> Open items:
>>
>>  * More documentation work
>>  * Settle any representation/alignment concerns
>>  * Should the new length() function be marked as immutable, stable,
>>    or volatile? It uses the polymorphic "-" operator, and I suppose
>>    someone could define a non-immutable version of that before calling
>>    length(). Then again, it is likely to be inlined anyway, right?
>>  * GiST
>>    - docs
>>    - catalog work
>>    - implementation
>>  * typmod support (optional)
>>
>> This is nearing completion. GiST is by far the most amount of effort
>> remaining that I'm aware of. Comments about the API, naming,
>> representation, interface, funcationality, grammar, etc. are welcome.
>>
>> Regards,
>>        Jeff Davis
>
> Very nice work Jeff!
>
> This is not very graceful:
>
> postgres=#  CREATE TYPE numrange AS RANGE (SUBTYPE=numeric,
>   SUBTYPE_CMP=numeric_cmp);
> ERROR:  duplicate key value violates unique constraint
> "pg_range_rgnsubtype_index"
> DETAIL:  Key (rngsubtype)=(1700) already exists.
>
> Also, if I try the same, but with a different name for the type, I get
> the same error.  Why does that restriction exist?  Can't you have
> types which happen to use the exact same subtype?

Also, how do you remove a range type which coincides with a system
range type. For example:

postgres=# CREATE TYPE numrange AS RANGE (SUBTYPE=interval,
SUBTYPE_CMP=interval_cmp);
CREATE TYPE
postgres=# drop type numrange;
ERROR: cannot drop type numrange because it is required by the database system

Is this because I shouldn't have been able to create this type in the
first place?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Thom Brown <thom(at)linux(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-29 18:52:09
Message-ID: 1296327129.11513.504.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2011-01-28 at 21:52 +0000, Thom Brown wrote:
> > This is not very graceful:
> >
> > postgres=# CREATE TYPE numrange AS RANGE (SUBTYPE=numeric,
> > SUBTYPE_CMP=numeric_cmp);
> > ERROR: duplicate key value violates unique constraint
> > "pg_range_rgnsubtype_index"
> > DETAIL: Key (rngsubtype)=(1700) already exists.

You're right, that should be a much nicer error message.

> > Also, if I try the same, but with a different name for the type, I get
> > the same error. Why does that restriction exist? Can't you have
> > types which happen to use the exact same subtype?

At first, that's how I designed it. Then, I realized that the type
system needs to know the range type from the element type in order for
something like ANYRANGE to work.

There's a workaround though: create a domain over numeric, and then
create a range over mynumeric.

=# create domain mynumeric as numeric;
CREATE DOMAIN
=# create type numrange2 as range (subtype=numeric,
subtype_cmp=numeric_cmp);
ERROR: duplicate key value violates unique constraint
"pg_range_rgnsubtype_index"
DETAIL: Key (rngsubtype)=(1700) already exists.
=# create type numrange2 as range (subtype=mynumeric,
subtype_cmp=numeric_cmp);
CREATE TYPE
=# select range(1.1::mynumeric,2.2::mynumeric);
range
--------------
[ 1.1, 2.2 )
(1 row)

> Also, how do you remove a range type which coincides with a system
> range type. For example:
>
> postgres=# CREATE TYPE numrange AS RANGE (SUBTYPE=interval,
> SUBTYPE_CMP=interval_cmp);
> CREATE TYPE
> postgres=# drop type numrange;
> ERROR: cannot drop type numrange because it is required by the database system
>
> Is this because I shouldn't have been able to create this type in the
> first place?

The types are in two different schemas. It's just as though you created
a table called pg_class.

To drop the one you created, do:
DROP TYPE public.numrange;

Regards,
Jeff Davis


From: Thom Brown <thom(at)linux(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-29 18:57:33
Message-ID: AANLkTin_VUG3a8PTmpM653kyzEB0YESiRznW7CV2EHdx@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 29 January 2011 18:52, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Fri, 2011-01-28 at 21:52 +0000, Thom Brown wrote:
>> Also, how do you remove a range type which coincides with a system
>> range type.  For example:
>>
>> postgres=#  CREATE TYPE numrange AS RANGE (SUBTYPE=interval,
>>    SUBTYPE_CMP=interval_cmp);
>> CREATE TYPE
>> postgres=# drop type numrange;
>> ERROR:  cannot drop type numrange because it is required by the database system
>>
>> Is this because I shouldn't have been able to create this type in the
>> first place?
>
> The types are in two different schemas. It's just as though you created
> a table called pg_class.
>
> To drop the one you created, do:
>  DROP TYPE public.numrange;

*facepalm* Of course. :) My bad.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-29 18:57:54
Message-ID: 1296327474.11513.509.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2011-01-28 at 10:41 -0800, David E. Wheeler wrote:
> +1 in principal. I think we should try to avoid the user of the term
> "period" if possible, and I see definite benefits to a simple model of
> $typename . 'range';

Interesting, I didn't realize that PERIOD was such an undesirable type
name.

> Is there GIN support? GIN seems to be the preferred index type for
> this sort of thing, no?

GiST is the natural index access method if we approach ranges as a
spatial type. I don't quite know what you have in mind for GIN; what
keys would you extract from the value '[1.23,4.56)' ?

Regards,
Jeff Davis


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-29 19:00:42
Message-ID: 9B3EF226-7C00-4216-8CEB-7CFEE68DBDAE@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jan 29, 2011, at 10:57 AM, Jeff Davis wrote:

> On Fri, 2011-01-28 at 10:41 -0800, David E. Wheeler wrote:
>> +1 in principal. I think we should try to avoid the user of the term
>> "period" if possible, and I see definite benefits to a simple model of
>> $typename . 'range';
>
> Interesting, I didn't realize that PERIOD was such an undesirable type
> name.

It's not *hugely* undesirable. I just tend to think that "range" is more so.

>> Is there GIN support? GIN seems to be the preferred index type for
>> this sort of thing, no?
>
> GiST is the natural index access method if we approach ranges as a
> spatial type. I don't quite know what you have in mind for GIN; what
> keys would you extract from the value '[1.23,4.56)' ?

I think I'm just revealing my ignorance of these index types and what they're good for. My impression has been that GIN was a better but less-full-featured alternative to GiST and getting better with Tom's recent fixes for its handling of NULLs. But, uh, obviously not.

Best,

David


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Chris Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-29 19:05:22
Message-ID: 1296327922.11513.517.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2011-01-28 at 14:15 -0500, Chris Browne wrote:
> Mind you, timestamptzrange seems a mite *long* to me.

Right. I think we might need to compromise here an use some shorter
names. tsrange/tstzrange/numrange seem reasonable to me.

> Making sure it's consistent with int4, int8, bigint sure seems like a
> good idea.

OK, I'll change intrange to int4range, and add int8range. int2range
doesn't seem useful, though.

Regards,
Jeff Davis


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-29 19:10:37
Message-ID: 1296328237.11513.522.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2011-01-29 at 11:00 -0800, David E. Wheeler wrote:
> I think I'm just revealing my ignorance of these index types and what
> they're good for. My impression has been that GIN was a better but
> less-full-featured alternative to GiST and getting better with Tom's
> recent fixes for its handling of NULLs. But, uh, obviously not.

The idea of GIN is that you store multiple entries for each tuple you
insert. So, inserting a tuple containing the document 'hello world'
would store the keys "hello" and "world" both pointing back to that
tuple. It also makes sense for arrays.

But ranges are arbitrarily long, and don't have any defined "step", so
that means an infinite number of keys. GiST works better for that.

Regards,
Jeff Davis


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-29 19:42:13
Message-ID: 14950.1296330133@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> On Fri, 2011-01-28 at 21:52 +0000, Thom Brown wrote:
> Also, if I try the same, but with a different name for the type, I get
> the same error. Why does that restriction exist? Can't you have
> types which happen to use the exact same subtype?

> At first, that's how I designed it. Then, I realized that the type
> system needs to know the range type from the element type in order for
> something like ANYRANGE to work.

That seems like a fairly bad restriction. In a datatype with multiple
useful sort orderings, it'd be desirable to be able to create a range
type for each such ordering, no? I'd be inclined to think of a range
type as being defined by element type plus a btree opfamily. Maybe it'd
be okay to insist on that combination as being unique.

regards, tom lane


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thom Brown <thom(at)linux(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-29 19:53:14
Message-ID: 1296330794.11513.527.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2011-01-29 at 14:42 -0500, Tom Lane wrote:
> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> > On Fri, 2011-01-28 at 21:52 +0000, Thom Brown wrote:
> > Also, if I try the same, but with a different name for the type, I get
> > the same error. Why does that restriction exist? Can't you have
> > types which happen to use the exact same subtype?
>
> > At first, that's how I designed it. Then, I realized that the type
> > system needs to know the range type from the element type in order for
> > something like ANYRANGE to work.
>
> That seems like a fairly bad restriction. In a datatype with multiple
> useful sort orderings, it'd be desirable to be able to create a range
> type for each such ordering, no? I'd be inclined to think of a range
> type as being defined by element type plus a btree opfamily. Maybe it'd
> be okay to insist on that combination as being unique.

I couldn't find another way to make a function with a definition like:

range(ANYELEMENT, ANYELEMENT) returns ANYRANGE

work. And it seemed worse to live without a constructor like that.
Ideas?

Also, it's not based on the btree opfamily right now. It's just based on
a user-supplied compare function. I think I could change it to store the
opfamily instead, if you think that's a better idea.

Regards,
Jeff Davis


From: Thom Brown <thom(at)linux(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-30 02:55:19
Message-ID: AANLkTi=FcLmw8U9jM6FA-3N5SNzbbEbZje0RAoGZTn6h@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 29 January 2011 19:53, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Sat, 2011-01-29 at 14:42 -0500, Tom Lane wrote:
>> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
>> > On Fri, 2011-01-28 at 21:52 +0000, Thom Brown wrote:
>> > Also, if I try the same, but with a different name for the type, I get
>> > the same error.  Why does that restriction exist?  Can't you have
>> > types which happen to use the exact same subtype?
>>
>> > At first, that's how I designed it. Then, I realized that the type
>> > system needs to know the range type from the element type in order for
>> > something like ANYRANGE to work.
>>
>> That seems like a fairly bad restriction.  In a datatype with multiple
>> useful sort orderings, it'd be desirable to be able to create a range
>> type for each such ordering, no?  I'd be inclined to think of a range
>> type as being defined by element type plus a btree opfamily.  Maybe it'd
>> be okay to insist on that combination as being unique.
>
> I couldn't find another way to make a function with a definition like:
>
>  range(ANYELEMENT, ANYELEMENT) returns ANYRANGE
>
> work. And it seemed worse to live without a constructor like that.
> Ideas?
>
> Also, it's not based on the btree opfamily right now. It's just based on
> a user-supplied compare function. I think I could change it to store the
> opfamily instead, if you think that's a better idea.

Probably ignorance here, but why does the following not work?

postgres=# select '[18,20]'::numrange @> 19;
ERROR: operator does not exist: numrange @> integer
LINE 1: select '[18,20]'::numrange @> 19;
^
HINT: No operator matches the given name and argument type(s). You
might need to add explicit type casts.

I can see both the wiki page on range types and the pg_operator table
appear to indicate this should work:

postgres=# select o.oprname, tl.typname as lefttype, tr.typname as
righttype from pg_operator o left join pg_type tl on o.oprleft =
tl.oid left join pg_type tr on o.oprright = tr.oid where 'anyrange' in
(tl.typname, tr.typname) and oprname = '@>';
oprname | lefttype | righttype
---------+----------+-------------
@> | anyrange | anynonarray
@> | anyrange | anyrange
(2 rows)

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


From: Thom Brown <thom(at)linux(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-30 03:29:41
Message-ID: AANLkTim4Zd6AxCS-focuPr6D+dqcri84QezMxtdfgz_a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 30 January 2011 02:55, Thom Brown <thom(at)linux(dot)com> wrote:
> On 29 January 2011 19:53, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>> On Sat, 2011-01-29 at 14:42 -0500, Tom Lane wrote:
>>> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
>>> > On Fri, 2011-01-28 at 21:52 +0000, Thom Brown wrote:
>>> > Also, if I try the same, but with a different name for the type, I get
>>> > the same error.  Why does that restriction exist?  Can't you have
>>> > types which happen to use the exact same subtype?
>>>
>>> > At first, that's how I designed it. Then, I realized that the type
>>> > system needs to know the range type from the element type in order for
>>> > something like ANYRANGE to work.
>>>
>>> That seems like a fairly bad restriction.  In a datatype with multiple
>>> useful sort orderings, it'd be desirable to be able to create a range
>>> type for each such ordering, no?  I'd be inclined to think of a range
>>> type as being defined by element type plus a btree opfamily.  Maybe it'd
>>> be okay to insist on that combination as being unique.
>>
>> I couldn't find another way to make a function with a definition like:
>>
>>  range(ANYELEMENT, ANYELEMENT) returns ANYRANGE
>>
>> work. And it seemed worse to live without a constructor like that.
>> Ideas?
>>
>> Also, it's not based on the btree opfamily right now. It's just based on
>> a user-supplied compare function. I think I could change it to store the
>> opfamily instead, if you think that's a better idea.
>
> Probably ignorance here, but why does the following not work?
>
> postgres=# select '[18,20]'::numrange @> 19;
> ERROR:  operator does not exist: numrange @> integer
> LINE 1: select '[18,20]'::numrange @> 19;
>                                   ^
> HINT:  No operator matches the given name and argument type(s). You
> might need to add explicit type casts.
>
>
> I can see both the wiki page on range types and the pg_operator table
> appear to indicate this should work:
>
> postgres=# select o.oprname, tl.typname as lefttype, tr.typname as
> righttype from pg_operator o left join pg_type tl on o.oprleft =
> tl.oid left join pg_type tr on o.oprright = tr.oid where 'anyrange' in
> (tl.typname, tr.typname) and oprname = '@>';
>  oprname | lefttype |  righttype
> ---------+----------+-------------
>  @>      | anyrange | anynonarray
>  @>      | anyrange | anyrange
> (2 rows)

As for docs, anyrange will need mentioning as part of the information
about polymorphic types:
http://developer.postgresql.org/pgdocs/postgres/extend-type-system.html

And on the pseudo-types page:
http://developer.postgresql.org/pgdocs/postgres/datatype-pseudo.html

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-30 21:33:07
Message-ID: 1296423187.11513.608.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

[ trying a third time to send this message, apparently there were
infrastructure problems before ]

On Sun, 2011-01-30 at 02:55 +0000, Thom Brown wrote:
> postgres=# select '[18,20]'::numrange @> 19;
> ERROR: operator does not exist: numrange @> integer
> LINE 1: select '[18,20]'::numrange @> 19;
> ^
> HINT: No operator matches the given name and argument type(s). You
> might need to add explicit type casts.

It's because it doesn't know the type on the right side, and assumes
it's an int4.

select '[18,20]'::numrange @> 19.0;

works.

Regards,
Jeff Davis


From: Thom Brown <thom(at)linux(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-30 22:07:08
Message-ID: AANLkTi=JaHDNLh2RGudXWE9pAmdn9-RS2FQ5N4ooWnt9@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 30 January 2011 21:33, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> [ trying a third time to send this message, apparently there were
> infrastructure problems before ]
>
> On Sun, 2011-01-30 at 02:55 +0000, Thom Brown wrote:
>> postgres=# select '[18,20]'::numrange @> 19;
>> ERROR:  operator does not exist: numrange @> integer
>> LINE 1: select '[18,20]'::numrange @> 19;
>>                                    ^
>> HINT:  No operator matches the given name and argument type(s). You
>> might need to add explicit type casts.
>
> It's because it doesn't know the type on the right side, and assumes
> it's an int4.
>
>    select '[18,20]'::numrange @> 19.0;
>
> works.

My misapprehension stems from the assumption that the
anyrange,anynonarray entry for the @> operator, and the
contains(anyrange, anynonarray) function would resolve since numrange
is a subset of anyrange and int4 is a subset of anynonarray.
Obviously it shouldn't work as the underlying type of the range isn't
an integer, but just trying to understand how the error message came
about.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-01-30 22:14:50
Message-ID: 21519.1296425690@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> On Sun, 2011-01-30 at 02:55 +0000, Thom Brown wrote:
>> postgres=# select '[18,20]'::numrange @> 19;
>> ERROR: operator does not exist: numrange @> integer
>> LINE 1: select '[18,20]'::numrange @> 19;
>> ^
>> HINT: No operator matches the given name and argument type(s). You
>> might need to add explicit type casts.

> It's because it doesn't know the type on the right side, and assumes
> it's an int4.

Well, yeah, it is an int4. The question ought to be phrased "why does
the parser fail to promote the int4 to numeric?". There might be some
excuse for an "operator is not unique" here, but I don't understand the
above failure --- it should be able to use an implicit coercion from
int4 to numeric.

regards, tom lane


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: WIP: RangeTypes
Date: 2011-01-30 22:52:01
Message-ID: 1296427921.11513.628.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Another updated patch.

Improvements:

* Full GiST support
- Thanks to Alexander Korotkov for sending me a new picksplit
algorithm for my "temporal" project on pgfoundry. I modified it for use
with range types, including a (hopefully) intelligent way of handling
empty and unbounded ranges.

* Quite a few tests added, some cleanup done

Open items:

* naming issues:
- period -> tsrange ?
- periodtz -> tstzrange ?
- intrange -> int4range
* add int8range
* Documentation improvements
- CREATE TYPE
- ANYRANGE
- Data Types section
* Thom Brown and Tom Lane pointed out that the type inferencing
should be able to promote int4 to numeric for queries like:
select '[18,20]'::numrange @> 19;
* Should the SQL function length(), which relies on polymorphic "-",
be marked immutable, stable, or volatile?
* representation or alignment issues
* parser should be improved to handle spaces and quoting better
* Should btree_gist be pulled into core to make it easier to use
exclusion constraints with range types?
* Typmod (optional)

Regards,
Jeff Davis

Attachment Content-Type Size
rangetypes-20110130.patch.gz application/x-gzip 37.8 KB

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: RangeTypes
Date: 2011-02-02 17:55:34
Message-ID: 1296669334.11513.843.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 2011-01-30 at 17:14 -0500, Tom Lane wrote:
> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> > On Sun, 2011-01-30 at 02:55 +0000, Thom Brown wrote:
> >> postgres=# select '[18,20]'::numrange @> 19;
> >> ERROR: operator does not exist: numrange @> integer
> >> LINE 1: select '[18,20]'::numrange @> 19;
> >> ^
> >> HINT: No operator matches the given name and argument type(s). You
> >> might need to add explicit type casts.
>
> > It's because it doesn't know the type on the right side, and assumes
> > it's an int4.
>
> Well, yeah, it is an int4. The question ought to be phrased "why does
> the parser fail to promote the int4 to numeric?". There might be some
> excuse for an "operator is not unique" here, but I don't understand the
> above failure --- it should be able to use an implicit coercion from
> int4 to numeric.

The problem exists for arrays, as well, so I think this is just a
limitation of the type system.

Regards,
Jeff Davis

postgres=# select ARRAY[1.4,1.5,1.6]::numeric[] || 5.0;
?column?
-------------------
{1.4,1.5,1.6,5.0}
(1 row)

postgres=# select ARRAY[1.4,1.5,1.6]::numeric[] || 5;
ERROR: operator does not exist: numeric[] || integer
LINE 1: select ARRAY[1.4,1.5,1.6]::numeric[] || 5;
^
HINT: No operator matches the given name and argument type(s). You
might need to add explicit type casts.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: WIP: RangeTypes
Date: 2011-02-07 18:32:40
Message-ID: 1297103561.2099.4.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On sön, 2011-01-30 at 14:52 -0800, Jeff Davis wrote:
> * naming issues:
> - period -> tsrange ?
> - periodtz -> tstzrange ?
> - intrange -> int4range

Have you considered a grammar approach like for arrays, so that you
would write something like

CREATE TABLE ... (
foo RANGE OF int
);

instead of explicitly creating a range type for every scalar type in
existence? I think that that might be easier to use in the common case.

I guess the trick might be how to store and pass the operator class and
some other parameters.


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: WIP: RangeTypes
Date: 2011-02-08 16:21:27
Message-ID: 1297182087.27157.178.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 2011-02-07 at 20:32 +0200, Peter Eisentraut wrote:
> Have you considered a grammar approach like for arrays, so that you
> would write something like
>
> CREATE TABLE ... (
> foo RANGE OF int
> );
>
> instead of explicitly creating a range type for every scalar type in
> existence? I think that that might be easier to use in the common case.

It would be nice, but the type system just isn't powerful enough to
express things like that right now, as far as I can tell.

That works for arrays because every type in PG has a second pg_type
entry for the array type. I don't think we want to do something similar
for range types -- especially if there are alternative range types for a
given base type.

Regards,
Jeff Davis