Re: Range types

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Scott Bailey <artacus(at)comcast(dot)net>, hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Range types
Date: 2009-12-16 20:35:38
Message-ID: 1260995738.13414.2369.camel@monkey-cat.sm.truviso.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 2009-12-16 at 13:59 -0500, Tom Lane wrote:
> The argument for having
> granularity wired into the datatype seems to boil down to just space
> savings. I don't find that compelling enough to justify code
> contortions and user-visible restrictions on functionality.

The argument (at least from me) is that discrete ranges have better
semantics. The counterargument was that the granularity of a timestamp
is an implementation detail. So I countered by making it explicit.

Space savings is not crucial, but it would be frustrating to needlessly
waste space.

I still have not seen an answer to the problem of changing the
representation of a continuous range. If you have the continuous range
[5, 10], you're pretty much stuck with that representation, even if the
application is expecting things in the form [ ).

>From an application's standpoint, you probably want to get the
information about a range as separate columns (as opposed to parsing a
string). So an application expecting data in [) format might use a query
like:

select ..., first(myperiod), next(myperiod) from mytable;

That gives the application complete information about the range. You can
even make a view over a table like that to make it even more transparent
to the application. It's not entirely unreasonable that many such
applications exist; there are many presentations and tutorials that have
been telling people to use a "start" and "end" column, and assume that
the start is inclusive and the end is exclusive.

If there is some other application that expects data in (] format, you
just use the query:

select ..., prior(myperiod), last(myperiod) from mytable;

With discrete ranges, that all just works (barring overflow or special
values).

With continuous ranges, first() or next() might fail on some values that
were produced by some other application. Really, for continuous ranges,
you'd need to have a query like:

select ..., start(myperiod), start_inclusive(myperiod),
end(myperiod), end_inclusive(myperiod) from mytable;

in order to have all of the information. And that means that the
application needs a full implementation of a range type to understand
the inclusivity and produce a correct result.

And to further make the case for allowing user-defined discrete ranges,
what about ip4r? That's a discrete range, and it's user-defined. And
that probably means other useful discrete ranges will be invented, as
well.

If we want to say that there is no discrete TIMESTAMP range by default,
and that the superuser has to define it, that's one thing. But if we say
that the only acceptable base types for discrete ranges will be
hard-wired, that's way too restrictive. If nothing else, it makes some
system types "special" which we have not done very many other places.

Regards,
Jeff Davis

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2009-12-16 20:41:31 Re: Range types
Previous Message Tom Lane 2009-12-16 20:31:11 Re: Does "verbose" Need to be Reserved?