Re: Range types

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Scott Bailey <artacus(at)comcast(dot)net>, hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Range types
Date: 2009-12-16 19:29:55
Message-ID: 1260991795.13414.2305.camel@monkey-cat.sm.truviso.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 2009-12-16 at 12:50 -0500, Tom Lane wrote:
> I'm still not exactly clear on what the use-case is for discrete
> timestamp ranges, and I wonder how many people are going to be happy
> with a representation that can't handle a range that's open-ended
> on the left.

Huh? We're miscommunicating somewhere. Discrete ranges are values, and
those values can be displayed a number of different ways. That's one of
the biggest advantages.

The very same discrete range can be displayed as open-open, closed-open,
open-closed, or closed-closed. There are edge cases, like how infinity
is never closed, and overflow conditions. But generally speaking, you
have more options for presenting a discrete range than a continuous
range.

The range [5, 7) is equal to the set {5, 6} and equal to the ranges:
(4,7), (4,6], [5,7), and [5,6]. One application can insert it as [5,7)
and another can read it as (4,6].

That's the use case: the application's preferences don't have to match.
It's OK to mix various representation preferences, because you can
convert between them. The on disk format happens to hint at one
particular canonical form, but doesn't enforce that on anyone.

> Huh? You're not going to be able to have a special case data
> representation for one or two data types at the same time as you have a
> function-based datatype-independent concept of a parameterized range
> type. Well, maybe you could have special code paths for just date and
> timestamp but it'd be horrid.

They aren't supposed to be exactly the same API, I said that from the
start. There are API differences between continuous and discrete ranges,
and we shouldn't ignore them.

One important differences is that (barring overflow conditions and
special values) prior, first, last, and next are defined for all
discrete range values, but not for all continuous range values.

For instance, the discrete range [5,7) has prior=4, first=5, last=6,
next=7.

Whereas the continuous range [5,7) has prior=undef, first=5, last=undef,
next=7.

We could define one API, that treats discrete and continuous ranges
differently. But you'll never be able to transform a continuous range to
a different representation, while you can do so with a discrete range.

> More importantly, the notion of a representation granule is still 100%
> wishful thinking for any inexact-representation datatype, which is going
> to be a severe crimp in getting this accepted for timestamp, let alone
> defining it in a way that would allow users to try to apply it to
> floats. Float timestamps might not be the default case anymore but they
> are still supported.

If the only objection is that a superuser can confuse the system by
poorly defining a range type on a non-default build, I think that
objection can be overcome.

> I think you should let go of the feeling that you have to shave bytes
> off the storage format. You're creating a whole lot of work for
> yourself and a whole lot of user-visible corner cases in return for
> what ultimately isn't much.

This isn't just to shave bytes. It's also because I like the semantics
of discrete ranges for many cases.

Regards,
Jeff Davis

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David E. Wheeler 2009-12-16 19:34:33 PATCH: Spurious "22" in hstore.sgml
Previous Message Markus Wanner 2009-12-16 19:28:26 pinging Dano