Re: Range Types and extensions

From: Darren Duncan <darren(at)darrenduncan(dot)net>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Range Types and extensions
Date: 2011-06-07 18:28:51
Message-ID: 4DEE6DE3.4090308@darrenduncan.net
Lists: pgsql-hackers

Jeff Davis wrote:
> On Tue, 2011-06-07 at 11:15 -0400, Tom Lane wrote:
>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>> right. hm -- can you have multiple range type definitions for a
>>> particular type?
>> In principle, sure, if the type has multiple useful sort orderings.
>
> Right. Additionally, you might want to use different "canonical"
> functions for the same subtype.
>
>> I don't immediately see any core types for which we'd bother.
>
> Agreed.
>
>> BTW, Jeff, have you worked out the implications of collations for
>> textual range types?
>
> Well, "it seems to work" is about as far as I've gotten.
>
> As far as the implications, I'll need to do a little more research and
> thinking. But I don't immediately see anything too worrisome.

I would expect ranges to have exactly the same semantics as ORDER BY or "<" etc
with respect to collations for textual range types.

If collation is an attribute of a textual type, meaning that the textual type or
its values have a sense of their collation built-in, then ranges for those
textual types should "just work" without any extra range-specific syntax, same
as you could say ORDER BY without any further qualifiers.

If collation is not an attribute of a textual type, meaning that you normally
have to qualify the desired collation for each order-sensitive operation using
it (even if that can be defined by a session/etc setting which still just
ultimately works at the operator rather than type level), or if a textual type
can have it built in but it is overridable per operator, then either ranges
should have an extra attribute saying what collation (or other type-specific
order-determining function) to use, or all range operators take the optional
collation parameter like with ORDER BY.
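
To make the two options in the previous paragraph concrete, here is a minimal
Python sketch (not PostgreSQL syntax, and not how the range types patch works).
It assumes a collation can be modelled as a key function; the names TextRange,
binary and case_insensitive are hypothetical. The range carries a collation as
an extra attribute, and the containment operator also accepts an optional
per-call override, much as ORDER BY accepts a collation qualifier.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: a "collation" is modelled as a function from text to a sort key.
Collation = Callable[[str], str]


def binary(s: str) -> str:
    """Compare by code point, i.e. the string itself is the key."""
    return s


def case_insensitive(s: str) -> str:
    """Compare ignoring letter case."""
    return s.casefold()


@dataclass(frozen=True)
class TextRange:
    """Hypothetical half-open text range [lower, upper)."""
    lower: str
    upper: str
    collation: Collation = binary  # option 1: collation stored on the range

    def contains(self, value: str, collation: Optional[Collation] = None) -> bool:
        # Option 2: an explicit per-operator collation overrides the stored one.
        key = collation or self.collation
        return key(self.lower) <= key(value) < key(self.upper)


r = TextRange("a", "m")
print(r.contains("B"))                              # False: "B" sorts before "a" by code point
print(r.contains("B", collation=case_insensitive))  # True: "a" <= "b" < "m"
```

The same range answers differently depending on which ordering is in force,
which is why the range semantics need to be pinned to the same rules as "<".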

Personally, I think it is a more elegant programming language design for an
ordered type to have its own sense of a one true canonical ordering of its
values, and where one could conceptually have multiple orderings, there would be
a separate data type for each one. That is, while you probably only need a
single type with respect to ordering for any real numeric type, for textual
types you could have a separate textual type for each collation.

In particular, I say separate type because, as far as I know, a collation can
sometimes change which text values compare as "same".

On a tangent, I believe that the various kinds of insensitive comparison or
sorting are very reasonably expressed as collations rather than through some
other mechanism, e.g. if you want sortings that treat different letter case as
the same or not, or letters with and without accents as the same or not.

So under this "elegant" system, there is no need to ever specify collation at
the operator level (which could become quite verbose and unwieldy), but instead
you can cast data types if you want to change their sense of canonical ordering.
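
As a rough illustration of the one-type-per-collation idea, the following
Python sketch gives each text type a single built-in ordering and uses a cast
to switch orderings. The class names Text and CaseInsensitiveText are
hypothetical, and case folding stands in for a real collation.

```python
from functools import total_ordering


@total_ordering
class Text:
    """Text whose one canonical ordering is by code point."""

    def __init__(self, value):
        # Accepting another Text variant acts as a cast between collations.
        self.value = value.value if isinstance(value, Text) else value

    def key(self):
        return self.value

    def __eq__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return self.key() == other.key()

    def __lt__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return self.key() < other.key()

    def __hash__(self):
        return hash((type(self), self.key()))

    def __repr__(self):
        return f"{type(self).__name__}({self.value!r})"


class CaseInsensitiveText(Text):
    """Same operators, but the canonical ordering ignores letter case."""

    def key(self):
        return self.value.casefold()


words = ["banana", "Cherry", "apple"]

# No collation is named at the operator level; the type decides.
binary_sorted = sorted(Text(w) for w in words)
caseless_sorted = sorted(CaseInsensitiveText(w) for w in words)
print(binary_sorted)    # Cherry, apple, banana
print(caseless_sorted)  # apple, banana, Cherry

# The collation also changes what counts as "same".
print(Text("abc") == Text("ABC"))                            # False
print(CaseInsensitiveText(Text("abc")) == CaseInsensitiveText("ABC"))  # True
```

A range over either type would then simply use that type's "<", with no
range-specific collation syntax.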

Now if the various text-specific operators are polymorphic across these text
type variants, users don't generally have to know the difference except when it
matters.

On a tangent, I believe that the best definition of "equal" or "same" in a type
system is global substitutability. Ignoring implementation details, if a
program ever finds that 2 operands to the generic "=" (equality test) operator
result in TRUE, then the program should feel free to replace all occurrences of
one operand in the program with occurrences of the other, for optimization,
because generic "=" returning TRUE means one is just as good as the other. This
assumes generally that we're dealing with immutable value types.
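
A small Python sketch of what substitutability means in practice: given an
equality test and a set of operations that observe a value, report which
operations can tell apart two values that the equality test calls equal. The
helper name distinguishing_observers and the choice of observers are
assumptions made for illustration only.

```python
def distinguishing_observers(eq, a, b, observers):
    """Names of observers that tell a and b apart although eq(a, b) is true.

    An empty result means that, as far as these observers can see, a and b
    are substitutable for one another.
    """
    if not eq(a, b):
        return []
    return [f.__name__ for f in observers if f(a) != f(b)]


def binary_eq(a, b):
    return a == b


def caseless_eq(a, b):
    return a.casefold() == b.casefold()


observers = [len, str.isupper]

print(distinguishing_observers(binary_eq, "abc", "abc", observers))    # []
print(distinguishing_observers(caseless_eq, "abc", "ABC", observers))  # ['isupper']
```

In this sketch a case-insensitive "=" applied to an ordinary string is not
substitutable, because str.isupper still sees the difference. That is one
argument for a distinct case-insensitive type whose operators all agree with
its "=".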

-- Darren Duncan
