Rethinking user-defined-typmod before it's too late

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Rethinking user-defined-typmod before it's too late
Date: 2007-06-15 16:14:45
Message-ID: 5146.1181924085@sss.pgh.pa.us
Lists: pgsql-hackers

The current discussion about the tsearch-in-core patch has convinced me
that there are plausible use-cases for typmod values that aren't simple
integers. For instance, it could be sane for a type to want a locale or
language selection as a typmod, e.g. tsvector('ru') or tsvector('sv').
(I'm not saying we are actually going to do that to tsvector, just that
it's now clear to me that there are use-cases for such things.)

Teodor's work a few months ago generalized things enough so that
something like this is within reach. The grammar will actually allow
darn near anything for a typmod, since the grammar production is
expr_list to avoid shift/reduce conflict with the very similar-looking
productions for function calls. The only place where we are
constraining what a typmod can be is that the defined API for
user-written typmodin functions is "integer array".

At the time that patch was being worked on, I think I argued that
integer typmods were enough because you'd have to pack them into such a
small output representation anyway. The hole in that logic is that you
might have a fairly small enumerated set of possibilities, but that
doesn't mean you want to make the user use a numeric code for them.
You could even make the typmod be an integer key for a lookup table,
if the set of possibilities is not hardwired.
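
To make the lookup-table idea concrete, here is a minimal sketch in Python
rather than backend C. It assumes a hypothetical type whose only typmod is a
language name. The table LANGUAGES and the functions lang_typmodin and
lang_typmodout are invented for illustration and are not part of any
PostgreSQL API. The stored typmod is just the position of the name in the
table, and a negative value is treated as "no typmod given".

```python
# Hypothetical lookup table: position in the list is the stored typmod.
# Illustrative only; nothing here is real PostgreSQL code.
LANGUAGES = ["ru", "sv", "en"]


def lang_typmodin(args):
    """Map a single language name (a string) to a small integer typmod."""
    if len(args) != 1:
        raise ValueError("expected exactly one type modifier")
    name = args[0].lower()
    if name not in LANGUAGES:
        raise ValueError(f"unknown language: {args[0]!r}")
    return LANGUAGES.index(name)


def lang_typmodout(typmod):
    """Render the stored integer back into its textual form."""
    if typmod < 0:
        # Negative typmod: treated here as "unspecified", so print nothing.
        return ""
    if typmod >= len(LANGUAGES):
        raise ValueError(f"invalid typmod: {typmod}")
    return f"('{LANGUAGES[typmod]}')"
```

The user only ever writes and sees the name; the integer stays an internal
detail, which is the point of the argument above.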

Since this code hasn't been released yet, the API isn't set in stone
... but as soon as we ship 8.3, it will be, or at least changing it will
be orders of magnitude more painful than it is today. So, late as this
is in the devel cycle, I think now is the time to reconsider.

I propose changing the typmodin signature to "typmodin(cstring[]) returns
int4", that is, the typmods will be passed as strings not integers. This
will incur a bit of extra conversion overhead for the normal uses where
the typmods are integers, but I think the gain in flexibility is worth
it. I'm inclined to make the code in parse_type.c take either integer
constants, simple string literals, or unqualified names as input ---
so you could write either tsvector('ru') or tsvector(ru) when using a
type that wants a nonintegral typmod.
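
As a rough model of the proposed flow, the sketch below (Python, not backend
C) flattens the three accepted input forms into strings and then hands them
to a typmodin that still wants an integer. The (kind, value) tags and the
names typmods_to_cstrings and length_typmodin are assumptions made up for
this example; they stand in for whatever node types parse_type.c would
actually see and for a type-specific typmodin function.

```python
def typmods_to_cstrings(raw_args):
    """Flatten parsed typmod arguments into a list of strings.

    raw_args is a list of (kind, value) pairs, where kind is 'int',
    'string' or 'name'. These tags are hypothetical placeholders for
    the parser's real node types.
    """
    out = []
    for kind, value in raw_args:
        if kind == "int":
            out.append(str(value))      # integer constant becomes its text
        elif kind in ("string", "name"):
            out.append(value)           # 'ru' and ru end up identical
        else:
            raise ValueError(f"unsupported type modifier kind: {kind}")
    return out


def length_typmodin(args):
    """Hypothetical typmodin for a type whose only typmod is a length."""
    if len(args) != 1:
        raise ValueError("expected exactly one type modifier")
    try:
        length = int(args[0])           # the extra conversion step
    except ValueError:
        raise ValueError(f"invalid length: {args[0]!r}") from None
    if length < 1:
        raise ValueError("length must be at least 1")
    return length
```

For an integer-only type the cost is one string-to-integer conversion per
typmod argument, which is the overhead mentioned above.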

Note that the typmodout side is already OK since it is defined to return
a string.

Comments?

regards, tom lane
