Re: ALTER TYPE 3: add facility to identify further no-work cases

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: ALTER TYPE 3: add facility to identify further no-work cases
Date: 2011-01-25 00:18:47
Message-ID: AANLkTimKmCPCjjk-SyE1+ymbwy7FNnfzBcN1A5KT7UY5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jan 9, 2011 at 5:03 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> Here I add the notion of an "exemptor function", a property of a cast that
> determines when calls to the cast would be superfluous.  Such calls can be
> removed, reduced to RelabelType expressions, or annotated (via a new field in
> FuncExpr) with the applicable exemptions.  I modify various parse_coerce.c
> functions to retrieve, call, and act on these exemptor functions; this includes
> GetCoerceExemptions() from the last patch.  I did opt to make
> find_typmod_coercion_function return COERCION_PATH_RELABELTYPE when no work is
> needed, rather than COERCION_PATH_NONE; this makes it consistent with
> find_coercion_pathway's use of that enumeration.
>
> To demonstrate the functionality, I add exemptor functions for varchar and xml.
> Originally I was only going to start with varchar, but xml tests the other major
> code path, and the exemptor function for xml is dead simple.
>
> This helps on conversions like varchar(4)->varchar(8) and text->xml.

I've read through this patch somewhat. As I believe Tom also
commented previously, exemptor is a pretty bad choice of name. More
generally, I think this patch is a case of coding something
complicated without first achieving consensus on what the behavior
should be. I think we need to address that problem first, and then
decide what code needs to be written, and then write the code.

It seems to me that patch two in this series has the right idea: skip
rewriting the table in cases where the types are binary coercible
(WITHOUT FUNCTION) or one is an unconstrained domain over the other.
IIUC, the problem this patch is trying to address is that our current
CAST infrastructure is insufficient to identify all cases where this
is true - in particular, it makes the decision solely based on the
type OIDs, without consideration of the typmods. In general, varchar
-> varchar is not binary coercible, but varchar(4) -> varchar(8) is
binary coercible. I think what this patch is trying to do is tag the
call to the cast function with the information that we might not need
to call it, but ISTM it would be better to just notice that the
function call is unnecessary and insert a RelabelType node instead.

*reads archives*

In fact, Tom already made pretty much exactly this suggestion, on a
thread entitled "Avoiding rewrite in ALTER TABLE ALTER TYPE". The
only possible objection I can see to that line of attack is that it's
not clear what the API should be, or whether there might be too much
runtime overhead for the amount of benefit that can be obtained.

This is, incidentally, another problem with this patch - if you want
to make wide-ranging changes to the system like this, you need to
provide a convincing argument that it's not going to compromise
correctness or performance. Just submitting the patch without making
an argument for why it's correct is no good, unless it's so obviously
correct as to be unquestionable, which is certainly not the case here.
You've got to write up some kind of submission notes so that the
reviewer isn't trying to guess why you think this is the right
approach, or spending a lot of time thinking through approaches you've
discarded for good reason. If there is a danger of performance
regression, you need to refute it, preferably with actual benchmarks.
A particular aspect I'm concerned about with this patch in particular:
it strikes me that the overhead of calling the exemptor functions in
the ALTER TABLE case is negligible, but that this might not be true in
other contexts where some of this logic may get invoked - it appears
to implicate the casting machinery much more generally.

I think it's a mistake to merge the handling of the rewrite-vs-noop
case with the check-vs-noop case. They're fundamentally dissimilar.
As noted above, being able to notice the noop case seems like it could
save run-time work during ordinary query execution. But detecting the
"check" case is useless work except in the specific case of a table
rewrite. If we're going to do that at all, it seems to me that it
needs to be done via a code-path that's only going to get invoked in
the case of ALTER TABLE, not something that's going to happen every
time we parse an expression tree. But it seems to me that it might be
better to take the suggestion I made before: forget about the
check-only case for now, and focus on the do-nothing case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2011-01-25 00:19:45 Re: Use of O_DIRECT only for open_* sync options
Previous Message Noah Misch 2011-01-25 00:10:44 Re: ALTER TYPE 2: skip already-provable no-work rewrites