Re: Identifying no-op length coercions

From: Noah Misch <noah(at)leadboat(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alexey Klyukin <alexk(at)commandprompt(dot)com>
Subject: Re: Identifying no-op length coercions
Date: 2011-06-11 17:40:54
Message-ID: 20110611174054.GA20846@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 02, 2011 at 03:22:49PM -0400, Noah Misch wrote:
> On Thu, Jun 02, 2011 at 05:08:51PM +0300, Alexey Klyukin wrote:
> > Looks like this thread has silently died out. Is there an agreement on the
> > syntax and implementation part? We (CMD) have a customer, who is interested in
> > pushing this through, so, if we have a patch, I'd be happy to assist in
> > reviewing it.
>
> I think we have a consensus on the implementation. We didn't totally lock down
> the syntax. Tom and I seem happy to have no SQL exposure at all, so that's what
> I'm planning to submit. However, we were pretty close to a syntax consensus in
> the event that it becomes desirable to do otherwise.

Here's the patch. It contains the core support for transform functions and
applies that support to the varchar() length coercion.

We originally discussed having the transform function take and return Expr
nodes. It turns out that simplify_function() does not have the Expr, probably
because the particular choice of node type among the many that can convey a
function call does not matter to it. The same should be true of a transform
function -- it should do the same thing whether the subject call arrives from
a FuncExpr or an OpExpr. Therefore, I changed the transform function
signature to "Expr *protransform(List *args)".

Previously, ALTER TABLE did not plan its expressions until ATRewriteTable.
Since we need to make a decision on whether to rewrite based on examination of
a fully simplified expression, I moved the planning earlier. This means
planning now happens before the catalog updates associated with the ALTER
TABLE. This is at least as safe as what we did before, because the heap
available to expressions during ATRewriteTable is the pre-rewrite heap. I
wouldn't be surprised if there was and is some way to get into trouble with a
USING expression that refers back to the current table indirectly, but I did
not explore that in any detail.

The large pg_proc.h diff mostly just adds protransform = 0 to every function.
Due to the resulting patch size, I've compressed it. There are new/otherwise
changed DATA lines for OIDs 669 and 3097 only.

Any transform function for a length coercion cast will need code to strip
subsidiary RelabelType nodes and apply a new one that changes only the typmod.
I've gone ahead and split that out into a function in parse_clause.c.

I verified the patch's behavior using pg_regress tests I developed previously.
The tests rely on already-rejected instrumentation, so I don't include them in
the patch. If anyone would like an updated copy of the tests patch to aid
your review, let me know.

This will obviously require a catversion bump.

I looked for a README file that might be suitable for a brief note on how to
use the feature. src/backend/optimizer/README seemed closest, but it's all
much higher-level. Perhaps, like most optimization microfeatures, this
requires no particular discussion outside the code implementing it. I would
include approximately this README text if anyone feels it's useful:

pg_proc.protransform functions
------------------------------

Some functions calls can be simplified at plan time based on properties
specific to the function. For example, "int4mul(n, 1)" simplifies to "n", and
"varchar(s::varchar(4), 8, true)" simplifies to "s::varchar(4)". To define
such function-specific optimizations, write a "transform function" and store
its OID in the pg_proc.protransform of the primary function. This transform
function has the signature "protransform(internal) RETURNS internal", mapping
internally as "Expr *protransform(List *args)". If the transform function's
study of the argument Nodes proves that a simplified Expr substitutes for
every call represented by such arguments, return that Expr. Otherwise, return
the NULL pointer. We make no guarantee that PostgreSQL will never call the
main function in cases that the transform function would simplify. Ensure
rigorous equivalence between the simplified expression and an actual call to
the main function.

Thanks,
nm

Attachment Content-Type Size
noop-length-coercion-v1.patch.gz application/x-gunzip 92.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-06-11 18:11:55 Re: Identifying no-op length coercions
Previous Message Joshua D. Drake 2011-06-11 17:37:38 Re: procpid?