Re: Caching for stable expressions with constant arguments v6

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marti Raudsepp <marti(at)juffo(dot)org>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Caching for stable expressions with constant arguments v6
Date: 2012-03-10 00:05:28
Message-ID: 8448.1331337928@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Marti Raudsepp <marti(at)juffo(dot)org> writes:
> [ cacheexpr-v8.patch ]

A few comments:

* I believe the unconditional datumCopy call in ExecEvalCacheExpr will
dump core if the value is null and the type is pass-by-reference.
Beyond just skipping it, it seems like you could skip the type
properties lookup as well if the value to be cached is null.

* I do not see any logic in here for resetting the already-cached status
in an existing ExprState tree. You're going to need that for cases
where Params are given a new value. I don't think I trust the
assumption that this is never true for EXTERN parameters, and in any
case you're leaving a whole lot of the potential benefit of the patch on
the table by ignoring the possibility of caching executor-internal
parameters. I would rather see, for example, a ReScan call on a plan
node running around and resetting caching of all expression trees
attached to the plan node if its chgParam isn't empty.

* I think you should consider getting rid of the extra argument to
ExecInitExpr, as well as the enable flag in CacheExprState, and instead
driving cache-enable off a new flag added to EState, which should
probably initialize to "caching OK" since that's what the majority of
callers seem to want. That would get rid of a fair amount of the
boilerplate changes, as well as being at least as convenient to use,
maybe more so.

* In the same vein, I think it would be better to avoid adding
the extra argument to eval_const_expressions_mutator and instead pass
that state around in the existing "context" struct argument. In
particular I am entirely unimpressed with the fact that you can't pass
the extra argument through expression_tree_mutator and thus have to dumb
the code down to fail to cache underneath any node type for which
there's not explicit coding in eval_const_expressions_mutator.

* This is pretty horrible:

+ * NOTE: This function is not indempotent. Calling it twice over an expression
+ * tree causes CacheExpr nodes to be removed in the first pass, then re-added
+ * in the 2nd pass. Make sure it only gets called once.

I don't think we can positively guarantee that. The function API should
be fixed to not need such an assumption.

* It seems like some of the ugliness here is due to thinking that
selectivity functions can't cope with CacheExprs. Wouldn't it be a lot
better to make them cope? I don't think there are that many places.
I'd suggest looking for places that strip RelabelType and fixing them to
strip CacheExpr instead (in fact, most likely any that don't are broken
anyway ...)

* There's a lot of stuff that seems wrong in detail in
eval_const_expressions_mutator, eg it'll try to wrap a plain vanilla
Const with a CacheExpr. I see you've hacked that case inside
insert_cache itself, but that seems like evidence of a poorly thought
out recursion invariant. The function should have a better notion than
that of whether caching is useful for a given subtree. In general I'd
not expect it to insert a CacheExpr unless there is a Param or stable
function call somewhere below the current point, but it seems not to be
tracking that. You probably need at least a three-way indicator
returned from each recursion level: subexpression is not cacheable
(eg it contains a Var or volatile function); subexpression is cacheable
but there is no reason to do so (default situation); or subexpression is
cacheable and should be cached (it contains a Param or stable function).

* I don't think it's a good idea for
simplify_and_arguments/simplify_or_arguments to arbitrarily reorder the
arguments based on cacheability. I know that we tell people not to
depend on order of evaluation in AND/OR lists, but that doesn't mean
they listen, and this will introduce reordering into an awful lot of
places where it never happened before. If you have a mix of
cacheability situations, just stick CacheExprs on each arm (or not).

* I'm having a hard time understanding what you did to simplify_function
and friends; that's been whacked around quite a bit, in ways that are
neither clearly correct nor clearly related to the goal of the patch.
It might be best to present this change as a series of two patches,
one that just refactors and one that adds the caching logic.

* I don't believe CacheExprs can be treated as always simple by
ruleutils' isSimpleNode, at least not unless you always print them with
parentheses (which might be a good idea anyway).

* I don't think I believe either of the changes you made to plpgsql's
exec_simple_check_node. I'm not convinced that only PARAM_EXTERN params
are possible, and even if that's true it doesn't seem to be this patch's
business to add an assert for it. Also, immediately returning TRUE for
a CacheExpr amounts to assuming that we can never wrap a CacheExpr
around something that plpgsql would consider non-simple. I'm not sure
that's true now, and even if it is true it seems a mighty fragile
assumption.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David E. Wheeler 2012-03-10 00:36:08 Advisory Lock BIGINT Values
Previous Message Tatsuo Ishii 2012-03-09 23:30:09 missing description initdb manual