Re: Delaying the planning of unnamed statements until Bind

From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Delaying the planning of unnamed statements until Bind
Date: 2004-05-20 22:43:14
Message-ID: 40AD3482.6070500@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
> Oliver Jowett <oliver(at)opencloud(dot)com> writes:
>
>>Following a recent discussion on the JDBC list
>>(http://archives.postgresql.org/pgsql-jdbc/2004-05/msg00100.php) I'm
>>looking at modifying when unnamed statements received via a v3 protocol
>>Parse message are planned. The idea is to delay planning until the Bind
>>message arrives, so that the planner benefits from knowing the actual
>>parameter values in use. This means that clients such as JDBC can use an
>>unnamed Parse/Bind as a way to provide typed and/or binary parameter
>>values without running the risk of worsening the query plan.
>
>
> I'm a bit concerned about driving such a change in behavior off nothing
> more than whether the statement is named or not. The protocol spec does
> say this:
>
> : In addition, an "unnamed" prepared statement and portal exist. Although
> : these behave largely the same as named objects, operations on them are
> : optimized for the case of executing a query only once and then
> : discarding it, whereas operations on named objects are optimized on the
> : expectation of multiple uses.
>
> so you could argue that this change is just an optimization of that
> kind. It had better be documented though.

Ideally the when-to-plan decision should be client-controllable per
statement. I couldn't see a way to do this without causing a protocol
version change; using the unnamed statement sounded like a reasonable
compromise.

It'd also make the extended query protocol closer to the simple query
protocol as the documentation claims. This is the underlying problem
that bit me: the extended query protocol is *not* just the simple query
protocol broken down into smaller steps. There are non-obvious changes
deep in the guts of the system if you use the extra functionality of the
extended protocol.

Would it be better to bite the bullet and bump the protocol version,
adding a new field to Parse to control this behaviour?

>>3. In execute_bind_message, if the source statement is not planned
>>(plan_list == NULL), plan it, passing the concrete parameter values
>>received in the Bind message.
>
>
> I am not sure you can rely on plan_list == NULL to be the driving test;
> consider an empty query string for instance. It'd be better to add a
> boolean planning_done field to the struct.

Oops -- I made the incorrect assumption that NIL != NULL (i.e. it was a
shared placeholder node for empty lists).

Looks like this would only bite in the empty-query case (where
querytree_list is NIL). Trying to plan that via pg_plan_queries looks
fairly harmless. Nevertheless, I can add a field to the struct.

>>The planning happens using the target
>>portal's memory context; the resulting plan_list is only used for this
>>portal and the source statement remains unplanned.
>
>
> I don't like that at all. Do the planning using the given parameters,
> but save the plan. Otherwise you have substantially pessimized the
> behavior for the case where an unnamed statement is reused.

How often is the unnamed statement reused? From my exhaustive sample of
one (the JDBC driver) it's never reused. (admittedly that means it
doesn't affect me either way, but..)

If the planning behaviour was controllable per-statement, would you be
ok with replanning on every Bind when the statement asks for delaying
planning?

The reason I don't like storing the plan is that I see nothing special
about the first set (or Nth set) of parameters if the statement is being
reused, and you could come up with a plan that's substantially worse
than if you'd just planned before parameters were available (e.g.
picking an indexscan over a seqscan because the first Bind had selective
parameters, then subsequently Binding nonselective parameters).

>>4. In planner(), if parameters are provided, replace Param nodes with
>>Const nodes in the query tree where appropriate.
>
>
> No, you can't replace Params with Consts. You can use the values of
> Params in the places where selfuncs.c wants to extract constant values
> for estimation purposes.

Transforming the tree seemed the most reliable way to get a result
that's consistent with Const being in the tree from the start (i.e. the
simple query case), and doing the replacement hasn't broken anything in
my (admittedly brief) testing yet. What should I be looking at to see
the breakage?

Note that the Const vs. Param difference appears to affect
eval_const_expressions too, so it's more than just the selectivity
functions that would need changing if we want to produce the same query
plans as from an equivalent simple query.

>>... it requires threading the parameter values through many
>>planner functions. There are quite a few (20-30ish?) functions affected.
>
>
> This would be a pretty thorough mess, especially after you got done
> propagating the info to selectivity functions. Frankly I'd suggest
> using a planner global variable instead; see PlannerParamList for a
> model.

If I'm replacing the Param nodes it doesn't need to go as deep as the
selectivity functions. It's not great but it's not a disaster either.
Certainly if I needed to propagate all the way to the selectivity
functions it'd get out of control.

Would it be safe to save-and-clear the global parameter state only at
the topmost planner() level as is done for PlannerParamList and friends?
I'm concerned about the global parameter state interfering with the
parameter values of things like functions evaluated during planning.
Tracing the opportunities for recursion in the planner is, well,
interesting..

Thanks for your comments!

-O

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Marc G. Fournier 2004-05-20 22:45:43 Re: commit messages from gforge -> pgsql-committers
Previous Message Bruce Momjian 2004-05-20 22:39:01 Re: commit messages from gforge -> pgsql-committers