Re: Avoiding bad prepared-statement plans.

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bart Samwel <bart(at)samwel(dot)tk>, Jeroen Vermeulen <jtv(at)xs4all(dot)nl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Avoiding bad prepared-statement plans.
Date: 2010-02-15 19:11:34
Message-ID: 201002151911.o1FJBYh22763@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Pavel Stehule wrote:
> > The problem that we face is that we don't have any very good way to tell
> > whether a fresh planning attempt is likely to yield a plan significantly
> > better than the generic plan. ?I can think of some heuristics --- for
> > example if the query contains LIKE with a parameterized pattern or a
> > partitioned table --- but that doesn't seem like a particularly nice
> > road to travel.
> >
> > A possible scheme is to try it and keep track of whether we ever
> > actually do get a better plan. ?If, after N attempts, none of the custom
> > plans were ever more than X% cheaper than the generic one, then give up
> > and stop attempting to produce custom plans. ?Tuning the variables might
> > be challenging though.
>
> I afraid so every heuristic is bad. Problem is identification of bad
> generic plan. And nobody ensure, so non generic plan will be better
> than generic. Still I thing we need some way for lazy prepared
> statements - plan is generated everytime with known parameters.

Yea, this opens a whole host of questions for me:

1. Why do we only do bind-level planning for anonymous wire-level queries?

2. I realize we did anonymous-only because that was the only way we had
in the protocol to _signal_ bind-time planning, but didn't we think of
this when we were implementing the wire-level protocol?

3. Do we have no place to add this cleanly without a protocol version
bump?

4. Why don't we just always do planning at first bind time? When is
that worse than using generic values?

5. Why have we not added an option for SQL-level prepare to do this?

6. When do our generic columns costs significantly worse than having
specific constants? I assume unique columns are fine with generic
constants.

7. Why is there no option to do parameterized-queries which replan every
time?

This just seems like an area that has been neglected, or maybe I am
missing something and our current setup is acceptable. We have done a
lot of work to generate acceptable optimizer statistics, but we are not
using them for a significant part of our user base, particularly JDBC.

We do have a TODO item, but it has gotten little attention:

Allow finer control over the caching of prepared query plans

Currently anonymous (un-named) queries prepared via the libpq API
are planned at bind time using the supplied parameters --- allow SQL
PREPARE to do the same. Also, allow control over replanning prepared
queries either manually or automatically when statistics for execute
parameters differ dramatically from those used during planning.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Smith 2010-02-15 19:12:35 Re: psycopg2 license changed
Previous Message Tom Lane 2010-02-15 18:53:21 Re: LISTEN/NOTIFY versus encoding conversion