Re: parameter hints to the optimizer

Lists: pgsql-hackers
From: "Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: parameter hints to the optimizer
Date: 2004-07-21 18:12:08
Message-ID: 6EE64EF3AB31D5448D0007DD34EEB34101AEE9@Herge.rcsinc.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

There is one problem with the optimizer that is a constant source of
frustration. When using prepared statements and functions (particularly
where function parameters are passed to a query), the optimizer often
fails to utilize an index inside a plan.

This is a well known problem because the parameter values are not known
at the time the plan is generated, making things difficult for the
optimizer.

It would be nice if 'hint' or sample parameters could be used when
creating the statement so the optimizer could use those values when
generating the plan. For example, the default parameter syntax of C++
could be borrowed (assuming this doesn't break any syntax rules).

example:
prepare my_statement prepare (character varying='abc')
as select * from t where t.k = $1;

create function my_function(int4=1234) returns [...]

Another way to deal with the problem is to defer plan generation until
the first plan execution and use the parameters from that execution.

Am I crazy? Comments?
Merlin


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: parameter hints to the optimizer
Date: 2004-07-21 20:47:07
Message-ID: 1090442826.2658.1310.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2004-07-21 at 19:12, Merlin Moncure wrote:
> There is one problem with the optimizer that is a constant source of
> frustration. When using prepared statements and functions (particularly
> where function parameters are passed to a query), the optimizer often
> fails to utilize an index inside a plan.
>
> This is a well known problem because the parameter values are not known
> at the time the plan is generated, making things difficult for the
> optimizer.
>
> It would be nice if 'hint' or sample parameters could be used when
> creating the statement so the optimizer could use those values when
> generating the plan. For example, the default parameter syntax of C++
> could be borrowed (assuming this doesn't break any syntax rules).
>
> example:
> prepare my_statement prepare (character varying='abc')
> as select * from t where t.k = $1;
>
> create function my_function(int4=1234) returns [...]
>
> Another way to deal with the problem is to defer plan generation until
> the first plan execution and use the parameters from that execution.
>
> Am I crazy? Comments?

Crazy enough to suggest what other RDBMS do.

It's a common problem, since it defeats the use of histogram statistics
to determine specific cardinality rather than generic cardinality.

The answer is to follow what those others do, since programs will be
written to take advantage of those optimization quirks.

DB2 supports various modes for BIND: REOPT(ALWAYS), REOPT(ONCE),
REOPT(VARS) and REOPT(NONE) - which are then manifested in their
precompiler.

..back to you,

Best Regards, Simon Riggs


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: parameter hints to the optimizer
Date: 2004-07-21 21:03:44
Message-ID: 40FEDA30.2070401@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure wrote:

> Another way to deal with the problem is to defer plan generation until
> the first plan execution and use the parameters from that execution.

When talking the V3 protocol, 7.5 defers plan generation for the unnamed
statement until parameters are received in the Bind message (which is
essentially the same as what you describe). There was some discussion at
the time about making it more flexible so you could apply it to arbitary
statements, but that needed a protocol change so it didn't happen.

So the guts of the work are done -- we'd just need a way to trigger the
behaviour on demand. It sounds less painful to add something to PREPARE
than to change the V3 protocol at this stage.

-O


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: parameter hints to the optimizer
Date: 2004-08-07 01:41:26
Message-ID: 200408070141.i771fQp05906@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Oliver Jowett wrote:
> Merlin Moncure wrote:
>
> > Another way to deal with the problem is to defer plan generation until
> > the first plan execution and use the parameters from that execution.
>
> When talking the V3 protocol, 7.5 defers plan generation for the unnamed
> statement until parameters are received in the Bind message (which is
> essentially the same as what you describe). There was some discussion at
> the time about making it more flexible so you could apply it to arbitary
> statements, but that needed a protocol change so it didn't happen.

What do you mean about arbitrary statements? Non-prepared ones, or
non-unnamed ones?

I am trying to figure out what TODO item needs to be added.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: parameter hints to the optimizer
Date: 2004-08-07 22:49:13
Message-ID: 41155C69.2070407@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian wrote:
> Oliver Jowett wrote:
>
>>Merlin Moncure wrote:
>>
>>
>>>Another way to deal with the problem is to defer plan generation until
>>>the first plan execution and use the parameters from that execution.
>>
>>When talking the V3 protocol, 7.5 defers plan generation for the unnamed
>>statement until parameters are received in the Bind message (which is
>>essentially the same as what you describe). There was some discussion at
>>the time about making it more flexible so you could apply it to arbitary
>>statements, but that needed a protocol change so it didn't happen.
>
>
> What do you mean about arbitrary statements? Non-prepared ones, or
> non-unnamed ones?

Non-unnamed ones. Adding flag on the Parse message that says when to
plan the statement (or maybe on each Bind message even).

-O


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: parameter hints to the optimizer
Date: 2004-08-08 00:10:07
Message-ID: 200408080010.i780A7l22395@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Oliver Jowett wrote:
> Bruce Momjian wrote:
> > Oliver Jowett wrote:
> >
> >>Merlin Moncure wrote:
> >>
> >>
> >>>Another way to deal with the problem is to defer plan generation until
> >>>the first plan execution and use the parameters from that execution.
> >>
> >>When talking the V3 protocol, 7.5 defers plan generation for the unnamed
> >>statement until parameters are received in the Bind message (which is
> >>essentially the same as what you describe). There was some discussion at
> >>the time about making it more flexible so you could apply it to arbitary
> >>statements, but that needed a protocol change so it didn't happen.
> >
> >
> > What do you mean about arbitrary statements? Non-prepared ones, or
> > non-unnamed ones?
>
> Non-unnamed ones. Adding flag on the Parse message that says when to
> plan the statement (or maybe on each Bind message even).

OK, what are unnamed prepared statements? When are they used currently?
Only via the wire protocol? Who uses them now?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: parameter hints to the optimizer
Date: 2004-08-08 00:41:35
Message-ID: 411576BF.8060608@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian wrote:
> Oliver Jowett wrote:
>
>>Bruce Momjian wrote:
>>
>>>Oliver Jowett wrote:
>>>
>>>
>>>>Merlin Moncure wrote:
>>>>
>>>>
>>>>
>>>>>Another way to deal with the problem is to defer plan generation until
>>>>>the first plan execution and use the parameters from that execution.
>>>>
>>>>When talking the V3 protocol, 7.5 defers plan generation for the unnamed
>>>>statement until parameters are received in the Bind message (which is
>>>>essentially the same as what you describe). There was some discussion at
>>>>the time about making it more flexible so you could apply it to arbitary
>>>>statements, but that needed a protocol change so it didn't happen.
>>>
>>>
>>>What do you mean about arbitrary statements? Non-prepared ones, or
>>>non-unnamed ones?
>>
>>Non-unnamed ones. Adding flag on the Parse message that says when to
>>plan the statement (or maybe on each Bind message even).
>
>
> OK, what are unnamed prepared statements? When are they used currently?
> Only via the wire protocol? Who uses them now?

The unnamed prepared statement is like any other prepared statement
except it doesn't have a name :) It can be accessed via:

1) V3 protocol Parse/Bind with an empty statement name uses the unnamed
statement.
2) V2 or V3 "simple query" implicitly closes the unnamed statement.

CVS HEAD defers planning in case (1) until the Bind is received so it
can do planning cost estimation using concrete parameter values and
produce a better plan. It only does this for the unnamed statement, not
for named statements. If you Parse into a named statement, planning
happens immediately when the Parse is done.

This behaviour gives the client some flexibility without changing the
protocol. It means that using Parse/Bind on the unnamed statement with
parameters is essentially equivalent planning-wise to substituting the
parameter values into the actual query and submitting that instead.

What we talked about briefly was providing some way to control when
planning was done on a per-statement basis -- so you could say "don't
defer planning for this unnamed query because I'm going to reuse the
unnamed statement multiple times and the first set of parameters might
not generate an efficient plan" or "do defer planning of this named
query because I know I will be executing it with many similar parameter
values and estimating using the first set of parameters gives a good plan".

Or an alternative is to have a way to control query replanning on each
Bind individually -- so a client can get the benefit of skipping the
parse step on subsequent executions and is able to pass parameters via
Bind, but the query is replanned for the concrete parameter values on
each execution. The JDBC driver wants this -- currently the use of named
statements has to be explicitly turned on as with the current behaviour
you may take a performance hit due to less-than-ideal plans as soon as
you start using named statements.

So maybe the TODO should be something like "allow finer-grained client
control of query estimation and (re-)planning when using Parse/Bind".

-O


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: parameter hints to the optimizer
Date: 2004-08-12 19:45:11
Message-ID: 200408121945.i7CJjBu12020@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


OK, I added this TODO:

* Allow finer control over the caching of prepared query plans

Currently, queries prepared via the libpq API are planned on first
execute using the supplied parameters --- allow SQL PREPARE to do the
same. Also, allow control over replanning prepared queries either
manually or automatically when statistics for execute parameters
differ dramatically from those used during planning.

---------------------------------------------------------------------------

Oliver Jowett wrote:
> Bruce Momjian wrote:
> > Oliver Jowett wrote:
> >
> >>Bruce Momjian wrote:
> >>
> >>>Oliver Jowett wrote:
> >>>
> >>>
> >>>>Merlin Moncure wrote:
> >>>>
> >>>>
> >>>>
> >>>>>Another way to deal with the problem is to defer plan generation until
> >>>>>the first plan execution and use the parameters from that execution.
> >>>>
> >>>>When talking the V3 protocol, 7.5 defers plan generation for the unnamed
> >>>>statement until parameters are received in the Bind message (which is
> >>>>essentially the same as what you describe). There was some discussion at
> >>>>the time about making it more flexible so you could apply it to arbitary
> >>>>statements, but that needed a protocol change so it didn't happen.
> >>>
> >>>
> >>>What do you mean about arbitrary statements? Non-prepared ones, or
> >>>non-unnamed ones?
> >>
> >>Non-unnamed ones. Adding flag on the Parse message that says when to
> >>plan the statement (or maybe on each Bind message even).
> >
> >
> > OK, what are unnamed prepared statements? When are they used currently?
> > Only via the wire protocol? Who uses them now?
>
> The unnamed prepared statement is like any other prepared statement
> except it doesn't have a name :) It can be accessed via:
>
> 1) V3 protocol Parse/Bind with an empty statement name uses the unnamed
> statement.
> 2) V2 or V3 "simple query" implicitly closes the unnamed statement.
>
> CVS HEAD defers planning in case (1) until the Bind is received so it
> can do planning cost estimation using concrete parameter values and
> produce a better plan. It only does this for the unnamed statement, not
> for named statements. If you Parse into a named statement, planning
> happens immediately when the Parse is done.
>
> This behaviour gives the client some flexibility without changing the
> protocol. It means that using Parse/Bind on the unnamed statement with
> parameters is essentially equivalent planning-wise to substituting the
> parameter values into the actual query and submitting that instead.
>
> What we talked about briefly was providing some way to control when
> planning was done on a per-statement basis -- so you could say "don't
> defer planning for this unnamed query because I'm going to reuse the
> unnamed statement multiple times and the first set of parameters might
> not generate an efficient plan" or "do defer planning of this named
> query because I know I will be executing it with many similar parameter
> values and estimating using the first set of parameters gives a good plan".
>
> Or an alternative is to have a way to control query replanning on each
> Bind individually -- so a client can get the benefit of skipping the
> parse step on subsequent executions and is able to pass parameters via
> Bind, but the query is replanned for the concrete parameter values on
> each execution. The JDBC driver wants this -- currently the use of named
> statements has to be explicitly turned on as with the current behaviour
> you may take a performance hit due to less-than-ideal plans as soon as
> you start using named statements.
>
> So maybe the TODO should be something like "allow finer-grained client
> control of query estimation and (re-)planning when using Parse/Bind".
>
> -O
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073