Re: Effects of GUC settings on automatic replans

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Effects of GUC settings on automatic replans
Date: 2007-03-20 17:11:13
Message-ID: 15168.1174410673@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Now that there's a mechanism in the backend that will automatically replan
queries whenever anything changes about the referenced tables, we have to
worry about whether an automatic replan might cause surprising changes in
the behavior of a query. I looked through the available GUC settings to
see what would affect a replan, and came up with just four that would
potentially affect the semantics of the query:

search_path
add_missing_from
transform_null_equals
sql_inheritance

As I've already mentioned, I think we must address search_path by saving
the path at time of first plan and using that same path during any replan.
However, I'm not excited about adding mechanism to similarly save and
restore the others. They're all for legacy-app compatibility and so
seem unlikely to be changed on-the-fly within a session. Also,
add_missing_from and transform_null_equals aren't going to affect sanely
written queries in the first place. sql_inheritance is a little bit
bigger deal, but I wonder whether we shouldn't just remove that variable
altogether --- it's been default ON since 7.1 and I've not heard anyone
complain about that in a long time.

There are a boatload of other GUCs that could potentially result in
changes of planner choices:

enable_bitmapscan
enable_hashagg
enable_hashjoin
enable_indexscan
enable_mergejoin
enable_nestloop
enable_seqscan
enable_sort
enable_tidscan
constraint_exclusion
from_collapse_limit
join_collapse_limit
geqo
geqo_effort
geqo_generations
geqo_pool_size
geqo_selection_bias
geqo_threshold
seq_page_cost
random_page_cost
cpu_tuple_cost
cpu_index_tuple_cost
cpu_operator_cost
effective_cache_size
work_mem

I'm inclined not to worry about these, since changing them can't affect
the semantics of the query, at worst its performance.

One other question is exactly what "saving and restoring" search_path
should mean. We could do it textually and thus need to re-interpret
the string on each replan, or we could save the actual list of schema
OIDs. The main disadvantage of the textual way is that without some
special hack, it's possible that a replan would see the temp-table
schema as being frontmost when it had not been active at all originally;
that seems bad. OTOH if we save the OID list then it would not work
to drop a schema and rename another into its place, which is a bit
inconsistent with the fact that that does work for an individual table.

Comments?

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Effects of GUC settings on automatic replans
Date: 2007-03-21 08:46:54
Message-ID: 4600F0FE.6000300@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Now that there's a mechanism in the backend that will automatically replan
> queries whenever anything changes about the referenced tables, we have to
> worry about whether an automatic replan might cause surprising changes in
> the behavior of a query. I looked through the available GUC settings to
> see what would affect a replan, and came up with just four that would
> potentially affect the semantics of the query:
>
> search_path
> add_missing_from
> transform_null_equals
> sql_inheritance
>
> As I've already mentioned, I think we must address search_path by saving
> the path at time of first plan and using that same path during any replan.

Yes. I think this is the only secure way to do it.

> However, I'm not excited about adding mechanism to similarly save and
> restore the others. They're all for legacy-app compatibility and so
> seem unlikely to be changed on-the-fly within a session. Also,
> add_missing_from and transform_null_equals aren't going to affect sanely
> written queries in the first place. sql_inheritance is a little bit
> bigger deal, but I wonder whether we shouldn't just remove that variable
> altogether --- it's been default ON since 7.1 and I've not heard anyone
> complain about that in a long time.

Let's do a quick survey on a couple mailing lists.

>
> There are a boatload of other GUCs that could potentially result in
> changes of planner choices:

I think the only thing we need do about the GUCs is provide the user
with a command which flushes all plans for the session. Hmmmm, that's
not really necessary I suppose; one can easily reconnect.

For that matter, can anyone think why we'd need a command for the
superuser to flush all plans in the server? It seems like something we
ought to have, but I don't have a good case why ...

--Josh Berkus


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Effects of GUC settings on automatic replans
Date: 2007-03-21 14:56:23
Message-ID: 28192.1174488983@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> Tom Lane wrote:
>> sql_inheritance is a little bit
>> bigger deal, but I wonder whether we shouldn't just remove that variable
>> altogether --- it's been default ON since 7.1 and I've not heard anyone
>> complain about that in a long time.

> Let's do a quick survey on a couple mailing lists.

OK, I sent out a note to pgsql-general and pgsql-sql about that.

> For that matter, can anyone think why we'd need a command for the
> superuser to flush all plans in the server?

You mean, not only the current backend but other people's backends?
I can't see any very simple way to do that, and without a pretty
compelling use-case I'm not going to worry about it.

So far nobody's commented on the question of exactly what should be
saved and restored for a cached query's search_path. I think for the
moment I will go with the save-a-list-of-schema-OIDs approach, since
that seems fairly unlikely to be subvertible by a miscreant user.
We can always change the details later.

regards, tom lane


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Effects of GUC settings on automatic replans
Date: 2007-03-21 17:39:34
Message-ID: 46016DD6.9060807@Yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/20/2007 1:11 PM, Tom Lane wrote:
> Now that there's a mechanism in the backend that will automatically replan
> queries whenever anything changes about the referenced tables, we have to
> worry about whether an automatic replan might cause surprising changes in
> the behavior of a query. I looked through the available GUC settings to
> see what would affect a replan, and came up with just four that would
> potentially affect the semantics of the query:
>
> search_path
> add_missing_from
> transform_null_equals
> sql_inheritance

Don't we actually store the parsetree in the query cache, and doesn't
that actually make a lot of the above rather NOT affect the resulting
plan any more? The objects, used in the parsetree, are identified by OID
and got only looked up according to the search_path (for example) during
the initial parsing of the query.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Effects of GUC settings on automatic replans
Date: 2007-03-21 17:46:12
Message-ID: 12482.1174499172@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> On 3/20/2007 1:11 PM, Tom Lane wrote:
>> search_path
>> add_missing_from
>> transform_null_equals
>> sql_inheritance

> Don't we actually store the parsetree in the query cache, and doesn't
> that actually make a lot of the above rather NOT affect the resulting
> plan any more?

No, what the code now does is to store the raw grammar output --- a
replan includes a fresh pass through parse_analyze. This must happen
if we want the thing to cope with dropping and replacing temp tables,
which is one of the main use-cases ...

regards, tom lane


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Effects of GUC settings on automatic replans
Date: 2007-03-21 18:25:30
Message-ID: 4601789A.6030809@Yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/21/2007 1:46 PM, Tom Lane wrote:
> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
>> On 3/20/2007 1:11 PM, Tom Lane wrote:
>>> search_path
>>> add_missing_from
>>> transform_null_equals
>>> sql_inheritance
>
>> Don't we actually store the parsetree in the query cache, and doesn't
>> that actually make a lot of the above rather NOT affect the resulting
>> plan any more?
>
> No, what the code now does is to store the raw grammar output --- a
> replan includes a fresh pass through parse_analyze. This must happen
> if we want the thing to cope with dropping and replacing temp tables,
> which is one of the main use-cases ...

Ah, yes, that makes more cents now.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: Jim Nasby <decibel(at)decibel(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Effects of GUC settings on automatic replans
Date: 2007-03-24 17:20:35
Message-ID: 4DCB6439-B0B2-4654-9040-B94AFADD054F@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mar 21, 2007, at 5:11 AM, Tom Lane wrote:
> constraint_exclusion
>
> I'm inclined not to worry about these, since changing them can't
> affect
> the semantics of the query, at worst its performance.

Hrm... wasn't that option added in case there was a bug in the
exclusion code? I certainly can't think of any performance reason why
you'd want to disable it... so it might be worth invalidating plans
if it changes.
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Jim Nasby" <decibel(at)decibel(dot)org>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Effects of GUC settings on automatic replans
Date: 2007-03-25 00:06:58
Message-ID: 871wje4259.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Jim Nasby" <decibel(at)decibel(dot)org> writes:

> On Mar 21, 2007, at 5:11 AM, Tom Lane wrote:
>> constraint_exclusion
>>
>> I'm inclined not to worry about these, since changing them can't affect
>> the semantics of the query, at worst its performance.
>
> Hrm... wasn't that option added in case there was a bug in the exclusion code?
> I certainly can't think of any performance reason why you'd want to disable
> it... so it might be worth invalidating plans if it changes.

It was added precisely because we didn't have plan invalidation... If you
dropped a constraint there was previously nothing to force the plan to be
recalculated if it depended on the constraint for correctness.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jim Nasby <decibel(at)decibel(dot)org>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Effects of GUC settings on automatic replans
Date: 2007-03-25 17:31:22
Message-ID: 7774.1174843882@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jim Nasby <decibel(at)decibel(dot)org> writes:
> On Mar 21, 2007, at 5:11 AM, Tom Lane wrote:
>> constraint_exclusion

> Hrm... wasn't that option added in case there was a bug in the
> exclusion code?

Well, the "bug" was a lack of ways to get rid of plans that were
no longer valid because of constraint changes; a problem that no
longer exists now that the invalidation mechanism is there.
(Hm, I think the docs need some updates now...)

The other argument was that you might not want the costs of searching
for contradictory constraints if your workload was such that the search
never or hardly ever succeeds. That still justifies the existence of
this GUC variable, I think, but I don't see that it's a reason to force
replanning if the variable is changed. Certainly it's not any more
interesting than any of the other variables affecting planner behavior.

regards, tom lane


From: Jim Nasby <decibel(at)decibel(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Effects of GUC settings on automatic replans
Date: 2007-04-09 20:39:24
Message-ID: 09D87436-9EE8-4A95-AB45-B5375173D3ED@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mar 25, 2007, at 12:31 PM, Tom Lane wrote:
> Jim Nasby <decibel(at)decibel(dot)org> writes:
>> On Mar 21, 2007, at 5:11 AM, Tom Lane wrote:
>>> constraint_exclusion
>
>> Hrm... wasn't that option added in case there was a bug in the
>> exclusion code?
>
> Well, the "bug" was a lack of ways to get rid of plans that were
> no longer valid because of constraint changes; a problem that no
> longer exists now that the invalidation mechanism is there.
> (Hm, I think the docs need some updates now...)
>
> The other argument was that you might not want the costs of searching
> for contradictory constraints if your workload was such that the
> search
> never or hardly ever succeeds. That still justifies the existence of
> this GUC variable, I think, but I don't see that it's a reason to
> force
> replanning if the variable is changed. Certainly it's not any more
> interesting than any of the other variables affecting planner
> behavior.

I'm doubtful that there are any cases where not doing the search
would be worth the time saved, since it'd mean you'd be getting data
out of most/all partitions at that point...

If we are going to leave the GUC I think we should default it to ON.
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jim Nasby <decibel(at)decibel(dot)org>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Effects of GUC settings on automatic replans
Date: 2007-04-09 20:59:17
Message-ID: 10551.1176152357@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jim Nasby <decibel(at)decibel(dot)org> writes:
> On Mar 25, 2007, at 12:31 PM, Tom Lane wrote:
>> The other argument was that you might not want the costs of searching
>> for contradictory constraints if your workload was such that the
>> search
>> never or hardly ever succeeds. That still justifies the existence of
>> this GUC variable, I think, but I don't see that it's a reason to
>> force
>> replanning if the variable is changed. Certainly it's not any more
>> interesting than any of the other variables affecting planner
>> behavior.

> I'm doubtful that there are any cases where not doing the search
> would be worth the time saved, since it'd mean you'd be getting data
> out of most/all partitions at that point...

You've got some kind of blinders on, Jim ... queries against large
partitioned tables are not the only ones in the world, or even most
of them.

regards, tom lane