Re: EquivalenceClasses and subqueries and PlaceHolderVars, oh my

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: EquivalenceClasses and subqueries and PlaceHolderVars, oh my
Date: 2012-03-15 01:29:49
Message-ID: 10461.1331774989@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I looked into the problem complained of here:
http://archives.postgresql.org/pgsql-bugs/2012-03/msg00016.php
It's not at all specific to custom types; you can exhibit it with
this query in the regression database:

explain select * from
(select 1 as t, unique1 from tenk1 a
union all
select 2 as t, unique1 from tenk1 b) c
where t = 2;

9.0 successfully optimizes away the excluded subquery:

QUERY PLAN
-------------------------------------------------------------------------
Result (cost=0.00..458.00 rows=10000 width=8)
-> Append (cost=0.00..458.00 rows=10000 width=8)
-> Seq Scan on tenk1 b (cost=0.00..458.00 rows=10000 width=8)
(3 rows)

but 9.1 and HEAD not so much:

QUERY PLAN
----------------------------------------------------------------------
Result (cost=0.00..966.00 rows=100 width=8)
-> Append (cost=0.00..966.00 rows=100 width=8)
-> Seq Scan on tenk1 a (cost=0.00..483.00 rows=50 width=8)
Filter: (1 = 2)
-> Seq Scan on tenk1 b (cost=0.00..483.00 rows=50 width=8)
Filter: (2 = 2)
(6 rows)

This is a consequence of commit 57664ed25e5dea117158a2e663c29e60b3546e1c,
which was already known to cause some issues as per commit
b28ffd0fcc583c1811e5295279e7d4366c3cae6c. This gripe is basically the
same problem: when we push the "t = 2" condition down into the subqueries,
"t" is no longer replaced with just constant 1 or constant 2, but with
those constants wrapped in PlaceHolderVar nodes. In this case that
prevents constant-folding from realizing it can simplify "1 = 2" or
"2 = 2" to constant false or true, whereas in the previous complaint
the PHV wrappers were interfering with matching expressions to indexes.

I spent a fair amount of time thinking about whether we could revert
that patch and solve the original problem some other way, but with no
real success. The original problem was reported here:
http://archives.postgresql.org/pgsql-hackers/2011-11/msg00419.php
with an example equivalent to this variant of the previous query:

explain select * from
(select thousand as t1, tenthous as t2 from tenk1 a
union all
select 42 as t1, 42 as t2 from tenk1 b) c
order by t1, t2;

There is an EquivalenceClass for each of "t1" and "t2", and if we don't
do something like wrapping the constants with distinct PHVs, then
add_child_rel_equivalences will end up pushing identical constants into
both ECs, thus totally bollixing the fundamental rule that any expression
should match at most one EC.

Another variant of this is where there are identical Vars rather than
constants in one of the subqueries:

explain select * from
(select thousand as t1, tenthous as t2 from tenk1 a
union all
select unique2 as t1, unique2 as t2 from tenk1 b) c
order by t1, t2;

I chose this example to match existing indexes in the regression database:
the ideal plan would do an indexscan on the (thousand, tenthous) index
for the first arm, and an indexscan on the (unique2) index for the second
arm, and MergeAppend them together. In general the planner is aware that
"ORDER BY x, x" is the same as "ORDER BY x", so you'd think it could apply
that principle to the second arm of this union ... but it can't. To do
that, it would have to realize that the unique2 index matches both of the
EquivalenceClasses in this query, and that's totally outside its model of
reality.

It seems to me that to do a really nice job with this sort of situation,
we would need some more general concept than EquivalenceClasses. I'm
not sure what, though I have a vague feeling that it might look like
EquivalenceClasses that are only valid within some sub-area of a query.

Now, this is a sufficiently weird corner case that I'm not desirous of
making major planner design changes just to improve this particular
outcome (and in any case that doesn't sound like a backpatchable bug
fix). But down the road we may think of more reasons why we need a
better idea than EquivalenceClasses.

In the meantime, the best solution I've been able to think of goes like
this: continue to add PHVs on to duplicated or non-Var subquery outputs
when propagating those outputs into the outer query, but then strip them
off again when propagating transformed outer expressions down into the
sub-query. There are basically only two places where we do the latter ---
set_append_rel_pathlist in allpaths.c propagates the inheritance parent's
baserestrictlist and other attachments to child rels, and
match_eclass_clauses_to_index extracts modified join clauses from
EquivalenceClasses. So it's a bit ugly but should be a localized fix,
and it would allow us to revert b28ffd0fcc583c1811e5295279e7d4366c3cae6c
because the problem would be taken care of at a higher level. This
would not fix the problem shown in the last example, that ideally we
should be able to match an index to more than one EquivalenceClass;
the planner would just take the first match. In just about all of the
non-contrived cases I've been able to think of, this is all right because
there's only one plausible match anyhow, so I'm not terribly concerned.

Comments?

regards, tom lane


From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: EquivalenceClasses and subqueries and PlaceHolderVars, oh my
Date: 2012-03-15 09:16:03
Message-ID: 4F61B353.2010700@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-03-15 02:29, Tom Lane wrote:
>
> explain select * from
> (select thousand as t1, tenthous as t2 from tenk1 a
> union all
> select 42 as t1, 42 as t2 from tenk1 b) c
> order by t1, t2;
>
> There is an EquivalenceClass for each of "t1" and "t2", and if we don't
> do something like wrapping the constants with distinct PHVs, then
> add_child_rel_equivalences will end up pushing identical constants into
> both ECs, thus totally bollixing the fundamental rule that any expression
> should match at most one EC.

I'm having a hard time imagining that add_child_rel_equivalences is not
just plain wrong. Even though it will only add child equivalence members
to a parent eq class when certain conditions are met, isn't it the case
that since a union (all) is addition of tuples and not joining, any kind
of propagating restrictions on a append rel child member to other areas
of the plan can cause unwanted results, like the ones currently seen?

regards,
Yeb Havinga


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Yeb Havinga <yebhavinga(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: EquivalenceClasses and subqueries and PlaceHolderVars, oh my
Date: 2012-03-15 14:37:55
Message-ID: 19935.1331822275@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Yeb Havinga <yebhavinga(at)gmail(dot)com> writes:
> On 2012-03-15 02:29, Tom Lane wrote:
>>> There is an EquivalenceClass for each of "t1" and "t2", and if we don't
>>> do something like wrapping the constants with distinct PHVs, then
>>> add_child_rel_equivalences will end up pushing identical constants into
>>> both ECs, thus totally bollixing the fundamental rule that any expression
>>> should match at most one EC.

> I'm having a hard time imagining that add_child_rel_equivalences is not
> just plain wrong. Even though it will only add child equivalence members
> to a parent eq class when certain conditions are met, isn't it the case
> that since a union (all) is addition of tuples and not joining, any kind
> of propagating restrictions on a append rel child member to other areas
> of the plan can cause unwanted results, like the ones currently seen?

None of the known problems are the fault of that, really. The child
entries don't cause merging of ECs, which would be the only way that
they'd affect the semantics of the query at large. So in that sense
they are not really members of the EC but just some auxiliary entries
that ease figuring out whether a child expression matches an EC.

Having said that, I have been wondering whether there's a better way to
do that task. Right now prepare_sort_from_pathkeys has to do a lot of
fairly ugly, heuristic stuff to do that matching at all. The other
area of the code that has to match child expressions to ECs is index
path generation, and we already know that code isn't terribly happy with
the PHV-based solution either ...

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Yeb Havinga <yebhavinga(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: EquivalenceClasses and subqueries and PlaceHolderVars, oh my
Date: 2012-03-15 18:00:51
Message-ID: 23034.1331834451@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> Yeb Havinga <yebhavinga(at)gmail(dot)com> writes:
>> I'm having a hard time imagining that add_child_rel_equivalences is not
>> just plain wrong. Even though it will only add child equivalence members
>> to a parent eq class when certain conditions are met, isn't it the case
>> that since a union (all) is addition of tuples and not joining, any kind
>> of propagating restrictions on a append rel child member to other areas
>> of the plan can cause unwanted results, like the ones currently seen?

> None of the known problems are the fault of that, really. The child
> entries don't cause merging of ECs, which would be the only way that
> they'd affect the semantics of the query at large. So in that sense
> they are not really members of the EC but just some auxiliary entries
> that ease figuring out whether a child expression matches an EC.

After further thought about that, I've concluded that indeed my patch
57664ed25e5dea117158a2e663c29e60b3546e1c was just plain wrong, and
Teodor was more nearly on the right track than I was in the original
discussion. If child EC members aren't full-fledged members then
there's no a-priori reason why they need to be distinct from each other.
There are only a few functions that actually match anything to child
members (although there are some others that could use Asserts or tests
to make it clearer that they aren't paying attention to child members).
AFAICT, if we don't try to enforce uniqueness of child members, the only
consequences will be:

(1) It'll be order-dependent which EquivalenceClass a child index column
is thought to match. As I explained earlier, this is not really the
fault of this representational detail, but is a basic shortcoming of the
whole current concept of ECs. Taking the first match is fine for now.

(2) It'll be unclear which of several identical subplan output columns
should be sorted by in prepare_sort_from_pathkeys. Now ordinarily that
does not particularly matter --- if you have multiple identical
nonvolatile expressions, you can take any one (and we already have a
hack in there for the volatile case). I think it *only* matters for
MergeAppend, where we need to be sure that the sort column locations
match across all the children. However, we can fix that in some
localized way instead of screwing up ECs generally. The idea I have in
mind at the moment, since create_merge_append_plan already starts by
determining the sort column locations for the MergeAppend itself, is to
pass down that info to the calls for the child plans and insist that we
match to the same column locations we found for the parent MergeAppend.

So I now propose reverting the earlier two patches (but not their
regression test cases of course) and instead hacking MergeAppend plan
building as per (2).

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Yeb Havinga <yebhavinga(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: EquivalenceClasses and subqueries and PlaceHolderVars, oh my
Date: 2012-03-16 15:16:14
Message-ID: 11635.1331910974@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> So I now propose reverting the earlier two patches (but not their
> regression test cases of course) and instead hacking MergeAppend plan
> building as per (2).

Attached is a draft patch for that. There are several things going on
here:

* Revert the preceding two patches (except for their regression test
cases).

* Install defenses to ensure that child EC members are ignored except in
very specific contexts, and improve documentation around that. The most
significant change is that get_eclass_for_sort_expr now takes a Relids
argument, and won't consider child members unless they match the Relids.
It turns out that the places that were trying to match indexes to ECs
already acted that way; which I think I had done just as a speed
optimization, but it turns out to be important for correctness too.

* Rewrite prepare_sort_from_pathkeys() to allow required sort column
numbers to be passed in, thus fixing Teodor's original problem more
directly. As part of that, I removed createplan.c's add_sort_column()
code, which was meant as a last-ditch check that we don't build sorts
with useless duplicate sort columns. As far as I can tell, that's dead
code and has been for a long time, because we already eliminate
duplicate pathkeys or SortGroupClause list entries far upstream of this.
Even if there are some corner cases where it does something useful,
that's rare enough to not really justify expending the cycles. I had to
remove it because if it did fire and remove a sort column, the outer
loop in prepare_sort_from_pathkeys() wouldn't be in sync with the input
column-number array.

* While testing this I noticed that planagg.c failed to optimize min/max
aggregates on appendrels that are made from UNION ALL subqueries rather
than inherited relations. So this patch also tweaks planagg.c to handle
such cases.

Viewing the problem this way, the only known symptom is the originally
reported "MergeAppend child's targetlist doesn't match MergeAppend",
which of course is not an issue before 9.1, so I'm not going to try
to back-patch further than 9.1. It is possible that we need some of the
child EC member defenses further back, but I'll await some evidence of
user-visible bugs first.

regards, tom lane

Attachment Content-Type Size
placeholdervar-fix-this-time-for-sure.patch text/x-patch 52.2 KB

From: Greg Stark <stark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Yeb Havinga <yebhavinga(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: EquivalenceClasses and subqueries and PlaceHolderVars, oh my
Date: 2012-03-17 04:07:14
Message-ID: CAM-w4HNgteqrE=2MuunNgdApjA-tys+o_Cjz7x9xGvcWF0VkTA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 16, 2012 at 3:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> So I now propose reverting the earlier two patches (but not their
>> regression test cases of course) and instead hacking MergeAppend plan
>> building as per (2).

As a wise man once said, "This is tricky stuff". I feel a better that
I got stuck on this stuff when you're still trying to feel your way
after this many go-arounds.

I think I'll be aiming at some less subtle things when I get back to
contributing. Things like the getrusage patch I still haven't
forgotten about.

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Yeb Havinga <yebhavinga(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: EquivalenceClasses and subqueries and PlaceHolderVars, oh my
Date: 2012-03-17 14:46:52
Message-ID: 6870.1331995612@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)mit(dot)edu> writes:
> On Fri, Mar 16, 2012 at 3:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> So I now propose reverting the earlier two patches (but not their
> regression test cases of course) and instead hacking MergeAppend plan
> building as per (2).

> As a wise man once said, "This is tricky stuff". I feel a better that
> I got stuck on this stuff when you're still trying to feel your way
> after this many go-arounds.

Well, looking back on it, I feel this was at bottom a documentation
failure. I think that when I wrote the EquivalenceClass code, I knew
that "child" members did not have similar semantics to regular members.
But I had forgotten that when Teodor reported the MergeAppend bug,
and so misdiagnosed what I was seeing happen as being corruption of
the EC contents, when it wasn't really. I added some documentation
around this point in the patch I committed yesterday...

regards, tom lane