Re: convert EXSITS to inner join gotcha and bug

Lists: pgsql-hackers
From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: convert EXSITS to inner join gotcha and bug
Date: 2017-04-28 09:11:19
Message-ID: f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru

Hi!

There seem to be two issues:

1) Sometimes conditions which at first glance could be pushed down to a scan are
left as join quals. That can cost a ~30 times performance loss.

2) The number of rows in the query result depends on the enable_seqscan variable.

The query
explain analyze
SELECT
    *
FROM
    t1
    INNER JOIN t2 ON (
        EXISTS (
            SELECT
                true
            FROM
                t3
            WHERE
                t3.id1 = t1.id AND
                t3.id2 = t2.id
        )
    )
WHERE
    t1.name = '5c5fec6a41b8809972870abc154b3ecd'
;

produces the following plan:
Nested Loop  (cost=6.42..1928.71 rows=1 width=99) (actual time=71.415..148.922 rows=162 loops=1)
   Join Filter: (t3.id1 = t1.id)
   Rows Removed by Join Filter: 70368
   ->  Index Only Scan using t1i2 on t1  (cost=0.28..8.30 rows=1 width=66) (actual time=0.100..0.103 rows=1 loops=1)
         Index Cond: (name = '5c5fec6a41b8809972870abc154b3ecd'::text)
         Heap Fetches: 1
   ->  Hash Join  (cost=6.14..1918.37 rows=163 width=66) (actual time=0.370..120.971 rows=70530 loops=1)
(1)      Hash Cond: (t3.id2 = t2.id)
(2)      ->  Seq Scan on t3  (cost=0.00..1576.30 rows=70530 width=66) (actual time=0.017..27.424 rows=70530 loops=1)
         ->  Hash  (cost=3.84..3.84 rows=184 width=33) (actual time=0.273..0.273 rows=184 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 20kB
               ->  Seq Scan on t2  (cost=0.00..3.84 rows=184 width=33) (actual time=0.017..0.105 rows=184 loops=1)
Planning time: 7.326 ms
Execution time: 149.115 ms

Condition (1) is not pushed down to scan (2), although it seemingly could be
moved there safely. With enable_seqscan = off the condition is not pushed down
either, but the query returns only one row instead of 162. The scan on t3
returns ~70000 rows, but only ~150 rows are really needed. I couldn't find a
combination of enable_* GUCs that pushes the condition down, so it seems to me
that either there is a reason for this that I don't see, or support for it is
simply missing.

If the pair (t3.id1, t3.id2) is unique (see the dump; there is a unique index on
them), the query can be rewritten directly as an inner join, and its plan is:
Nested Loop  (cost=9.70..299.96 rows=25 width=66) (actual time=0.376..5.232 rows=162 loops=1)
   ->  Nested Loop  (cost=9.43..292.77 rows=25 width=99) (actual time=0.316..0.645 rows=162 loops=1)
         ->  Index Only Scan using t1i2 on t1  (cost=0.28..8.30 rows=1 width=66) (actual time=0.047..0.050 rows=1 loops=1)
               Index Cond: (name = '5c5fec6a41b8809972870abc154b3ecd'::text)
               Heap Fetches: 1
         ->  Bitmap Heap Scan on t3  (cost=9.15..283.53 rows=94 width=66) (actual time=0.257..0.426 rows=162 loops=1)
               Recheck Cond: (id1 = t1.id)
               Heap Blocks: exact=3
               ->  Bitmap Index Scan on t3i1  (cost=0.00..9.12 rows=94 width=0) (actual time=0.186..0.186 rows=162 loops=1)
                     Index Cond: (id1 = t1.id)
   ->  Index Only Scan using t2i1 on t2  (cost=0.27..0.29 rows=1 width=33) (actual time=0.024..0.024 rows=1 loops=162)
         Index Cond: (id = t3.id2)
         Heap Fetches: 162
Planning time: 5.532 ms
Execution time: 5.457 ms

The second plan is ~30 times faster. But with sequential scans turned off the
first query does not work correctly, which points to some bug in the planner, I
suppose. Both 9.6 and 10devel are affected by the dependence of the query result
on the enable_seqscan variable.
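
A minimal sketch of why the rewrite is only safe when (t3.id1, t3.id2) is unique. It uses Python's sqlite3 module with a tiny made-up data set rather than PostgreSQL and the dump, so it shows query semantics only and says nothing about planner behaviour. The index name t3_pair and all row values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY);
    CREATE TABLE t3 (id1 INTEGER, id2 INTEGER);
    CREATE UNIQUE INDEX t3_pair ON t3 (id1, id2);  -- hypothetical index name
    INSERT INTO t1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO t2 VALUES (10), (20), (30);
    INSERT INTO t3 VALUES (1, 10), (1, 20), (2, 10), (2, 30);
""")

# The EXISTS form from the message above, reduced to the join keys.
exists_form = """
    SELECT t1.id, t2.id
    FROM t1 INNER JOIN t2 ON EXISTS (
        SELECT 1 FROM t3 WHERE t3.id1 = t1.id AND t3.id2 = t2.id)
    WHERE t1.name = 'a'
    ORDER BY 1, 2
"""
# The same query written as a plain inner join through t3.
join_form = """
    SELECT t1.id, t2.id
    FROM t1
    INNER JOIN t3 ON t3.id1 = t1.id
    INNER JOIN t2 ON t2.id = t3.id2
    WHERE t1.name = 'a'
    ORDER BY 1, 2
"""
exists_rows = conn.execute(exists_form).fetchall()
join_rows = conn.execute(join_form).fetchall()

# Drop the uniqueness guarantee and add a duplicate pair: the forms now differ.
conn.executescript("""
    DROP INDEX t3_pair;
    INSERT INTO t3 VALUES (1, 10);
""")
dup_exists = conn.execute(exists_form).fetchall()
dup_join = conn.execute(join_form).fetchall()

print(exists_rows == join_rows, dup_exists == dup_join)
```

With the unique index in place both forms return the same rows; once a duplicate pair exists, the plain join repeats a row while the EXISTS form does not.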

Dump to reproduce (a subset of real data, but obfuscated); the queries are in the attachment
http://sigaev.ru/misc/exists_to_nested.sql.gz
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
query.sql text/plain 658 bytes

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-28 09:48:29
Message-ID: 70a2b572-6ead-ef98-70fa-c1305d92efd5@sigaev.ru

> Both 9.6 and 10devel are affected by the dependence of the query result on the
> enable_seqscan variable.
Oops, I was too nervous: 9.6 is not affected by the enable_seqscan setting. But
it doesn't push down the condition either.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-28 12:45:04
Message-ID: CAPpHfdtY-S51uNfEHObmq+3gJK_Ohfx7pEMs7k5SEM-JUMjrNA@mail.gmail.com

On Fri, Apr 28, 2017 at 12:48 PM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:

>> Both 9.6 and 10devel are affected by the dependence of the query result on the
>> enable_seqscan variable.
>
> Oops, I was too nervous: 9.6 is not affected by the enable_seqscan setting.
> But it doesn't push down the condition either.

I've reproduced this bug on d981074c.
On default config, after loading example.sql.bz2 and VACUUM ANALYZE, query
result is OK.

# explain analyze SELECT
    *
FROM
    t1
    INNER JOIN t2 ON (
        EXISTS (
            SELECT
                true
            FROM
                t3
            WHERE
                t3.id1 = t1.id AND
                t3.id2 = t2.id
        )
    )
WHERE
    t1.name = '5c5fec6a41b8809972870abc154b3ecd';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=6.42..1924.71 rows=1 width=99) (actual time=14.044..34.957 rows=*162* loops=1)
   Join Filter: (t3.id1 = t1.id)
   Rows Removed by Join Filter: 70368
   ->  Index Only Scan using t1i2 on t1  (cost=0.28..4.30 rows=1 width=66) (actual time=0.026..0.028 rows=1 loops=1)
         Index Cond: (name = '5c5fec6a41b8809972870abc154b3ecd'::text)
         Heap Fetches: 0
   ->  Hash Join  (cost=6.14..1918.37 rows=163 width=66) (actual time=0.077..28.310 rows=70530 loops=1)
         Hash Cond: (t3.id2 = t2.id)
         ->  Seq Scan on t3  (cost=0.00..1576.30 rows=70530 width=66) (actual time=0.005..6.433 rows=70530 loops=1)
         ->  Hash  (cost=3.84..3.84 rows=184 width=33) (actual time=0.065..0.065 rows=184 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 20kB
               ->  Seq Scan on t2  (cost=0.00..3.84 rows=184 width=33) (actual time=0.003..0.025 rows=184 loops=1)
Planning time: 2.542 ms
Execution time: 35.008 ms
(14 rows)

But with seqscan and hashjoin disabled, query returns 0 rows.

# set enable_seqscan = off;
# set enable_hashjoin = off;
# explain analyze SELECT
    *
FROM
    t1
    INNER JOIN t2 ON (
        EXISTS (
            SELECT
                true
            FROM
                t3
            WHERE
                t3.id1 = t1.id AND
                t3.id2 = t2.id
        )
    )
WHERE
    t1.name = '5c5fec6a41b8809972870abc154b3ecd';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=0.97..5265.82 rows=1 width=99) (actual time=18.718..18.718 rows=*0* loops=1)
   Join Filter: (t3.id1 = t1.id)
   Rows Removed by Join Filter: 163
   ->  Index Only Scan using t1i2 on t1  (cost=0.28..4.30 rows=1 width=66) (actual time=0.024..0.024 rows=1 loops=1)
         Index Cond: (name = '5c5fec6a41b8809972870abc154b3ecd'::text)
         Heap Fetches: 0
   ->  Merge Join  (cost=0.69..5259.48 rows=163 width=66) (actual time=0.033..18.670 rows=163 loops=1)
         Merge Cond: (t2.id = t3.id2)
         ->  Index Only Scan using t2i1 on t2  (cost=0.27..19.03 rows=184 width=33) (actual time=0.015..0.038 rows=184 loops=1)
               Heap Fetches: 0
         ->  Index Only Scan using t3i2 on t3  (cost=0.42..4358.37 rows=70530 width=66) (actual time=0.015..10.484 rows=70094 loops=1)
               Heap Fetches: 0
Planning time: 2.571 ms
Execution time: 18.778 ms
(14 rows)

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-28 14:26:58
Message-ID: 26008.1493389618@sss.pgh.pa.us

Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> writes:
> I've reproduced this bug on d981074c.
> On default config, after loading example.sql.bz2 and VACUUM ANALYZE, query
> result is OK.
> But with seqscan and hashjoin disabled, query returns 0 rows.

Ah, thanks for the clue about enable_hashjoin, because it wasn't
reproducing for me as stated.

It looks like in the case that's giving wrong answers, the mergejoin
is wrongly getting marked as "Inner Unique". Something's a bit too
cheesy about that planner logic --- not sure what, yet.

regards, tom lane


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-28 14:42:08
Message-ID: 787bc5ab-7be6-fa43-e4bc-9e3c0436f8c7@sigaev.ru


> Ah, thanks for the clue about enable_hashjoin, because it wasn't
> reproducing for me as stated.
I missed the tweaked config, sorry.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-28 15:56:34
Message-ID: CAKJS1f8K=Bhdmw0KpWtiMtmYhQV8ZxvONCzpBwyO71-OkXFfKg@mail.gmail.com

On 29 April 2017 at 00:45, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> On default config, after loading example.sql.bz2 and VACUUM ANALYZE, query
> result is OK.

Hi,

Did you mean to attach this?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-28 15:59:45
Message-ID: 29208.1493395185@sss.pgh.pa.us

David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
> Did you mean to attach this?

See the link in Teodor's original message (it's actually a .bz2 file
not a .gz)

regards, tom lane


From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-28 16:29:18
Message-ID: CAKJS1f9GLO4+sULDrJEC9+Qksb3gyYDUemtkg2n0oOr8nXsD6g@mail.gmail.com

On 29 April 2017 at 02:26, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> writes:
>> I've reproduced this bug on d981074c.
>> On default config, after loading example.sql.bz2 and VACUUM ANALYZE, query
>> result is OK.
>> But with seqscan and hashjoin disabled, query returns 0 rows.
>
> Ah, thanks for the clue about enable_hashjoin, because it wasn't
> reproducing for me as stated.
>
> It looks like in the case that's giving wrong answers, the mergejoin
> is wrongly getting marked as "Inner Unique". Something's a bit too
> cheesy about that planner logic --- not sure what, yet.

Seems related to the unconditional setting of extra.inner_unique to
true for JOIN_UNIQUE_INNER jointypes in add_paths_to_joinrel()

Setting this based on the return value of innerrel_is_unique() as done
with the other join types seems to fix the issue.

I don't know yet if that's the correct fix. It's pretty late 'round
this side to be thinking too hard about it.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-28 21:37:58
Message-ID: CAPpHfdsX4CeC7OsiCjy5JgKmZLyiNsUWcqQzy_Pag4nUE6OfXw@mail.gmail.com

On Fri, Apr 28, 2017 at 6:59 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
> > Did you mean to attach this?
>
> See the link in Teodor's original message (it's actually a .bz2 file
> not a .gz)
>

Yes, I didn't mean Teodor has renamed it.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-28 21:54:53
Message-ID: 19680.1493416493@sss.pgh.pa.us

David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
> On 29 April 2017 at 02:26, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It looks like in the case that's giving wrong answers, the mergejoin
>> is wrongly getting marked as "Inner Unique". Something's a bit too
>> cheesy about that planner logic --- not sure what, yet.

> Seems related to the unconditional setting of extra.inner_unique to
> true for JOIN_UNIQUE_INNER jointypes in add_paths_to_joinrel()
> Setting this based on the return value of innerrel_is_unique() as done
> with the other join types seems to fix the issue.
> I don't know yet if that's the correct fix. It's pretty late 'round
> this side to be thinking too hard about it.

Yes, I think that's correct. I'd jumped to the conclusion that we could
skip making the test in this case, but this example shows that that's
wrong. The problem is that, in an example like this, create_unique_path
will create a path that's unique-ified for all the join keys of the
semijoin --- but we're considering joining against just a subset of the
semijoin's outer rels, so the inner path is NOT unique for that subset.
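
A toy illustration of that failure mode, assuming made-up rows and a nested loop in place of the merge join from the plan earlier in the thread. The helper nested_loop_join and its inner_unique parameter are hypothetical stand-ins for an executor that stops probing after the first inner match; they are not PostgreSQL code.

```python
def nested_loop_join(outer, inner, match, inner_unique):
    """Join two row lists; with inner_unique, stop probing after the first match."""
    out = []
    for o in outer:
        for i in inner:
            if match(o, i):
                out.append((o, i))
                if inner_unique:
                    break  # valid only if at most one inner row can match
    return out

# Hypothetical data: t3 is unique on the pair (id1, id2), but not on id2 alone.
t1 = [{"id": 1}]
t2 = [{"id": 10}, {"id": 20}]
t3 = [{"id1": 2, "id2": 10}, {"id1": 1, "id2": 10},
      {"id1": 2, "id2": 20}, {"id1": 1, "id2": 20}]

def run(inner_unique):
    # Step 1: join t2 to t3 on id2 only, i.e. against a subset of the outer rels.
    step1 = nested_loop_join(t2, t3, lambda a, b: a["id"] == b["id2"], inner_unique)
    # Step 2: apply the remaining clause t3.id1 = t1.id as a join filter.
    return [(a["id"], b["id"]) for a in t1 for (b, c) in step1 if c["id1"] == a["id"]]

correct = run(inner_unique=False)
wrong = run(inner_unique=True)
print(correct, wrong)
```

With the flag wrongly set, step 1 keeps only the first t3 row per t2 row, and the later filter on id1 then discards all of them, which is the same shape as the 0-row result reported above.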

We could possibly skip making the test if the outerrel contains
sjinfo->min_lefthand, but I'm not sufficiently excited about shaving
cycles here to take any new risks. Let's just call innerrel_is_unique()
and be done.

Will fix in a bit, once I've managed to create a smaller test case for
the regression tests.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-29 03:39:40
Message-ID: 14538.1493437180@sss.pgh.pa.us

I wrote:
> David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
>> On 29 April 2017 at 02:26, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Seems related to the unconditional setting of extra.inner_unique to
>> true for JOIN_UNIQUE_INNER jointypes in add_paths_to_joinrel()
>> Setting this based on the return value of innerrel_is_unique() as done
>> with the other join types seems to fix the issue.

> Yes, I think that's correct.

Well, "make check-world" disabused me of that notion: there are several
test cases in postgres_fdw that lost perfectly-valid inner_unique
markings. The reason is that create_unique_path will create uniqueness
even when you couldn't prove it from the underlying rel itself. So my
previous thought about comparing the outerrel to sjinfo->min_lefthand is
really necessary to avoid regressions from what we had before.
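
The comparison against sjinfo->min_lefthand can be pictured as a plain subset test. The sketch below is a simplified model in Python, not the patch itself: relation sets are frozensets of names, the function name unique_inner_flag is hypothetical, and treating {t1, t2} as the semijoin's minimum left-hand side is an assumption made for the example query in this thread.

```python
def unique_inner_flag(min_lefthand: frozenset, outer_relids: frozenset) -> bool:
    """Trust the unique-ified inner path only when the outer side
    contains every relation in the semijoin's minimum left-hand side."""
    return min_lefthand <= outer_relids

# Assumed for illustration: the EXISTS references both t1 and t2.
min_lefthand = frozenset({"t1", "t2"})

joining_t2_only = unique_inner_flag(min_lefthand, frozenset({"t2"}))
joining_t1_and_t2 = unique_inner_flag(min_lefthand, frozenset({"t1", "t2"}))
print(joining_t2_only, joining_t1_and_t2)
```

In this model, joining the unique-ified t3 to t2 alone does not get the marking, while joining it to the combined {t1, t2} relation does.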

However, while that seems to be enough to generate correct plans, it
doesn't address Teodor's performance complaint: he's wishing the planner
would notice that the semijoin inner rel is effectively unique, even when
the best plan involves initially joining the semijoin inner rel to just
a subset of the semijoin outer --- against which that inner rel is *not*
unique. Applying innerrel_is_unique() helps for some simpler cases, but
not this one.

Really, the way to fix Teodor's complaint is to recognize that the
semijoin inner rel is effectively unique against the whole outer rel,
and then strength-reduce the semijoin to a plain join. The infrastructure
we built for unique joins is capable of proving that, we just weren't
applying it in the right way.
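
A rough model of the strength-reduction idea, assuming that "effectively unique" can be approximated by checking that the equality-joined inner columns cover a unique key. All function names and data here are hypothetical, and the real proof in the planner involves more than this column check.

```python
from itertools import product

def covers_unique_key(inner_join_cols, unique_keys):
    """True if the equality-joined inner columns include every column of some unique key."""
    cols = set(inner_join_cols)
    return any(set(key) <= cols for key in unique_keys)

def semijoin(outer, inner, match):
    """Emit each outer row once if any inner row matches."""
    return [o for o in outer if any(match(o, i) for i in inner)]

def inner_join_outer_rows(outer, inner, match):
    """Emit an outer row once per matching inner row (plain join, outer columns only)."""
    return [o for o in outer for i in inner if match(o, i)]

# The outer side is the whole left-hand side: every (t1.id, t2.id) combination.
outer = list(product([1, 2], [10, 20, 30]))
t3 = [(1, 10), (1, 20), (2, 30)]  # unique on (id1, id2)
match = lambda o, i: o == i       # t3.id1 = t1.id AND t3.id2 = t2.id

reducible = covers_unique_key(["id1", "id2"], [("id1", "id2")])
semi = semijoin(outer, t3, match)
plain = inner_join_outer_rows(outer, t3, match)
print(reducible, semi == plain)
```

Because each outer row can match at most one t3 row, the plain join cannot produce duplicates, so it returns exactly what the semijoin returns and the planner is free to pick any join order.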

Attached are two alternative patches. The first just does the minimum
necessary to fix the bug; the second adds some code to perform
strength-reduction of semijoins. The second patch is capable of finding
the plan Teodor wanted for his test case --- in fact, left to its own
devices, it finds a *better* plan, both by cost and actual runtime.

I'm kind of strongly tempted to apply the second patch; but it would
be fair to complain that reduce_unique_semijoins() is new development
and should wait for v11. Opinions?

regards, tom lane

Attachment Content-Type Size
minimum-inner-unique-fix.patch text/x-diff 4.1 KB
reduce-semijoins.patch text/x-diff 17.3 KB

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-29 17:35:05
Message-ID: ead7502c-a0bf-641e-6ebd-69dddb9165a5@sigaev.ru

> Really, the way to fix Teodor's complaint is to recognize that the
> semijoin inner rel is effectively unique against the whole outer rel,
> and then strength-reduce the semijoin to a plain join. The infrastructure
> we built for unique joins is capable of proving that, we just weren't
> applying it in the right way.
Perfect, it works. Thank you! The second patch reduces the time of the full
query (my example was just a small part of it) from 20 minutes to 20 seconds.

> I'm kind of strongly tempted to apply the second patch; but it would
> be fair to complain that reduce_unique_semijoins() is new development
> and should wait for v11. Opinions?

Obviously, I'm voting for applying the second patch to version 10.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-30 00:14:27
Message-ID: CAKJS1f8pYcW3s-=DGXN6az1WgHUeZN1NJDpxAYUZzNMfNConGA@mail.gmail.com

On 29 April 2017 at 15:39, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I'm kind of strongly tempted to apply the second patch; but it would
> be fair to complain that reduce_unique_semijoins() is new development
> and should wait for v11. Opinions?

My vote is for the non-minimal patch. Of course, I'd be voting for
minimal patch if this was for a minor version release fix, but we're
not even in beta yet for v10. The original patch was intended to fix
cases like this, although I'd failed to realise this particular case.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: convert EXSITS to inner join gotcha and bug
Date: 2017-04-30 00:27:29
Message-ID: 30370.1493512049@sss.pgh.pa.us

David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
> On 29 April 2017 at 15:39, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I'm kind of strongly tempted to apply the second patch; but it would
>> be fair to complain that reduce_unique_semijoins() is new development
>> and should wait for v11. Opinions?

> My vote is for the non-minimal patch. Of course, I'd be voting for
> minimal patch if this was for a minor version release fix, but we're
> not even in beta yet for v10. The original patch was intended to fix
> cases like this, although I'd failed to realise this particular case.

Yeah, I thought we'd discussed doing something more or less like this
way back in that thread.

After studying the patch some more, I noticed that reduce_unique_semijoins
falsifies the assumption in innerrel_is_unique that we only probe inner
uniqueness for steadily larger relid sets. If the semijoin LHS is more
than one relation, then it'll test inner uniqueness using that LHS, and if
the proof fails, that's knowledge that can save individual proof attempts
for the individual LHS rels later on during the join search. So in the
attached, I've modified reduce_unique_semijoins's API a bit more to allow
the caller to override the don't-cache heuristic.
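
The caching argument can be sketched as follows. This is a simplified Python model, not the planner code: the class UniquenessCache, the method is_unique and the parameter name force_cache are assumptions for illustration, and any other conditions under which a failed proof might be cached are left out.

```python
class UniquenessCache:
    """Remembers uniqueness proofs for one inner rel, keyed by outer relid set."""

    def __init__(self):
        self.unique_for = []      # outer sets for which uniqueness was proven
        self.non_unique_for = []  # outer sets for which the proof failed

    def lookup(self, outer):
        if any(known <= outer for known in self.unique_for):
            return True   # proven for a subset, so it holds for this superset
        if any(outer <= known for known in self.non_unique_for):
            return False  # failed for a superset, so it fails for this subset
        return None

    def is_unique(self, outer, prove, force_cache=False):
        cached = self.lookup(outer)
        if cached is not None:
            return cached
        result = prove(outer)
        if result:
            self.unique_for.append(outer)
        elif force_cache:
            # By default failures are not remembered: a search that only grows
            # the outer set would never ask about a subset again.
            self.non_unique_for.append(outer)
        return result

calls = []

def never_unique(outer):
    """Stand-in proof routine that always fails and records each attempt."""
    calls.append(outer)
    return False

cache = UniquenessCache()
# The semijoin reduction probes the whole multi-relation LHS first ...
first = cache.is_unique(frozenset({"a", "b"}), never_unique, force_cache=True)
# ... so a later probe for a single LHS rel is answered from the cache.
later = cache.is_unique(frozenset({"a"}), never_unique)
print(first, later, len(calls))
```

Here the failed proof for the two-relation left-hand side is stored once, and the later single-relation probe never reaches the proof routine.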

Also, this form of the patch is an incremental patch over the minimal
fix I posted yesterday. It seems like a good idea to push it as a
separate commit, if only for future bisection purposes.

If I don't hear objections, I'll push this tomorrow sometime.

regards, tom lane

Attachment Content-Type Size
reduce-semijoins-2.patch text/x-diff 17.4 KB