Re: Merge join and index scan strangeness

Lists: pgsql-hackers
From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Merge join and index scan strangeness
Date: 2010-02-19 13:09:34
Message-ID: 4B7E8D8E.4010004@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

I found something strange with merge join. Let there are two table
(http://www.sigaev.ru/misc/ex.sql.gz, 360Kb) t1 and t2, both without indexes.
Query is:
UPDATE
t1
SET
f1 = t1.f1 || t2.f1
FROM
t2
WHERE
t2.f1 = t1.f1 AND
t2.f2 = t1.f2 AND
t2.f3 = t1.f3 AND
t2.f4 = t1.f4
;

I forbid everything except merge join and index scan, so explain gives:
set enable_hashjoin=off;
set enable_nestloop=off;
set enable_seqscan=off;
set enable_bitmapscan=off;

Merge Join (cost=20000035240.26..20000388197.90 rows=14024070 width=82)
Merge Cond: ((t2.f1 = t1.f1) AND (t2.f2 = t1.f2) AND (t2.f3 = t1.f3) AND
(t2.f4 = t1.f4))
-> Sort (cost=10000000040.69..10000000042.19 rows=600 width=59)
Sort Key: t2.f1, t2.f2, t2.f3, t2.f4
-> Seq Scan on t2 (cost=10000000000.00..10000000013.00 rows=600
width=59)
-> Materialize (cost=10000035199.57..10000038135.06 rows=234839 width=65)
-> Sort (cost=10000035199.57..10000035786.67 rows=234839 width=65)
Sort Key: t1.f1, t1.f2, t1.f3, t1.f4
-> Seq Scan on t1 (cost=10000000000.00..10000005017.39
rows=234839 width=65)

All looks good at this point. Create index on suggested by merge join columns:
CREATE INDEX i1 ON t1 (f1, f2, f3, f4);
CREATE INDEX i2 ON t2 (f1, f2, f3, f4);

And explain:
Merge Join (cost=49897.68..402855.32 rows=14024070 width=82)
Merge Cond: ((t2.f4 = t1.f4) AND (t2.f1 = t1.f1) AND (t2.f2 = t1.f2) AND
(t2.f3 = t1.f3))
-> Sort (cost=90.81..92.31 rows=600 width=59)
Sort Key: t2.f4, t2.f1, t2.f2, t2.f3
-> Index Scan using i2 on t2 (cost=0.00..63.13 rows=600 width=59)
-> Materialize (cost=49806.86..52742.35 rows=234839 width=65)
-> Sort (cost=49806.86..50393.96 rows=234839 width=65)
Sort Key: t1.f4, t1.f1, t1.f2, t1.f3
-> Index Scan using i1 on t1 (cost=0.00..19624.68 rows=234839
width=65)

Merge join chooses another order of fields! It seems to me that index scan with
sort should be slower than pure index scan. Ok, add another indexes with
suggested column's order:

CREATE INDEX i11 ON t1 (f4, f1, f2, f3);
CREATE INDEX i21 ON t2 (f4, f1, f2, f3);

Explain:
Merge Join (cost=90.81..372665.64 rows=14024070 width=82)
Merge Cond: ((t1.f1 = t2.f1) AND (t1.f2 = t2.f2) AND (t1.f3 = t2.f3) AND
(t1.f4 = t2.f4))
-> Index Scan using i1 on t1 (cost=0.00..19624.68 rows=234839 width=65)
-> Sort (cost=90.81..92.31 rows=600 width=59)
Sort Key: t2.f1, t2.f2, t2.f3, t2.f4
-> Index Scan using i21 on t2 (cost=0.00..63.13 rows=600 width=59)

Megre join uses index scan but for table t2 it uses wrong index! And again index
scan + sort instead of index scan.

Am I miss something or misunderstand?

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Merge join and index scan strangeness
Date: 2010-02-19 13:14:41
Message-ID: 4B7E8EC1.6060109@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> I found something strange with merge join. Let there are two table

Sorry, postgresql's version is 8.4 from today CVS
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Merge join and index scan strangeness
Date: 2010-02-19 13:51:39
Message-ID: 4B7E976B.5070103@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor Sigaev wrote:
>
>> I found something strange with merge join. Let there are two table
>
> Sorry, postgresql's version is 8.4 from today CVS
For what it's worth - 8.4.0 gives as expected.

aap=# explain UPDATE
t1
SET
f1 = t1.f1 || t2.f1
FROM
t2
WHERE
t2.f1 = t1.f1 AND
t2.f2 = t1.f2 AND
t2.f3 = t1.f3 AND
t2.f4 = t1.f4
;
QUERY
PLAN
---------------------------------------------------------------------------------------------
Merge Join (cost=0.00..28522.60 rows=1 width=142)
Merge Cond: ((t1.f1 = t2.f1) AND (t1.f2 = t2.f2) AND (t1.f3 = t2.f3)
AND (t1.f4 = t2.f4))
-> Index Scan using i1 on t1 (cost=0.00..26090.94 rows=234839
width=110)
-> Index Scan using i2 on t2 (cost=0.00..77.25 rows=600 width=104)
(4 rows)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Merge join and index scan strangeness
Date: 2010-02-19 14:19:48
Message-ID: 2177.1266589188@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> I found something strange with merge join. Let there are two table

> Sorry, postgresql's version is 8.4 from today CVS

Can't reproduce it here, either in HEAD or 8.4. Sure you have a clean
build with no local modifications? The outright-incorrect last plan
you show seems to indicate something rather badly wrong with pathkey
matching.

regards, tom lane


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Merge join and index scan strangeness
Date: 2010-02-19 16:00:48
Message-ID: Pine.LNX.4.64.1002191859160.8265@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 19 Feb 2010, Tom Lane wrote:

> Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>>> I found something strange with merge join. Let there are two table
>
>> Sorry, postgresql's version is 8.4 from today CVS
>
> Can't reproduce it here, either in HEAD or 8.4. Sure you have a clean
> build with no local modifications? The outright-incorrect last plan
> you show seems to indicate something rather badly wrong with pathkey
> matching.

I reproduced on my machine
PostgreSQL 8.4.2 on x86_64-unknown-linux-gnu, compiled by GCC gcc (Ubuntu 4.4.1-4ubuntu9) 4.4.1, 64-bit
Notice, plan is different and index scan on t1 uses wrong index.

Merge Join (cost=45224.01..251225.22 rows=9760080 width=86) (actual time=1687.545..1687.545 rows=0 loops=1)
Merge Cond: ((t2.f1 = t1.f1) AND (t2.f2 = t1.f2) AND (t2.f3 = t1.f3) AND (t2.f4 = t1.f4))
-> Index Scan using i2 on t2 (cost=0.00..65.44 rows=600 width=59) (actual time=0.008..0.179 rows=600 loops=1)
-> Sort (cost=45224.01..45811.10 rows=234839 width=69) (actual time=1612.586..1645.436 rows=161842 loops=1)
Sort Key: t1.f1, t1.f2, t1.f3, t1.f4
Sort Method: external sort Disk: 20888kB
-> Index Scan using i11 on t1 (cost=0.00..24274.83 rows=234839 width=69) (actual time=0.637..137.659 rows=234839 loops=1)
Total runtime: 1969.029 ms
(8 rows)

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Merge join and index scan strangeness
Date: 2010-02-19 17:15:58
Message-ID: 5127.1266599758@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Poking a bit deeper, it *does* think the plan with sorts is cheaper than
without. The mergejoin plan it really prefers is:

regression=# set enable_hashjoin TO 0;
SET
regression=# set enable_nestloop TO 0;
SET
regression=# explain ...
QUERY PLAN
---------------------------------------------------------------------------------------------------
Update (cost=41.69..379234.38 rows=14225817 width=88)
-> Merge Join (cost=41.69..379234.38 rows=14225817 width=88)
Merge Cond: ((t1.f1 = t2.f1) AND (t1.f2 = t2.f2) AND (t1.f3 = t2.f3) AND (t1.f4 = t2.f4))
-> Index Scan using i1 on t1 (cost=0.00..21198.88 rows=234839 width=65)
-> Sort (cost=41.69..43.19 rows=600 width=65)
Sort Key: t2.f1, t2.f2, t2.f3, t2.f4
-> Seq Scan on t2 (cost=0.00..14.00 rows=600 width=65)
(7 rows)

but if you force it to use indexscans:

regression=# set enable_seqscan TO 0;
SET
regression=# explain ...
QUERY PLAN
---------------------------------------------------------------------------------------------------
Update (cost=52001.84..410457.60 rows=14225817 width=88)
-> Merge Join (cost=52001.84..410457.60 rows=14225817 width=88)
Merge Cond: ((t1.f3 = t2.f3) AND (t1.f1 = t2.f1) AND (t1.f2 = t2.f2) AND (t1.f4 = t2.f4))
-> Sort (cost=51783.56..52370.66 rows=234839 width=65)
Sort Key: t1.f3, t1.f1, t1.f2, t1.f4
-> Index Scan using i1 on t1 (cost=0.00..21198.88 rows=234839 width=65)
-> Sort (cost=93.12..94.62 rows=600 width=65)
Sort Key: t2.f3, t2.f1, t2.f2, t2.f4
-> Index Scan using i2 on t2 (cost=0.00..65.44 rows=600 width=65)
(9 rows)

and then without sorts:

regression=# set enable_sort TO 0;
SET
regression=# explain ...
QUERY PLAN
---------------------------------------------------------------------------------------------------
Update (cost=0.00..483609.37 rows=14225817 width=88)
-> Merge Join (cost=0.00..483609.37 rows=14225817 width=88)
Merge Cond: ((t2.f1 = t1.f1) AND (t2.f2 = t1.f2) AND (t2.f3 = t1.f3) AND (t2.f4 = t1.f4))
-> Index Scan using i2 on t2 (cost=0.00..65.44 rows=600 width=65)
-> Materialize (cost=0.00..23547.27 rows=234839 width=65)
-> Index Scan using i1 on t1 (cost=0.00..21198.88 rows=234839 width=65)
(6 rows)

Note that the join cost is way higher than the sum of the input costs in
all three cases. The reason for that is that it's expecting a whole lot
of rescanning of the inner relation due to duplicate merge keys. This
means that a "bare" inner indexscan is going to be penalized very
heavily for refetches, whereas plans with either sort or materialize in
between look better because the refetch cost is very low. So that's how
a plan with a sort can be preferred to one without. I think the weird
looking choices of sort order may just be randomness because all sort
orders cost the same once it decides to sort.

However, even given that, it's odd that it prefers a plan with two sorts
to a plan with one materialize. Poking around in costsize.c, I think
that the reason for this is that the rescan cost of a sort is estimated
at cpu_operator_cost per tuple, whereas rescanning a materialize node is
being estimated at cpu_tuple_cost per tuple. For a plan where rescan
cost is the dominant factor, that matters. We probably ought to make
those two estimates the same. Since neither plan node type does any
projection or qual checking, the lower number is probably the better
choice.

BTW, the real bottom line here is that mergejoin is a crummy plan choice
when there are so few distinct join key values. The planner would never
have picked any of these plans if you hadn't forced it to. So I'm not
sure how important this is in the real world.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Merge join and index scan strangeness
Date: 2010-02-20 00:52:06
Message-ID: 15045.1266627126@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> However, even given that, it's odd that it prefers a plan with two sorts
> to a plan with one materialize. Poking around in costsize.c, I think
> that the reason for this is that the rescan cost of a sort is estimated
> at cpu_operator_cost per tuple, whereas rescanning a materialize node is
> being estimated at cpu_tuple_cost per tuple. For a plan where rescan
> cost is the dominant factor, that matters. We probably ought to make
> those two estimates the same. Since neither plan node type does any
> projection or qual checking, the lower number is probably the better
> choice.

I've done that in HEAD. I'm loath to touch it in the back branches,
though, because the logic in that area now is quite different from what
it was in 8.4 and earlier. As I said before, I think this isn't too
important in cases where you're not forcing a mergejoin, so it seems
better to not risk destabilizing plans in released branches.

regards, tom lane