From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: costing of hash join
Date: 2014-01-03 22:18:50
Message-ID: CAMkU=1wMr-Ft8=hFfXDy+KRjPVgrN5LPovaLwkrQH9+YzU277g@mail.gmail.com
Lists: pgsql-hackers

I'm trying to figure out why hash joins seem to be systematically underused
in my hands. In the case I am immediately looking at, the planner prefers a
merge join with both inputs getting seq scanned and sorted, despite the hash
join actually being 2 to 3 times faster, with inputs and intermediate working
sets all in memory. I normally wouldn't worry about a factor of 3 error, but
I see this a lot in many different situations. The row estimates are very
close to actual; the error is only in the CPU estimates.

A hash join is charged cpu_tuple_cost for each inner tuple, to model
inserting it into the hash table:

* charge one cpu_operator_cost for each column's hash function. Also,
* tack on one cpu_tuple_cost per inner row, to model the costs of
* inserting the row into the hashtable.
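
For reference, the code right below that comment in costsize.c (in
initial_cost_hashjoin; quoted approximately from memory, so the exact
lines may differ slightly from current sources) applies the charge as:

    startup_cost += (cpu_operator_cost * num_hashclauses + cpu_tuple_cost)
        * inner_path_rows;
    run_cost += cpu_operator_cost * num_hashclauses * outer_path_rows;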

But a sort is not charged similarly for inserting a tuple into the sort
memory pool:

* Also charge a small amount (arbitrarily set equal to operator cost) per
* extracted tuple. We don't charge cpu_tuple_cost because a Sort node
* doesn't do qual-checking or projection, so it has less overhead than
* most plan nodes. Note it's correct to use tuples not output_tuples
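
The per-tuple charge that comment describes amounts to (again quoted
approximately, so the surrounding code may differ):

    run_cost += cpu_operator_cost * tuples;

so per tuple the sort pays only cpu_operator_cost, while the hash side pays
cpu_operator_cost plus cpu_tuple_cost.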

Are these operations different enough to justify this difference? The
qual-checking (and I think projection) needed on a hash join should already
have been performed by, and costed to, the seq scan feeding the hash join,
right?
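
To put numbers on the asymmetry, here is a toy calculation (standalone C,
not taken from the PostgreSQL sources; the single join column is a
hypothetical) using the default cost settings:

    #include <stdio.h>

    int main(void)
    {
        /* default planner settings */
        const double cpu_tuple_cost = 0.01;
        const double cpu_operator_cost = 0.0025;

        /* per inner tuple, the hash join pays one hash-function eval
         * per join column (one column assumed here) plus the
         * cpu_tuple_cost insertion charge in question */
        double hash_charge = cpu_operator_cost + cpu_tuple_cost;

        /* per tuple, the sort pays only the small extraction charge;
         * comparison costs are charged separately, at startup */
        double sort_charge = cpu_operator_cost;

        printf("hash per-tuple charge: %.4f\n", hash_charge); /* 0.0125 */
        printf("sort per-tuple charge: %.4f\n", sort_charge); /* 0.0025 */
        printf("ratio: %.1fx\n", hash_charge / sort_charge);  /* 5.0x */
        return 0;
    }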

Cheers,

Jeff


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: costing of hash join
Date: 2014-01-03 22:50:55
Message-ID: 27396.1388789455@sss.pgh.pa.us
Lists: pgsql-hackers

Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> I'm trying to figure out why hash joins seem to be systematically underused
> in my hands. In the case I am immediately looking at, the planner prefers a
> merge join with both inputs getting seq scanned and sorted, despite the hash
> join actually being 2 to 3 times faster, with inputs and intermediate working
> sets all in memory. I normally wouldn't worry about a factor of 3 error, but
> I see this a lot in many different situations. The row estimates are very
> close to actual; the error is only in the CPU estimates.

Can you produce a test case for other people to look at?

What datatype(s) are the join keys?

> A hash join is charged cpu_tuple_cost for each inner tuple, to model
> inserting it into the hash table:

Doesn't seem like monkeying with that is going to account for a 3x error.

Have you tried using perf or oprofile or similar to see where the time is
actually, rather than theoretically, going?

regards, tom lane