Re: Hash Join cost estimates

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hash Join cost estimates
Date: 2013-04-04 18:19:36
Message-ID: 5420.1365099576@sss.pgh.pa.us
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> I've been fiddling with this on the very much larger overall database
> where this test case came from and have found that hashing the large
> table can actually be *faster* and appears to cause a more consistent
> and constant amount of disk i/o (which is good).

Interesting.

> What I'm trying to get at in this overall email is: why in the world is
> it so expensive to do hash lookups?

perf or oprofile reveal anything?

Also, I assume that the cases you are looking at are large enough that
even the "small" table doesn't fit in a single hash batch? It could
well be that the answer has to do with some bogus or at least
unintuitive behavior of the batching process, and it isn't really at all
a matter of individual hash lookups being slow.

(You never did mention what work_mem setting you're testing, anyway.)
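
[Editor's illustration, not part of the original message.] To make the batching question concrete, here is a rough sketch of how the number of hash batches might scale with the inner relation's size and work_mem. It assumes the batch count is simply the inner relation's byte size divided by work_mem, rounded up to a power of two. The function name and parameters are hypothetical, and per-tuple and bucket overhead are ignored, so this is not the executor's actual sizing logic. The real figure appears as "Batches:" on the Hash node in EXPLAIN ANALYZE output.

```python
import math


def estimate_hash_batches(inner_rows, row_width_bytes, work_mem_kb):
    """Hypothetical back-of-envelope estimate of hash join batches.

    Assumption: the hash table needs inner_rows * row_width_bytes bytes,
    with no per-tuple or bucket overhead. PostgreSQL's real calculation
    accounts for more than this.
    """
    inner_bytes = inner_rows * row_width_bytes
    work_mem_bytes = work_mem_kb * 1024

    # Everything fits in memory: a single batch, no spilling to disk.
    if inner_bytes <= work_mem_bytes:
        return 1

    # Otherwise split into enough batches that each one fits,
    # rounded up to the next power of two.
    needed = math.ceil(inner_bytes / work_mem_bytes)
    return 1 << (needed - 1).bit_length()


if __name__ == "__main__":
    # A small inner table under 1MB of work_mem stays in one batch.
    print(estimate_hash_batches(1_000, 100, 1024))
    # A much larger inner table under the same setting needs many batches.
    print(estimate_hash_batches(10_000_000, 100, 1024))
```

Under these assumptions, once the inner side exceeds work_mem every additional doubling of its size doubles the batch count, which is why the work_mem setting matters when comparing which table to hash.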

regards, tom lane
