Re: Hash vs. HashJoin nodes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash vs. HashJoin nodes
Date: 2005-03-31 04:37:23
Message-ID: 1931.1112243843@sss.pgh.pa.us
Lists: pgsql-hackers

Neil Conway <neilc(at)samurai(dot)com> writes:
> ... I'm wondering if there is any value to maintaining the hash
> vs. hash join distinction in the first place.)

One small objection is that we'd lose the ability to separately display
the time spent building the hash table in EXPLAIN ANALYZE output. It's
probably not super important, but might be a reason to keep two plan
nodes in the tree.

I recall having looked at related ideas (not this one exactly) and being
discouraged by the fact that pulling a tuple from *either* input first
is demonstrably a losing strategy, since either input might have a very
high startup cost. You could possibly ameliorate that by comparing the
estimated startup costs for the two inputs and pulling from the
estimated-cheaper one first.
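
The heuristic in the paragraph above could be sketched roughly as follows. This is only an illustration, not PostgreSQL executor code: the `PlanInput` class, its fields, and `choose_first_input` are hypothetical names, and breaking ties in favor of the outer input is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class PlanInput:
    """Hypothetical stand-in for one input of a join node."""
    name: str
    startup_cost: float  # estimated cost before the first tuple is returned
    total_cost: float    # estimated cost to return all tuples


def choose_first_input(outer: PlanInput, inner: PlanInput) -> PlanInput:
    """Pick the input to pull the first tuple from.

    Compares the estimated startup costs and returns the cheaper input.
    Ties go to the outer input (an arbitrary choice for this sketch).
    """
    if outer.startup_cost <= inner.startup_cost:
        return outer
    return inner


# Example: the outer side is a plain scan, the inner side must sort first,
# so the inner side has a much higher estimated startup cost.
outer = PlanInput(name="outer scan", startup_cost=0.0, total_cost=100.0)
inner = PlanInput(name="inner sort", startup_cost=500.0, total_cost=520.0)
first = choose_first_input(outer, inner)
```

Here `first` is the outer input, because its estimated startup cost is lower. Since the costs are only estimates, the choice can still be wrong at run time.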

This could all get pretty hairy when you consider that it still has to
work for left joins too ...

regards, tom lane
