Hash vs. HashJoin nodes

From: Neil Conway <neilc(at)samurai(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Hash vs. HashJoin nodes
Date: 2005-03-31 03:57:01
Message-ID: 424B750D.7050009@samurai.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Is there a reason why the implementation of hash joins uses a separate
"hash" child node? AFAICS that node is only used in hash joins. Perhaps
the intent was to be able to provide a generic "hashing" capability that
could be used by any part of the executor that needs to hash tuples, but
AFAICS the hash node is not currently used in that way.

(The reason I ask is that Andrew @ Supernews and I were discussing a
potential minor improvement to the hash join implementation. If either
of the inputs to an inner hash join is empty, we can avoid building the
hash table or reading the other join relation. The existing code works
fine if it is the inner hash relation that is empty (since that is read
first), but if the outer join relation is empty we do a lot of
unnecessary work. We could improve this by first pulling a single tuple
from the hash join's inner relation; if it is non-null, then pull a
single tuple from the outer relation. If that is also non-null, then go
and build the hash table for the inner relation as usual. This isn't
easy to implement at present because nodeHash is used to hash the inner
relation, and does the whole job at once. Of course, it would be
possible to hack nodeHash to detect the first time it is called and then
return after a single tuple, so the caller would actually invoke it
twice for non-empty input -- but that seems a bit ugly, so I'm wondering
if there is any value to maintaining the hash vs. hash join distinction
in the first place.)

-Neil

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christopher Kings-Lynne 2005-03-31 03:57:29 TSearch2 performance issue?
Previous Message Jaime Casanova 2005-03-31 03:06:00 Re: when using a bound cursor, error found..