Re: why postgresql define NTUP_PER_BUCKET as 10, not other numbers smaller

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: b8flowerfire <b8flowerfire(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: why postgresql define NTUP_PER_BUCKET as 10, not other numbers smaller
Date: 2014-06-09 15:09:55
Message-ID: 29155.1402326595@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> This has come up before. Basically, the problem is that if you reduce
> NTUP_PER_BUCKET, the bucket array gets larger, which might reduce the
> amount of space available for tuples to the point where the hash join
> overflows to multiple batches. That will be more expensive than
> performing the hash join with a higher NTUP_PER_BUCKET.

> But having said that, I think the current situation is pretty bad,
> too. NTUP_PER_BUCKET is basically the anticipated load factor for the
> hash table, and everything I've ever read about hash table design says
> you want that to be something like 1, or 0.25. So 10 seems really
> high. But I'm not sure exactly what to do to fix the problem, because
> there are indeed cases where we will be worse off if we just lower the
> value categorically.

Keep in mind that that standard advice is meant for all-in-memory cases,
not for cases where the alternative to running with longer hash chains
is dumping tuples out to disk and reading them back.

I'm quite prepared to believe that we should change NTUP_PER_BUCKET ...
but appealing to standard advice isn't a good basis for arguing that.
Actual performance measurements (in both batched and unbatched cases)
would be a suitable basis for proposing a change.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Martijn van Oosterhout 2014-06-09 15:10:54 Re: Supporting Windows SChannel as OpenSSL replacement
Previous Message Magnus Hagander 2014-06-09 15:03:09 Re: Supporting Windows SChannel as OpenSSL replacement