Re: tweaking NTUP_PER_BUCKET

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: tweaking NTUP_PER_BUCKET
Date: 2014-07-03 18:10:13
Message-ID: 20140703181013.GS16422@tamriel.snowman.net
Lists: pgsql-hackers

Tomas,

* Tomas Vondra (tv(at)fuzzy(dot)cz) wrote:
> However it's likely there are queries where this may not be the case,
> i.e. where rebuilding the hash table is not worth it. Let me know if you
> can construct such a query (I wasn't able to).

Thanks for working on this! I've been thinking about this for a while,
and it seems like it may be a good approach. Have you considered a Bloom
filter over the buckets? Also, I'd suggest checking the archives from
around this time last year for the test cases I was using, which showed
cases where hashing the larger table was the better choice; those same
cases may also show a regression here (or would at least be something
good to test).
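
To be clear about what I mean there -- something along these lines,
populated from the build side and consulted on the probe side before
walking a bucket chain at all (just a sketch with made-up names and
sizes, not anything from your patch):

    #include <stdbool.h>
    #include <stdint.h>

    /* Sketch only: sizes, names and hash mixing are made up. */
    #define BLOOM_NBITS (1u << 20)          /* 1M bits == 128kB, example only */

    static void
    bloom_add(uint64_t *bloom, uint32_t hashvalue)
    {
        /* derive two probe positions from the hash we already compute */
        uint32_t h1 = hashvalue % BLOOM_NBITS;
        uint32_t h2 = (hashvalue * 0x9e3779b9u) % BLOOM_NBITS;

        bloom[h1 / 64] |= (uint64_t) 1 << (h1 % 64);
        bloom[h2 / 64] |= (uint64_t) 1 << (h2 % 64);
    }

    static bool
    bloom_maybe_match(const uint64_t *bloom, uint32_t hashvalue)
    {
        uint32_t h1 = hashvalue % BLOOM_NBITS;
        uint32_t h2 = (hashvalue * 0x9e3779b9u) % BLOOM_NBITS;

        /* false means "definitely no match": skip the bucket entirely */
        return (bloom[h1 / 64] & ((uint64_t) 1 << (h1 % 64))) != 0 &&
               (bloom[h2 / 64] & ((uint64_t) 1 << (h2 % 64))) != 0;
    }

The idea being that most probe-side misses become a couple of bit tests
instead of a cache-missing walk of a bucket chain, at the cost of some
false positives and a little extra work during the build.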

Have you tried to work out what a 'worst case' regression for this
change would look like? Also, how does the planner's costing change
around this? Are we now more likely to hash the smaller table? I'd
guess 'yes', just based on the reduction in NTUP_PER_BUCKET, but did
you make any changes to account for the rehashing cost?
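
For the bucket-count side of that, the back-of-the-envelope arithmetic
I have in mind is roughly the following (the real logic is in
ExecChooseHashTableSize(); the power-of-two rounding and the exact
work_mem accounting here are my assumptions, not necessarily what your
patch does):

    #include <stdio.h>

    int main(void)
    {
        double  ntuples = 10 * 1000 * 1000.0;   /* inner-side rows, example only */
        int     ntup_per_bucket = 1;            /* 10 today, 1 with the patch */
        long    nbuckets = 1;

        /* round ntuples / NTUP_PER_BUCKET up to a power of two (assumed) */
        while (nbuckets < ntuples / ntup_per_bucket)
            nbuckets <<= 1;

        /* one pointer per bucket (assumed to count against work_mem) */
        printf("nbuckets = %ld, bucket array = %ld kB\n",
               nbuckets, nbuckets * (long) sizeof(void *) / 1024);
        return 0;
    }

By that arithmetic a 10M-row inner side needs roughly 8MB of bucket
pointers at NTUP_PER_BUCKET = 10 but around 128MB at 1, which is part
of why I'm curious how the costing was adjusted.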

Thanks,

Stephen
