Re: Bloom Filter lookup for hash joins

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Ants Aasma <ants(at)cybertec(dot)at>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Bloom Filter lookup for hash joins
Date:	2013-06-26 14:34:40
Message-ID:	6395.1372257280@sss.pgh.pa.us
Lists:	pgsql-hackers

Ants Aasma <ants(at)cybertec(dot)at> writes:
> On Wed, Jun 26, 2013 at 9:46 AM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:
>> I have been reading the current implementation of hash joins, and in
>> ExecScanHashBucket, which I understand is the actual lookup function,
>> we could potentially look at a bloom filter per bucket. Instead of
>> actually looking up each hash value for the outer relation, we could
>> just check the corresponding bloom filter for that bucket, and if we
>> get a positive, then look up the actual values, i.e. continue with our
>> current behaviour (since we could be looking at a false positive).

> The problem here is that if the hash table is in memory, doing a hash
> table lookup directly is likely to be cheaper than a bloom filter
> lookup,

Yeah. Given the plan to reduce NTUP_PER_BUCKET to 1, it's hard to see
how adding a Bloom filter phase could be anything except overhead. Even
with the current average bucket length, it doesn't sound very promising.

regards, tom lane
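
[Editorial illustration, not part of the original message.] The trade-off discussed above can be sketched with a toy model. This is not PostgreSQL code: `ToyHashTable`, `hash32`, `total_work`, and the 64-bit two-probe filter are all hypothetical names and choices made for illustration, and `ntup_per_bucket` only loosely mirrors NTUP_PER_BUCKET. The sketch counts filter checks and chain comparisons separately, so the extra work a per-bucket filter adds to every probe can be weighed against the comparisons it saves on misses.

```python
# Toy model only: hypothetical names, not the PostgreSQL hash join code.

def hash32(key: int) -> int:
    """Cheap deterministic mixer standing in for the join hash function."""
    key = (key * 2654435761) & 0xFFFFFFFF
    return key ^ (key >> 16)


class ToyHashTable:
    """Chained hash table with one small Bloom-style bitmask per bucket."""

    def __init__(self, inner_keys, ntup_per_bucket, filter_bits=64):
        nbuckets = max(1, len(inner_keys) // ntup_per_bucket)
        self.buckets = [[] for _ in range(nbuckets)]
        self.filters = [0] * nbuckets
        self.filter_bits = filter_bits
        for key in inner_keys:
            h = hash32(key)
            b = h % nbuckets
            self.buckets[b].append((h, key))
            self.filters[b] |= self._mask(h)

    def _mask(self, h):
        # Two bit positions derived from different slices of the hash value.
        bit_a = (h >> 8) % self.filter_bits
        bit_b = (h >> 20) % self.filter_bits
        return (1 << bit_a) | (1 << bit_b)

    def probe(self, key, use_filter):
        """Return (matched, filter_checks, chain_comparisons) for one outer key."""
        h = hash32(key)
        b = h % len(self.buckets)
        filter_checks = 0
        if use_filter:
            filter_checks = 1
            m = self._mask(h)
            if self.filters[b] & m != m:
                # Definite miss: the bucket chain is never walked.
                return False, filter_checks, 0
        comparisons = 0
        for stored_hash, stored_key in self.buckets[b]:
            comparisons += 1
            if stored_hash == h and stored_key == key:
                return True, filter_checks, comparisons
        return False, filter_checks, comparisons


def total_work(table, outer_keys, use_filter):
    """Sum (matches, filter_checks, chain_comparisons) over all outer keys."""
    matches = checks = comparisons = 0
    for key in outer_keys:
        matched, c, n = table.probe(key, use_filter)
        matches += int(matched)
        checks += c
        comparisons += n
    return matches, checks, comparisons


if __name__ == "__main__":
    inner = list(range(0, 2000, 2))  # 1000 inner keys (even numbers)
    outer = list(range(2000))        # half of the outer keys have a match
    for ntup in (10, 1):
        table = ToyHashTable(inner, ntup)
        print("ntup_per_bucket =", ntup,
              "plain:", total_work(table, outer, use_filter=False),
              "filtered:", total_work(table, outer, use_filter=True))
```

In this model the filter can only remove chain comparisons on misses, while it adds one check to every probe. With roughly one tuple per bucket there is at most about one comparison to save per probe, which is the situation the message describes as pure overhead. The counts are abstract work units, not timings, so they say nothing about cache behaviour.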

In response to

Re: Bloom Filter lookup for hash joins at 2013-06-26 11:08:20 from Ants Aasma
