Re: Bloom Filter lookup for hash joins

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bloom Filter lookup for hash joins
Date: 2013-06-26 08:39:24
Message-ID: CAFj8pRBscxO58me4zWXd=ZJxzk9Taf+w+L_0-Ckm1ZcSRmGnEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

it looks interesting - please, try to send some performance results,

Regards

Pavel

2013/6/26 Atri Sharma <atri(dot)jiit(at)gmail(dot)com>:
> Hi All,
>
> I have been researching bloom filters and discussed it on IRC with
> RhodiumToad and David Fetter, and they pointed me to the various
> places that could potentially have bloom filters, apart from the
> places that already have them currently.
>
> I have been reading the current implementation of hash joins, and in
> ExecScanHashBucket, which I understand is the actual lookup function,
> we could potentially look at a bloom filter per bucket. Instead of
> actually looking up each hash value for the outer relation, we could
> just check the corresponding bloom filter for that bucket, and if we
> get a positive, then lookup the actual values i.e. continue with our
> current behaviour (since we could be looking at a false positive).
>
> This doesn't involve a lot of new logic. We need to add a bloom filter
> in HashJoinState and set it when the hash table is being made. Also,
> one thing we need to look at is the set of hash functions being used
> for the bloom filters. This can be a point of further discussion.
>
> The major potential benefit we could gain is that bloom filters never
> give false negatives. So, we could save a lot of lookup where the
> corresponding bloom filter gives a negative.
>
> This approach can also be particularly useful for hash anti joins,
> where we look for negatives. Since bloom filters can easily be stored
> and processed, by straight bit operations, we could be looking at a
> lot of saving here.
>
> I am not sure if we already do something like this. Please direct me
> to the relevant parts in the code if we already do.
>
> Regards,
>
> Atri
>
> --
> Regards,
>
> Atri
> l'apprenant
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Albe Laurenz 2013-06-26 08:43:15 Re: Kudos for Reviewers -- straw poll
Previous Message Kevin Grittner 2013-06-26 08:38:33 Re: refresh materialized view concurrently