Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

From: "Joshua Tolley" <eggyknap(at)gmail(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Bryce Cutt" <pandasuit(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Lawrence, Ramon" <ramon(dot)lawrence(at)ubc(dot)ca>, pgsql-hackers(at)postgresql(dot)org, jonah(dot)harris(at)gmail(dot)com
Subject: Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
Date: 2008-11-06 23:22:16
Message-ID: e7e0a2570811061522g63a06fa8o4f02972a607840eb@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 6, 2008 at 3:52 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On Thu, 2008-11-06 at 15:33 -0700, Joshua Tolley wrote:
>
>> Stay tuned.
>
> Minor question on this patch. AFAICS there is another patch that seems
> to be aiming at exactly the same use case. Jonah's Bloom filter patch.
>
> Shouldn't we have a dust off to see which one is best? Or at least a
> discussion to test whether they overlap? Perhaps you already did that
> and I missed it because I'm not very tuned in on this thread.
>
> --
> Simon Riggs www.2ndQuadrant.com
> PostgreSQL Training, Services and Support

We haven't had that discussion AFAIK, and definitely should. First
glance suggests they could coexist peacefully, with proper coaxing. If
I understand things properly, Jonah's patch filters tuples early in
the join process, and this patch tries to ensure that hash join
batches are kept in RAM when they're most likely to be used. So
they're orthogonal in purpose, and the patches actually apply *almost*
cleanly together. Jonah, any comments? If I continue to have some time
to devote, and get through all I think I can do to review this patch,
I'll gladly look at Jonah's too, FWIW.
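
To make the "orthogonal" point concrete, here is a rough toy sketch in Python, not code from either patch. It assumes a Bloom filter built over the inner (build) side so that non-matching outer tuples can be dropped early, and a small in-memory table for the most common outer keys so those tuples never have to be spilled. Every name here (BloomFilter, toy_hash_join, num_skew_keys, and so on) is made up for illustration, the most common keys are found by simply counting the probe side rather than from planner statistics, and all regular batches are treated as spilled for simplicity.

```python
from collections import Counter, defaultdict
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter (hypothetical, not PostgreSQL code)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def toy_hash_join(build, probe, num_batches=4, num_skew_keys=1):
    """Join (key, payload) rows; return (results, filtered, spilled)."""
    # Skew idea: find the most common probe-side keys (a real planner
    # would use statistics; counting the input is a simplification).
    counts = Counter(key for key, _ in probe)
    hot_keys = {key for key, _ in counts.most_common(num_skew_keys)}

    bloom = BloomFilter()
    hot_table = defaultdict(list)  # build rows for hot keys, kept in memory
    build_batches = [defaultdict(list) for _ in range(num_batches)]
    for key, payload in build:
        bloom.add(key)
        if key in hot_keys:
            hot_table[key].append(payload)
        else:
            build_batches[hash(key) % num_batches][key].append(payload)

    results = []
    filtered = 0
    spilled_batches = [[] for _ in range(num_batches)]
    for key, payload in probe:
        # Bloom idea: drop tuples that cannot match before doing more work.
        if not bloom.might_contain(key):
            filtered += 1
            continue
        if key in hot_keys:
            # Hot keys are joined immediately, without being spilled.
            for build_payload in hot_table.get(key, ()):
                results.append((key, build_payload, payload))
        else:
            spilled_batches[hash(key) % num_batches].append((key, payload))

    spilled = sum(len(batch) for batch in spilled_batches)

    # Later passes: process each spilled batch against its build batch.
    for batch_no, batch in enumerate(spilled_batches):
        table = build_batches[batch_no]
        for key, payload in batch:
            for build_payload in table.get(key, ()):
                results.append((key, build_payload, payload))

    return results, filtered, spilled
```

In this toy version the two techniques touch different decisions: the filter decides whether a probe tuple is worth keeping at all, while the hot-key table decides which of the surviving tuples can be joined without going through a batch file. That is the sense in which they look complementary rather than competing.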

- Josh
