Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

From: Joshua Tolley <eggyknap(at)gmail(dot)com>
To: Bryce Cutt <pandasuit(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Lawrence, Ramon" <ramon(dot)lawrence(at)ubc(dot)ca>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
Date: 2008-11-06 03:11:19
Message-ID: 20081106031119.GA2007@polonium.part.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 05, 2008 at 04:06:11PM -0800, Bryce Cutt wrote:
> The error is causes by me Asserting against the wrong variable. I
> never noticed this as I apparently did not have assertions turned on
> on my development machine. That is fixed now and with the new patch
> version I have attached all assertions are passing with your query and
> my test queries. I added another assertion to that section of the
> code so that it is a bit more vigorous in confirming the hash table
> partition is correct. It does not change the operation of the code.
>
> There are two partition counts. One holds the maximum number of
> buckets in the hash table and the other counts the number of actual
> buckets created for hash values. I was incorrectly testing against
> the second one because that was valid before I started using a hash
> table to store the buckets.
>
> The enable_hashjoin_usestatmcvs flag was valuable for my own research
> and tests and likely useful for your review but Tom is correct that it
> can be removed in the final version.
>
> - Bryce Cutt

Thanks for the new patch; I'll take a look as soon as I can (prolly
tomorrow).

- Josh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2008-11-06 03:12:20 Re: [WIP] In-place upgrade
Previous Message KaiGai Kohei 2008-11-06 02:44:32 Re: The suppress_redundant_updates_trigger() works incorrectly