Re: bad estimation together with large work_mem generates terrible slow hash joins

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: bad estimation together with large work_mem generates terrible slow hash joins
Date: 2014-09-10 18:55:50
Message-ID: 54109EB6.8020408@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/10/2014 09:31 PM, Robert Haas wrote:
> On Wed, Sep 10, 2014 at 2:25 PM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> The dense-alloc-v5.patch looks good to me. I have committed that with minor
>> cleanup (more comments below). I have not looked at the second patch.
>
> Gah. I was in the middle of doing this. Sigh.

Sorry.

>>> * the chunks size is 32kB (instead of 16kB), and we're using 1/4
>>> threshold for 'oversized' items
>>>
>>> We need the threshold to be >=8kB, to trigger the special case
>>> within AllocSet. The 1/4 rule is consistent with ALLOC_CHUNK_FRACTION.
>>
>> Should we care about the fact that if there are only a few tuples, we will
>> nevertheless waste 32kB of memory for the chunk? I guess not, but I thought
>> I'd mention it. The smallest allowed value for work_mem is 64kB.
>
> I think we should change the threshold here to 1/8th. The worst case
> memory wastage as-is ~32k/5 > 6k.

Not sure how you arrived at that number. The worst case is that each 32k
chunk is filled up to 24k+1 bytes, i.e. 8k-1 bytes is wasted in each
chunk. That's 25% wastage. That's not too bad IMHO - the worst case
waste of AllocSet is 50%.

But if you want to twiddle it, feel free. Note that there's some
interplay with allocset code, like Tomas mentioned. If the threshold is
< 8k, palloc() will round up request sizes smaller than 8k anyway. That
wastes some memory, nothing more serious than that, but it seems like a
good idea to avoid it.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2014-09-10 18:59:52 Re: bad estimation together with large work_mem generates terrible slow hash joins
Previous Message Robert Haas 2014-09-10 18:53:07 removing volatile qualifiers from lwlock.c