Re: Poor memory context performance in large hash joins

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Poor memory context performance in large hash joins
Date: 2017-02-24 18:19:02
Message-ID: CAMkU=1zbiQYRtR_qOEkfqGLhqite-JN-J92v7SyEf7P9pg94YA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 23, 2017 at 2:28 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> > The number of new chunks can be almost as large as the number of old
> > chunks, especially if there is a very popular value. The problem is
> > that every time an old chunk is freed, the code in aset.c around line
> > 968 has to walk over all the newly allocated chunks in the linked list
> > before it can find the old one being freed. This is an N^2 operation,
> > and I think it has horrible CPU cache hit rates as well.
>
> Maybe it's time to convert that to a doubly-linked list.

I don't think that would help. You would need some heuristic to guess
whether the chunk you are looking for is near the front or near the end.
And in this case, the desired chunk starts out at the front and then keeps
moving down the list with each iteration as new things are added to the
front, until near the end of ExecHashIncreaseNumBatches it is freeing them
from near the end. But in between, it is freeing them from the middle, so
I don't think a doubly-linked list would change it from N^2; it would just
lower the constant, even if you always knew which end to start the scan
from. Or am I misunderstanding what the implications of a doubly-linked
list would be?
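
To make the cost concrete, here is a rough sketch of the pattern I'm
describing (invented names, not the actual aset.c code): each free walks
the block list from the head until it finds the block being freed, so each
free costs its current distance from the head, and new blocks keep getting
pushed in front of the ones we still have to free.

    #include <stdlib.h>

    typedef struct Block
    {
        struct Block *next;     /* singly linked list of blocks */
        /* ... chunk payload would live here ... */
    } Block;

    typedef struct Context
    {
        Block      *blocks;     /* newest block is pushed at the head */
    } Context;

    static void
    free_block(Context *ctx, Block *target)
    {
        Block      *prev = NULL;
        Block      *cur = ctx->blocks;

        /* every block pushed in front of the target since it was
         * allocated makes this walk one step longer */
        while (cur != NULL && cur != target)
        {
            prev = cur;
            cur = cur->next;
        }
        if (cur == NULL)
            return;             /* not found; shouldn't happen */

        if (prev != NULL)
            prev->next = cur->next;
        else
            ctx->blocks = cur->next;
        free(cur);
    }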

What would really help here is if it remembered the next pointer of the
just-freed chunk and started the scan from that location the next time,
cycling back around to the head pointer if it doesn't find the target
before reaching the end of the list. But I don't think that is a very
general solution.
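
Something like this, say, reusing the Block struct from the sketch above
and keeping the resume point as a pointer-to-pointer so the unlink stays
simple (again just an illustration, not what aset.c actually does):

    typedef struct ContextWithCursor
    {
        Block      *blocks;     /* singly linked, newest at the head */
        Block     **cursor;     /* where the previous search left off */
    } ContextWithCursor;

    static void
    free_block_resume(ContextWithCursor *ctx, Block *target)
    {
        Block     **link = ctx->cursor ? ctx->cursor : &ctx->blocks;
        int         wrapped = 0;

        /* scan forward from the remembered position, wrapping back to
         * the head at most once if we fall off the end of the list */
        while (*link != target)
        {
            if (*link == NULL)
            {
                if (wrapped)
                    return;     /* not found; shouldn't happen */
                wrapped = 1;
                link = &ctx->blocks;
            }
            else
                link = &(*link)->next;
        }

        *link = target->next;   /* unlink the target */
        ctx->cursor = link;     /* resume here on the next free */
        free(target);
    }

In the hash-join pattern the new chunks all land in front of the
remembered position, so each search would mostly step straight to the next
chunk being freed instead of wading through everything allocated since the
last one.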

Or, you could pass a flag when creating the context telling it whether
memory will be freed mostly LIFO or mostly FIFO, and have it use a stack
or a queue accordingly.
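
Roughly what I have in mind, as a sketch only (and just one way to read
"stack or queue"): the flag would decide whether new blocks go on the
front or the back of the list, so whichever end tends to be freed first is
the cheap end to find.

    typedef enum FreeOrderHint
    {
        FREE_MOSTLY_LIFO,       /* frees tend to hit recent allocations */
        FREE_MOSTLY_FIFO        /* frees tend to hit old allocations */
    } FreeOrderHint;

    static void
    add_block(Context *ctx, Block *blk, FreeOrderHint hint)
    {
        if (hint == FREE_MOSTLY_LIFO)
        {
            /* push at the head: the newest block is found immediately */
            blk->next = ctx->blocks;
            ctx->blocks = blk;
        }
        else
        {
            /* append at the tail, leaving the oldest block at the head
             * where a FIFO-style free finds it immediately (a real
             * version would keep a tail pointer instead of walking) */
            Block     **link = &ctx->blocks;

            while (*link != NULL)
                link = &(*link)->next;
            blk->next = NULL;
            *link = blk;
        }
    }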

> Although if the
> hash code is producing a whole lot of requests that are only a bit bigger
> than the separate-block threshold, I'd say It's Doing It Wrong. It should
> learn to aggregate them into larger requests.
>

Right now it is using compiled-in 32KB chunks. Should it use something
like max(32KB, work_mem/128) instead?
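
In code that would be something along these lines (illustrative only;
work_mem is in kilobytes):

    /* 32KB floor, scaling up with work_mem */
    Size        chunk_size = Max((Size) 32 * 1024, (Size) work_mem * 1024 / 128);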

Cheers,

Jeff
