Re: Scaling shared buffer eviction

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Scaling shared buffer eviction
Date: 2014-09-03 11:27:41
Message-ID: CAA4eK1+mXQCOS6DZfcXnR7itfDHeOgh3eGkD0Om=qOv-h43Utg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 3, 2014 at 1:45 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Aug 28, 2014 at 7:11 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
> > I have updated the patch to address the feedback. Main changes are:
> >
> > 1. For populating freelist, have a separate process (bgreclaimer)
> > instead of doing it by bgwriter.
> > 2. Autotune the low and high threshold values for buffers
> > in freelist. I have used the formula as suggested by you upthread.
> > 3. Cleanup of locking regimen as discussed upthread (completely
> > eliminated BufFreelist Lock).
> > 4. Improved comments and general code cleanup.
>
> +Background Reclaimer's Processing
> +---------------------------------
>
> I suggest titling this section "Background Reclaim".

I don't mind changing it, but currently used title is based on similar
title "Background Writer's Processing". It is used in previous
paragraph. Is there a reason to title this differently?

> +The background reclaimer is designed to move buffers to freelist that are
>
> I suggest replacing the first three words of this sentence with
"bgreclaimer".

Again what I have used is matching with BgWriter's explanation. I thought
it would be better if wording used is similar.

>
> + while (tmp_num_to_free > 0)
>
> I am not sure it's a good idea for this value to be fixed at loop
> start and then just decremented.

It is based on the idea what bgwriter does for num_to_scan and
calling it once has advantage that we need to take freelist_lck
just once.

> Shouldn't we loop and do the whole
> thing over once we reach the high watermark, only stopping when
> StrategySyncStartAndEnd() says num_to_free is 0?

Do you mean to say that for high water mark check, we should
always refer StrategySyncStartAndEnd() rather than getting the
value in begining?

Are you thinking that somebody else would have already put some
buffers onto freelist which would change initially identified high water
mark? I think that can be done only during very few and less used
operations. Do you think for that we start referring
StrategySyncStartAndEnd() in loop?
>
> In freelist.c, it seems like a poor idea to have two spinlocks as
> consecutive structure members; they'll be in the same cache line,
> leading to false sharing. If we merge them into a single spinlock,
> does that hurt performance?

I have kept them separate so that backends searching for a buffer
in freelist doesn't contend with bgreclaimer (while doing clock sweep)
or clock sweep being done by other backends. I think it will be bit
tricky to devise a test where this can hurt, however it doesn't seem
too bad to have two separate locks in this case.

> If we put them further apart, e.g. by
> moving the freelist_lck to the start of the structure, followed by the
> latches, and leaving victimbuf_lck where it is, does that help
> performance?

I can investigate.

> + /*
> + * If the buffer is pinned or has a nonzero usage_count,
> we cannot use
> + * it; discard it and retry. (This can only happen if
VACUUM put a
> + * valid buffer in the freelist and then someone else
> used it before
> + * we got to it. It's probably impossible altogether as
> of 8.3, but
> + * we'd better check anyway.)
> + */
> +
>
> This comment is clearly obsolete.

Okay, but this patch hasn't changed anything w.r.t above comment,
so I haven't changed it. Do you want me to remove second part of
comment starting with "(This can only happen"?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2014-09-03 11:40:49 Re: Commitfest status
Previous Message Heikki Linnakangas 2014-09-03 11:18:44 Re: Commitfest status