From: | Atri Sharma <atri(dot)jiit(at)gmail(dot)com> |
---|---|
To: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
Cc: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Page replacement algorithm in buffer cache |
Date: | 2013-04-02 05:08:20 |
Message-ID: | CAOeZViegL=+r_gVJnKZPW81Az5DgtpEVwz7-FUv_swRJHg6WXw@mail.gmail.com |
Lists: | pgsql-hackers |
> I don't believe there is any reasonable argument that
> sitting and spinning while holding the BufFreelistLock is a good idea.
I completely agree. Spinning on one lock while already holding another
looks like an obvious recipe for a performance hit.
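As a hedged illustration of the "non-locking test first" idea from the quoted thread (names are made up here; this is not PostgreSQL's actual s_lock code): a plain atomic load costs no bus-locked instruction, so a caller can skip a busy-looking lock cheaply and only pay for the hardware TAS when the lock appears free.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_int locked; } spin_lock_t;   /* 0 = free, 1 = held */

/* Non-locking test first: the relaxed load is a plain read with no
 * bus lock.  Only if the lock looks free do we attempt the atomic
 * exchange (the hardware TAS).  Getting the cheap test "wrong" just
 * means we skip to the next buffer, which is harmless here. */
static bool spin_try_acquire(spin_lock_t *l)
{
    if (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
        return false;                       /* looks held: fail fast, no TAS */
    return atomic_exchange_explicit(&l->locked, 1,
                                    memory_order_acquire) == 0;
}

static void spin_release(spin_lock_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

A clock-sweep caller would use `spin_try_acquire` to pass over contended buffers instead of spinning on them.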
--
Regards,
Atri
l'apprenant
On Mon, Apr 1, 2013 at 6:58 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Sun, Mar 31, 2013 at 1:27 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>> On Friday, March 22, 2013, Ants Aasma wrote:
>>>
>>> On Fri, Mar 22, 2013 at 10:22 PM, Merlin Moncure <mmoncure(at)gmail(dot)com>
>>> wrote:
>>> > well if you do a non-locking test first you could at least avoid some
>>> > cases (and, if you get the answer wrong, so what?) by jumping to the
>>> > next buffer immediately. if the non locking test comes good, only
>>> > then do you do a hardware TAS.
>>> >
>>> > you could in fact go further and dispense with all locking in front of
>>> > usage_count, on the premise that it's only advisory and not a real
>>> > refcount. so you only then lock if/when it's time to select a
>>> > candidate buffer, and only then when you did a non locking test first.
>>> > this would of course require some amusing adjustments to various
>>> > logical checks (usage_count <= 0, heh).
>>>
>>> Moreover, if the buffer happens to miss a decrement due to a data
>>> race, there's a good chance that the buffer is heavily used and
>>> wouldn't need to be evicted soon anyway. (if you arrange it to be a
>>> read-test-inc/dec-store operation then you will never go out of
>>> bounds) However, clocksweep and usage_count maintenance is not what is
>>> causing contention because that workload is distributed. The issue is
>>> pinning and unpinning.
>>
>>
>> That is one of multiple issues. Contention on the BufFreelistLock is
>> another one. I agree that usage_count maintenance is unlikely to become a
>> bottleneck unless one or both of those is fixed first (and maybe not even
>> then)
>
> usage_count manipulation is not a bottleneck but that is irrelevant.
> It can be affected by other page contention which can lead to priority
> inversion. I don't believe there is any reasonable argument that
> sitting and spinning while holding the BufFreelistLock is a good idea.
>
> merlin
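The bounded "read-test-inc/dec-store" scheme described above can be sketched as follows. This is an illustrative sketch, not PostgreSQL's implementation: the cap and function names are invented, and relaxed C11 atomics stand in for plain loads and stores so the race stays well-defined. A concurrent update may be lost, which is acceptable because usage_count is only advisory, but since the bound is tested between the read and the store, the value can never leave range.

```c
#include <stdatomic.h>

#define USAGE_CAP 5   /* illustrative cap, mirroring BM_MAX_USAGE_COUNT */

/* Lock-free, lossy increment: read, test the bound, store.  A racing
 * update can be overwritten, but the count stays within [0, USAGE_CAP]. */
static void usage_count_bump(atomic_int *uc)
{
    int v = atomic_load_explicit(uc, memory_order_relaxed);     /* read  */
    if (v < USAGE_CAP)                                          /* test  */
        atomic_store_explicit(uc, v + 1, memory_order_relaxed); /* store */
}

/* Lock-free, lossy decrement for the clock sweep; never goes below 0. */
static void usage_count_decay(atomic_int *uc)
{
    int v = atomic_load_explicit(uc, memory_order_relaxed);
    if (v > 0)
        atomic_store_explicit(uc, v - 1, memory_order_relaxed);
}
```

A missed decrement on a busy buffer is benign, as noted above: a heavily used buffer was unlikely to be an eviction candidate anyway.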