Re: Readme of Buffer Management seems to have wrong sentence

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Readme of Buffer Management seems to have wrong sentence
Date: 2012-05-23 19:03:28
Message-ID: CAMkU=1zkke82SM1=T=tRLwC4AYF5U2APkd2LTYYWU3bRcryQCA@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 23, 2012 at 11:40 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
>> One thing I wanted to play with is having newly read buffers get a
>> usage count of 0 rather than 1.  The problem is that there is no way
>> to test it in enough different situations to convince people it would
>> be a general improvement.
>
> Hmm ... ISTM that that was discussed back when we instituted buffer
> usage counts, and rejected on the grounds that a newly-read buffer could
> then have negligible life expectancy.  The clock sweep might be just
> about to pass over it.

I guess that could be the case if the buffer just came off the free
list, but that situation is very rare in general use (and the sweep
hand doesn't move at all until the free list is empty). If the buffer
was allocated through the clock, then the sweep hand has just passed
over it and so is unlikely to be about to pass over it again.
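
To make that concrete, here is a minimal toy model of a clock sweep. It
is only a sketch, not the code in freelist.c: the class and method names
are made up for illustration, there is no free list, no pinning, and no
locking. It shows that a buffer picked by the sweep sits directly behind
the hand, so the hand has to examine every other buffer before it gets
back to it.

```python
class ClockPool:
    """Toy clock-sweep buffer pool (hypothetical, for illustration only)."""

    def __init__(self, num_buffers, initial_usage):
        self.usage = [0] * num_buffers      # usage count per buffer
        self.hand = 0                       # next buffer the sweep examines
        self.initial_usage = initial_usage  # count given to a newly read buffer

    def allocate(self):
        """Run the sweep until a buffer with usage count 0 is found."""
        n = len(self.usage)
        while True:
            slot = self.hand
            self.hand = (self.hand + 1) % n  # hand moves past the examined buffer
            if self.usage[slot] == 0:
                self.usage[slot] = self.initial_usage
                return slot
            self.usage[slot] -= 1            # not a victim yet; age it

    def steps_until_hand_reaches(self, slot):
        """Number of other buffers the hand examines before reaching slot."""
        return (slot - self.hand) % len(self.usage)


pool = ClockPool(num_buffers=8, initial_usage=0)
pool.usage = [1] * 8                         # pretend every buffer is in use
victim = pool.allocate()
print(victim, pool.steps_until_hand_reaches(victim))
```

In this model the distance is always num_buffers - 1 right after an
allocation, whatever the initial usage count is.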

If the clock sweep is moving so fast that it makes nearly a complete
rotation in the time it takes to read a buffer from disk, then I think
the system is probably beyond redemption. But I suppose that is
something to test for, if I can engineer a test.

> By starting at 1, it's guaranteed to have at
> least 1 sweep cycle time in which it might accumulate more hits.
>
> In other words, we have a choice of whether a buffer's initial lifetime
> is between 0 and 1 sweep times, or between 1 and 2 sweep times; and the
> discrimination against an unlucky buffer position is infinite in the
> first case versus at most 2X in the second case.

But the cost of preventing an occasional buffer from being unlucky is
that the length of the average clock sweep is almost doubled, and thus
it is also easier for hot-ish buffers to get accidentally evicted.
This last part could perhaps be ameliorated by incrementing the usage
count by 2 rather than 1 each time a buffer is hit.
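
The sweep-length claim can be checked in the simplest possible case with
a toy simulation. This is a sketch under strong assumptions: every read
is a miss (no buffer is ever hit again), the pool starts empty, and the
function name and parameters are hypothetical. It counts how many
buffers the hand examines per allocation for a given initial usage
count.

```python
def average_sweep_length(num_buffers, initial_usage, allocations):
    """Average number of buffers examined per allocation in a toy clock
    sweep where no buffer is ever hit after being read in."""
    usage = [0] * num_buffers
    hand = 0
    examined = 0
    for _ in range(allocations):
        while True:
            examined += 1
            slot = hand
            hand = (hand + 1) % num_buffers
            if usage[slot] == 0:
                usage[slot] = initial_usage  # newly read buffer
                break
            usage[slot] -= 1                 # age the buffer and keep sweeping
    return examined / allocations


for start in (0, 1):
    print(start, average_sweep_length(num_buffers=1000,
                                      initial_usage=start,
                                      allocations=1_000_000))
```

With no hits at all, starting at 0 examines one buffer per allocation
and starting at 1 approaches two. A workload with real hits would
change both numbers, and this model says nothing about which hot-ish
buffers get evicted, so it is no substitute for an actual benchmark.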

Also, if the just-read buffer does get unlucky, either it won't be
needed again soon anyway and so it deserves to be unlucky, or the next
time it is needed it will probably still be in RAM, will be read back
in quickly before the clock hand has moved much, and will have plenty
of time to accumulate new hits the next time around.

Cheers,

Jeff
