Re: patch: improve SLRU replacement algorithm

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch: improve SLRU replacement algorithm
Date: 2012-04-08 16:53:04
Message-ID: 9707.1333903984@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On reflection, it seems to me that the right fix here is to make
> SlruSelectLRUPage() to avoid selecting a page on which an I/O is
> already in progress.

This patch seems reasonably sane to me. It's not intuitively obvious
that we should ignore I/O-busy pages, but your tests seem to prove
that that's a better algorithm.

However, I do have a couple of quibbles with the comments. The first
para in the large block comment in SlruSelectLRUPage:

* If we find any EMPTY slot, just select that one. Else locate the
* least-recently-used slot to replace.

seems now to be quite out of touch with reality, and correcting it two
paras down doesn't really fix that. Besides which, you ought to explain
*why* it's ignoring I/O-busy pages. So perhaps merge the first and
third paras of the comment into something like

* If we find any EMPTY slot, just select that one. Else choose
* a victim page to replace. We normally take the least recently
* used valid page, but we will never take the slot containing
* latest_page_number, even if it appears least recently used.
* Slots that are already I/O busy are never selected, either:
* a read-busy slot will not be least recently used once the read
* finishes, while waiting behind someone else's write has been
* shown to be less efficient than starting another write.

Or maybe you have a better short description of why this is a good idea,
but there ought to be something here about it.

Also, as a matter of style, I think this comment ought to be inside the
"if" block not before it:

/*
* All pages (except possibly the latest one) are I/O busy. We'll have
* to wait for an I/O to complete and then retry. We choose to wait
* for the I/O on the least recently used slot, on the assumption that
* it was likely initiated first of all the I/Os in progress and may
* therefore finish first.
*/
if (best_valid_delta < 0)
{
SimpleLruWaitIO(ctl, bestinvalidslot);
continue;
}

I don't know about you, but I read a comment like this as asserting a
fact about the situation when control reaches where the comment is.
So it needs to be inside the "if". (Analogy: if it were an actual
Assert(all-pages-are-IO-busy), it would have to be inside the if, no?)

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2012-04-08 17:19:38 Re: Last gasp
Previous Message Boszormenyi Zoltan 2012-04-08 16:43:45 Re: [PATCH] lock_timeout and common SIGALRM framework