LRU

Lists: pgsql-patches
From: Neil Conway <neilc(at)samurai(dot)com>
To: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: LRU
Date: 2005-01-24 06:21:41
Message-ID: 1106547701.22946.317.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

This patch replaces ARC with LRU in current sources. It is intended as a
short-term measure to replace ARC; in the longer run, we should probably
look at implementing a more effective replacement policy that is not
patent encumbered.

Notes:

- WIP; it has survived a couple hundred thousand pgbench transactions
and repeated regression tests, but I still want to do another code
review or two.

- It doesn't implement sequential scan activity hints, so it probably
suffers from sequential flooding. This may be something we'll want to
fix in the 8.0 timeframe, I'm not sure.

- The LRU list is maintained via links in the BufferDesc structure, as
was the case in 7.4 and earlier. In general the patch ought to be pretty
similar to the 7.4 code, although I've taken the opportunity to (IMHO)
simplify some things (the freelist is no longer circular, for instance).

- One major behavioral change effects the bgwriter. In order to return
pages in LRU order, we can _only_ scan the free list (since we don't
bother to maintain a list of pinned buffers sorted by the recency of
last access). I think this is actually a good idea: I'm not sure why the
bgwriter should be writing out pinned buffers anyway, since it increases
the likelihood that it will interfere with normal system activity, and a
pinned buffer is pretty likely to be re-dirtied later on. So the
bgwriter has been changed to only examine unpinned buffers.

As a result of that, there is no easy way to say "flush at most n% of
the dirty buffers", because the bgwriter no longer considers all the
dirty buffers. So I've removed bgwriter_percent for now. Advice on the
best way to resolve this situation would be welcome -- should we keep
bgwriter_percent in one form or another, redefine bgwriter_maxpages
according to the December discussion, or ...?

- This patch has a lot of assertions -- I'll disable the more expensive
and unnecessary ones once the patch is closer to being finished.

- Added BufferGetBufferDescriptor(); I'm not sure if it is notationally
cleaner to use this rather than manually looking up entries in the
BufferDescriptor array. Also, at present BufferGetBufferDescriptor()
doesn't handle local buffers; to make it do that, we'd need to
doubly-evaluate the macro argument. Comments?

-Neil

Attachment Content-Type Size
bufmgr_lru-7.patch text/x-patch 78.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: LRU
Date: 2005-01-24 22:39:18
Message-ID: 25723.1106606358@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Neil Conway <neilc(at)samurai(dot)com> writes:
> This patch replaces ARC with LRU in current sources.

I'm more than slightly uncomfortable with the size of this patch.
I realize that what you were trying to do was revert the code to its
7.4 state, but is that the long-run direction we want to pursue?
I think it would be better and safer to try to localize the changes
into freelist.c.

regards, tom lane


From: Neil Conway <neilc(at)samurai(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: LRU
Date: 2005-01-24 23:14:16
Message-ID: 1106608456.1780.15.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Mon, 2005-01-24 at 17:39 -0500, Tom Lane wrote:
> I'm more than slightly uncomfortable with the size of this patch.
[...]
> I think it would be better and safer to try to localize the changes
> into freelist.c.

IMHO that is basically what the patch does; the changes to other files
are either cosmetic or are required to change the replacement strategy.
More specifically, diffstat shows:

doc/src/sgml/runtime.sgml | 48 -
src/backend/catalog/index.c | 2
src/backend/commands/dbcommands.c | 4
src/backend/commands/vacuum.c | 27
src/backend/postmaster/bgwriter.c | 3
src/backend/storage/buffer/buf_init.c | 55 !
src/backend/storage/buffer/buf_table.c | 67 !
src/backend/storage/buffer/bufmgr.c | 177 -!!
src/backend/storage/buffer/freelist.c |
1031 !!!!!!!!!!!!!!!!!!!!!!!!!
src/backend/utils/misc/guc.c | 19
src/backend/utils/misc/postgresql.conf.sample | 1
src/include/postmaster/bgwriter.h | 1
src/include/storage/buf_internals.h | 85 !!
src/include/storage/bufmgr.h | 2
14 files changed, 56 insertions(+), 139 deletions(-), 1327
modifications(!)

A summary of the changes made to each file:

runtime.sgml: doc updates
index.c: trivial API change (BufferSync)
dbcommands.c: trivial API change (BufferSync)
vacuum.c: trivial API change (StrategyVacuumHint)
bgwriter.c: trivial API change (BufferSync), remove bgwriter_percent
buf_init.c: changes required for LRU
buf_table.c: changes required for LRU (or more properly, reverting some
modifications to the buf_table stuff that was part of the ARC patch: the
patch should be very close to the buf_table in 7.4)
bufmgr.c: move PinBuffer() and UnpinBuffer() to freelist.c, replace
"StrategyXXX" calls with similar calls to BufTableXXX and so on. Some
BufferSync API changes. Most of these changes are pretty
straightforward.
freelist.c: obvious :)
guc.c: remove bgwriter_percent and debug_shared_buffers
postgresql.conf.sample: remove bgwriter_percent
bgwriter.h: remove bgwriter_percent
buf_internals.h: changes necessary for LRU
bufmgr.h: BufferSync API change

In other words, you might be able to somewhat reduce the size of the
patch by, say, not renaming some exported functions in freelist.c, but I
don't see that that will have a significant effect upon the complexity
of the patch or the risk that it might cause regressions. If you have
specific suggestions about how to reduce the scope of the patch I'm all
ears, but I don't think the size of the patch is necessarily the most
reliable metric for the probability it will result in a regression.

-Neil


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: LRU
Date: 2005-01-24 23:43:19
Message-ID: 27011.1106610199@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Neil Conway <neilc(at)samurai(dot)com> writes:
> In other words, you might be able to somewhat reduce the size of the
> patch by, say, not renaming some exported functions in freelist.c, but I
> don't see that that will have a significant effect upon the complexity
> of the patch or the risk that it might cause regressions.

Well, the thing that's bothering me is that you are undoing a number of
changes that we'll probably just have to redo later; with consequently
*two* chances to introduce bugs. Case in point is moving the
responsibility for this:

+ /*
+ * Clear the buffer's tag. This doesn't matter for the hash table,
+ * since the buffer is already removed from it, but it ensures that
+ * sequential searches through the buffer table won't think the buffer
+ * is still valid for its old page.
+ */
+ CLEAR_BUFFERTAG(buf->tag);

out of freelist.c and putting it back in bufmgr.c. Jan actually
introduced a bug in his original ARC patch by deleting this code from
BufTableDelete and forgetting to put equivalent defenses into freelist.c.
I've got little confidence that this patch doesn't create some similar
problem, and none that we won't make the same mistake over again when
we re-revert the division of labor.

In short I'd rather try to minimize the amount of API churn that bufmgr.c
sees in this patch, because we'll probably just be changing it back later.

If the problem is that the current freelist.c API exposes too much about
the ARC implementation, the answer to that is not to go back to exposing
the LRU-list implementation; it's to generalize the API.

regards, tom lane


From: Neil Conway <neilc(at)samurai(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: LRU
Date: 2005-01-25 00:22:23
Message-ID: 1106612543.1780.46.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Mon, 2005-01-24 at 18:43 -0500, Tom Lane wrote:
> Well, the thing that's bothering me is that you are undoing a number of
> changes that we'll probably just have to redo later; with consequently
> *two* chances to introduce bugs.

I'm not sure if we should apply this patch to both HEAD and
REL8_0_STABLE, or just the latter. I would guess there is no IP reason
to remove ARC from HEAD as well; while it might introduce complications
in backporting changes from HEAD, those same complications were present
through most of the 8.0 cycle (i.e. HEAD used a different replacement
policy than REL7_4_STABLE did).

> If the problem is that the current freelist.c API exposes too much about
> the ARC implementation, the answer to that is not to go back to exposing
> the LRU-list implementation; it's to generalize the API.

That would be one approach, yeah. My goal was to keep the code simple
and reasonably similar to the 7.4 codebase (which has gotten a lot more
testing than the ARC code, in addition to the fact that the ARC APIs
aren't really appropriate for an LRU implementation). Worrying about the
right freelist.c APIs seemed a little beside the point -- I didn't want
to make _any_ assumptions about how the code is going to look in 8.1, as
it may be fundamentally different.

-Neil