Proof-of-concept ARC removal patches

Lists: pgsql-patches
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-patches(at)postgreSQL(dot)org
Subject: Proof-of-concept ARC removal patches
Date: 2005-02-04 02:41:52
Message-ID: 9253.1107484912@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Attached are two different variants of a patch to remove the ARC
cache algorithm in favor of variants of the "2Q" algorithm.

The first patch approximates the "full 2Q" algorithm described by
Johnson and Shasha, while the second approximates their "simplified 2Q"
method. Full 2Q uses a list of pages that were recently in cache but
no longer are, while simplified 2Q does not; the first patch is
therefore a smaller change to the existing code.

To show how closely the code is related, I kept the ARC names of T1, T2,
and B1 for the corresponding 2Q lists (A1, Am, A1out respectively).
We'd probably rename to use the 2Q names if we apply this, but the
patches are smaller and easier to review as they are.

The patches are not exactly the J&S algorithms, in part because I kept
the special cases for VACUUM, and in part because I really don't believe
their idea about not promoting from T1 into T2 until the page has fallen
into B1. This helped to minimize the change from the existing ARC code.

In some desultory testing with pgbench, there was no significant
performance difference among the three algorithms, which leads me to
think that simplified 2Q might be the best bet (it certainly has the
smallest memory footprint). But it would be a good idea to do some
more measurements before believing that. I hope that Mark Wong can give
these a try on his setup soon.

These should apply against CVS tip of either HEAD or REL8_0_STABLE
branch; note that "CVS tip" includes the bufmgr refactoring patch
I applied earlier today.

Comments?

regards, tom lane

Attachment Content-Type Size
unknown_filename text/plain 17.6 KB
unknown_filename text/plain 19.5 KB

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Proof-of-concept ARC removal patches
Date: 2005-02-04 04:57:11
Message-ID: KGEFLMPJFBNNLNOOOPLGGEFFCIAA.simon@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

> Tom Lane wrote
> Attached are two different variants of a patch to remove the ARC
> cache algorithm in favor of variants of the "2Q" algorithm.
>
> The first patch approximates the "full 2Q" algorithm described by
> Johnson and Shasha, while the second approximates their
> "simplified 2Q"
> method. Full 2Q uses a list of pages that were recently in cache but
> no longer are, while simplified 2Q does not; the first patch is
> therefore a smaller change to the existing code.

"Full 2Q" is the right one, I think.

> The patches are not exactly the J&S algorithms, in part because I kept
> the special cases for VACUUM, and in part because I really
> don't believe
> their idea about not promoting from T1 into T2 until the page
> has fallen
> into B1. This helped to minimize the change from the
> existing ARC code.

Agreed.

> In some desultory testing with pgbench, there was no significant
> performance difference among the three algorithms, which leads me to
> think that simplified 2Q might be the best bet (it certainly has the
> smallest memory footprint). But it would be a good idea to do some
> more measurements before believing that. I hope that Mark
> Wong can give
> these a try on his setup soon.

Will give these a whirl...

> Comments?
>

The use of 25% T1 and 75% T2 is probably the only thing to discuss, for
me.

Johnson & Sasha's results show that having a larger T1 makes the system
more responsive to changes, while a larger T2 gives a better longer term
hit rate. J&S's results show that higher T1/T2 ratios are better with
smaller caches. The CARS results show that keeping T1 above a certain
minimum size is better also.

From that, I'd suggest we choose a T1/T2 ratio that changes according to
how you start shared_buffers. If we had another GUC at all, it should be
one that isn't listed in the postgresql.conf file at all...since it
would be easy to misuse.

A default setting of something like T1 as % of total (just roughly)
= 50% when shared_buffers <= 1000
= 25% when shared_buffers = 5000
= 10% when shared_buffers = 20000
with smoothing...

Best Regards, Simon Riggs


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Proof-of-concept ARC removal patches
Date: 2005-02-04 17:57:07
Message-ID: 15713.1107539827@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

"Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:
> The use of 25% T1 and 75% T2 is probably the only thing to discuss, for
> me.

Hmm. I had been trying to avoid adding a GUC parameter for this ;-)

> A default setting of something like T1 as % of total (just roughly)
> = 50% when shared_buffers <= 1000
> = 25% when shared_buffers = 5000
> = 10% when shared_buffers = 20000
> with smoothing...

Have you got anything to back up the need for such an adjustment?

BTW, now that I look at this code more closely, I'm feeling dissatisfied
with Jan's custom adjustments to the ARC algorithm, particularly the
rule that it takes touches from two different transactions to get a page
into T2. I preserved that logic in these proposed patches but I'm
thinking that we want to change it, independently of the ARC/2Q issue.
I'll start another thread in pghackers about that. I bring it up now
just because I think that it has an impact on the desirable size of T1.
It might be unwise to put too much emphasis on comparative benchmarks
taken before we change that rule.

regards, tom lane