Re: Clock sweep not caching enough B-Tree leaf pages?

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clock sweep not caching enough B-Tree leaf pages?
Date: 2014-04-16 17:26:45
Message-ID: CAGTBQpb8-ug5X=pgM8-CoX0ZdrOzkDv0UaYb+FqLKpxz4xtY6w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 16, 2014 at 4:22 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>
> I don't want to dismiss what you're saying about heating and cooling
> being unrelated, but I don't find the conclusion that not everything
> can be hot obvious. Maybe "heat" should be relative rather than
> absolute, and maybe that's actually what you meant. There is surely
> some workload where buffer access actually is perfectly uniform, and
> what do you do there? What "temperature" are those buffers?

In that case, hotness, or retention priority, should be relative to
re-population cost.

I.e., whether it's likely to still be in the OS cache, whether it's
dirty, and so on.
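
To make that concrete, here is a minimal sketch of picking a victim among
equally "warm" buffers by re-population cost. Everything in it is an
assumption for illustration: the cost constants are made up, and the
likely_in_os_cache flag is hypothetical, since the buffer manager has no
direct way of knowing what the OS still has cached.

```python
from dataclasses import dataclass

# Hypothetical relative costs, chosen only for illustration.
OS_CACHE_READ_COST = 1   # re-read probably served from the OS page cache
DISK_READ_COST = 8       # re-read probably has to hit storage
WRITE_COST = 4           # a dirty buffer must be written out before reuse


@dataclass
class Buffer:
    page: int
    dirty: bool
    likely_in_os_cache: bool  # hypothetical hint, not something PostgreSQL tracks


def repopulation_cost(buf: Buffer) -> int:
    """Estimate how expensive it is to evict buf and bring it back later."""
    cost = OS_CACHE_READ_COST if buf.likely_in_os_cache else DISK_READ_COST
    if buf.dirty:
        cost += WRITE_COST
    return cost


def pick_victim(buffers: list[Buffer]) -> Buffer:
    """Among buffers of equal access heat, evict the cheapest to bring back."""
    return min(buffers, key=repopulation_cost)
```

With uniform access, a clean page that is probably still in the OS cache
would go first, and a dirty page that would need a real disk read would go
last.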

> It occurs to me that within the prototype patch, even though
> usage_count is incremented in a vastly slower fashion (in a wall time
> sense), clock sweep doesn't take advantage of that. I should probably
> investigate having clock sweep become more aggressive in decrementing
> in response to realizing that it won't get some buffer's usage_count
> down to zero on the next revolution either. There are certainly
> problems with that, but they might be fixable. Within the patch, in
> order for it to be possible for the usage_count to be incremented in
> the interim, an average of 1.5 seconds must pass, so if clock sweep
> were to anticipate another no-set-to-zero revolution, it seems pretty
> likely that it would be exactly right, or if not then close enough,
> since it can only really fail to correct for some buffers getting
> incremented once more in the interim. Conceptually, it would be like
> multiple logical revolutions were merged into one actual one,
> sufficient to have the next revolution find a victim buffer.

Why use time at all? Why not synchronize usage bumpability to clock sweeps?

I'd use a simple per-buffer bit that the clock sweep clears and the
users set, allowing only one usage_count increase per sweep.
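
A rough sketch of what I mean, as a toy model rather than anything resembling
the real buffer manager. The class and method names are invented for the
example, and the cap of 5 on usage_count is just an assumed limit.

```python
MAX_USAGE_COUNT = 5  # assumed cap for this sketch


class Buffer:
    def __init__(self):
        self.usage_count = 0
        self.bumped = False  # set by users, cleared by the clock sweep


class ClockSweepPool:
    def __init__(self, nbuffers):
        self.buffers = [Buffer() for _ in range(nbuffers)]
        self.hand = 0

    def access(self, buf):
        """Record a use of buf; only the first use since the sweep last
        passed this buffer raises usage_count."""
        if not buf.bumped:
            buf.bumped = True
            if buf.usage_count < MAX_USAGE_COUNT:
                buf.usage_count += 1

    def find_victim(self):
        """Advance the hand until a buffer with usage_count 0 is found."""
        while True:
            buf = self.buffers[self.hand]
            self.hand = (self.hand + 1) % len(self.buffers)
            buf.bumped = False  # passing over a buffer re-arms its bump
            if buf.usage_count == 0:
                return buf
            buf.usage_count -= 1
```

However many times a buffer is touched between two passes of the hand, its
usage_count goes up by at most one, so no wall-clock timer is needed.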

Or maybe use a decreasing loop count instead of a bit. In any case,
measuring "time" in terms of clock sweeps sounds like a better
proposition.
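
One possible reading of the loop-count variant, again only as a sketch: each
bump arms a small per-buffer counter, the sweep decrements it on every pass,
and another bump is allowed only once it is back at zero. The names and the
value of BUMP_COOLDOWN_SWEEPS are assumptions; with a value of 1 this
degenerates to the single-bit scheme.

```python
MAX_USAGE_COUNT = 5        # assumed cap for this sketch
BUMP_COOLDOWN_SWEEPS = 2   # hypothetical: sweeps between allowed bumps


class Buffer:
    def __init__(self):
        self.usage_count = 0
        self.cooldown = 0  # sweeps that must still pass before the next bump


def access(buf):
    """Record a use of buf; bump usage_count only if the cooldown expired."""
    if buf.cooldown == 0:
        buf.cooldown = BUMP_COOLDOWN_SWEEPS
        buf.usage_count = min(buf.usage_count + 1, MAX_USAGE_COUNT)


def sweep_visit(buf):
    """One visit by the clock hand; returns True if buf is a victim."""
    if buf.cooldown > 0:
        buf.cooldown -= 1  # "time" advances only when the hand passes
    if buf.usage_count == 0:
        return True
    buf.usage_count -= 1
    return False
```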
