Re: Clock with Adaptive Replacement

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clock with Adaptive Replacement
Date: 2016-02-15 21:02:20
Message-ID: 56C23CDC.8060704@BlueTreble.com
Lists: pgsql-hackers

On 2/12/16 9:55 AM, Robert Haas wrote:
> I think it's important to spend time and energy figuring out exactly
> what the problems with our current algorithm are. We know in general
> terms that usage counts tend to converge to either 5 or 0 and
> therefore sometimes evict buffers both at great cost and almost

Has anyone done testing on the best cap for the usage count? IIRC 5 was
pulled out of thin air. Actually, I don't recall ever seeing a clock
sweep that supported more than a single bit, though often there are
multiple 'pools' a buffer could be in (i.e. active vs. inactive in most
Unix VMs).

If you have a reasonable number of buffers with usage counts of 0 or 1
then this probably doesn't matter too much, but if your working set is
significantly larger than shared_buffers then you're probably doing a
LOT of full sweeps just to get something decremented down to 0.
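
To illustrate the worst case, here's a toy standalone sketch (not
bufmgr.c code; the buffer array, cap, and access pattern are all made up
for illustration) of a clock sweep with a saturating usage-count cap.
With every buffer bumped up to the cap, the sweep has to make cap-many
full passes before it finds a victim:

/*
 * Toy clock sweep with a configurable usage-count cap, to experiment with
 * caps other than 5 (BM_MAX_USAGE_COUNT in the real code).  The buffer
 * array and access pattern are made up for illustration.
 */
#include <stdio.h>

#define NBUFFERS   8
#define USAGE_CAP  5

static int usage[NBUFFERS];
static int clock_hand = 0;

/* Bump the usage count on access, saturating at the cap. */
static void
buffer_touched(int buf)
{
    if (usage[buf] < USAGE_CAP)
        usage[buf]++;
}

/*
 * Sweep until we find a buffer with usage 0, decrementing as we go.
 * "visited" counts how many buffers we looked at, which blows up when
 * every count is sitting at the cap.
 */
static int
find_victim(int *visited)
{
    *visited = 0;
    for (;;)
    {
        int     buf = clock_hand;

        clock_hand = (clock_hand + 1) % NBUFFERS;
        (*visited)++;
        if (usage[buf] == 0)
            return buf;
        usage[buf]--;
    }
}

int
main(void)
{
    int     visited;
    int     victim;

    /* Working set where every buffer is hot: all counts at the cap. */
    for (int i = 0; i < NBUFFERS; i++)
        for (int j = 0; j < USAGE_CAP; j++)
            buffer_touched(i);

    victim = find_victim(&visited);
    printf("evicted buffer %d after visiting %d buffers\n", victim, visited);
    return 0;
}

With NBUFFERS = 8 and USAGE_CAP = 5 that reports 41 buffers visited for a
single eviction; dropping the cap to 1 cuts it to 9. Scale that up to a
real shared_buffers setting and the cost of a hot cache is obvious.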

> randomly. But what's a lot less clear is how much that actually hurts
> us given that we are relying on the OS cache anyway. It may be that
> we need to fix some other things before or after improving the buffer
> eviction algorithm before we actually get a performance benefit. I
> suspect, for example, that a lot of the problems with large
> shared_buffers settings have to do with the bgwriter and checkpointer
> behavior rather than with the buffer eviction algorithm; and that
> others have to do with cache duplication between PostgreSQL and the
> operating system. So, I would suggest (although of course it's up to

It would be nice if there were at least an option to instrument how long
an OS read request took, so that you could gauge how many requests were
being satisfied from the OS cache vs. actually hitting the disk.
(Obviously direct knowledge from the OS would be even better, but I don't
think those APIs exist.)
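
For what it's worth, here's roughly the kind of userland instrumentation
I'm imagining, as a sketch only: time each read() and bucket it by
latency, guessing that very fast reads were served from the OS page
cache. The 100us threshold and the 8KB read size are arbitrary
assumptions, and none of this exists in PostgreSQL today:

/*
 * Time each read() and bucket it by latency, guessing that anything under
 * ~100us was served by the OS page cache rather than the disk.  The
 * threshold is arbitrary and would need tuning per system.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define CACHE_HIT_THRESHOLD_NS  (100 * 1000)    /* 100us, pure guess */

static uint64_t
now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t) ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

int
main(int argc, char **argv)
{
    char    buf[8192];
    long    probable_cache = 0;
    long    probable_disk = 0;
    int     fd;

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    fd = open(argv[1], O_RDONLY);
    if (fd < 0)
    {
        perror("open");
        return 1;
    }

    for (;;)
    {
        uint64_t    start = now_ns();
        ssize_t     n = read(fd, buf, sizeof(buf));
        uint64_t    elapsed = now_ns() - start;

        if (n <= 0)
            break;
        if (elapsed < CACHE_HIT_THRESHOLD_NS)
            probable_cache++;
        else
            probable_disk++;
    }

    printf("reads likely from OS cache: %ld, likely from disk: %ld\n",
           probable_cache, probable_disk);
    close(fd);
    return 0;
}

Latency bucketing like that is only a heuristic, of course, which is why
real visibility from the OS would still be preferable.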

> you) that you might want to focus on experiments that will help you
> understand where the problems are before you plunge into writing code
> to fix them.

+1
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
