Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
Date: 2014-05-07 21:51:24
Message-ID: 20140507215124.GB4780@awork2.anarazel.de
Lists: pgsql-hackers

On 2014-05-07 16:24:53 -0500, Merlin Moncure wrote:
> On Wed, May 7, 2014 at 4:15 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2014-05-07 13:51:57 -0700, Jeff Janes wrote:
> >> On Wed, May 7, 2014 at 11:38 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >>
> >> > On 2014-05-07 13:32:41 -0500, Merlin Moncure wrote:
> >> > >
> >> > > *) raising shared buffers does not 'give more memory to postgres for
> >> > > caching' -- it can only reduce it via double paging
> >> >
> >> > That's absolutely not a necessary consequence. If pages are in s_b for a
> >> > while the OS will be perfectly happy to throw them away.
> >> >
> >>
> >> Is that an empirical observation?
> >
> > Yes.
> >
> >> I've run some simulations a couple years
> >> ago, and also wrote some instrumentation to test that theory under
> >> favorably engineered (but still plausible) conditions, and couldn't get
> >> more than a small fraction of s_b to be so tightly bound in that the kernel
> >> could forget about them. Unless of course the entire workload or close to
> >> it fits in s_b.
> >
> > I think it depends on your IO access patterns. If the whole working set
> > fits into the kernel's page cache and there's no other demand for pages
> > it will stay in. If you constantly rewrite most all your pages they'll
> > also stay in the OS cache because they'll get written out. If the churn
> > in shared_buffers is so high (because it's so small in comparison to the
> > core hot data set) that there'll be dozens if not hundreds clock sweeps
> > a second you'll also have no locality.
> > It's also *hugely* kernel version specific :(
>
> right. This is, IMNSHO, exactly the sort of language that belongs in the docs.

Well, that's just the tip of the iceberg though. Whether you can accept
a small shared_buffers to counteract double buffering is also a
hard-to-answer question... That again depends heavily on the usage
patterns. If you have high concurrency and your working set has some
locality, it's very important to have a high s_b lest you fall afoul of
the freelist lock. If you have high concurrency but 90%+ of your page
lookups *aren't* going to be in the cache, you need to be very careful
with a large s_b because the clock sweeps to lower the usagecounts can
increase the lock contention.
Then there are both memory and cache efficiency questions around both the
PrivateRefCount array and the lwlocks...
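
To make the clock-sweep point concrete, here is a toy sketch of a
usage-count clock sweep. It is not PostgreSQL's actual buffer manager:
the class, its method names, the MAX_USAGE cap and the `steps` counter
are all assumptions made up for illustration, and there is no locking
at all. It only shows that every miss has to walk the buffers,
decrementing usage counts, until it finds one at zero.

```python
# Toy clock-sweep buffer replacement; NOT PostgreSQL's implementation.
# MAX_USAGE and all class/method names are assumptions for illustration.
MAX_USAGE = 5


class ClockSweepPool:
    def __init__(self, nbuffers):
        self.pages = [None] * nbuffers   # page held by each buffer
        self.usage = [0] * nbuffers      # per-buffer usage count
        self.lookup = {}                 # page -> buffer index
        self.hand = 0                    # clock hand position
        self.steps = 0                   # buffers inspected by the sweep

    def access(self, page):
        """Return True on a hit, False on a miss (after evicting a victim)."""
        idx = self.lookup.get(page)
        if idx is not None:
            # Hit: bump the usage count, capped at MAX_USAGE.
            self.usage[idx] = min(self.usage[idx] + 1, MAX_USAGE)
            return True
        # Miss: sweep for a victim and load the page into it.
        idx = self._find_victim()
        old = self.pages[idx]
        if old is not None:
            del self.lookup[old]
        self.pages[idx] = page
        self.usage[idx] = 1
        self.lookup[page] = idx
        return False

    def _find_victim(self):
        # Advance the hand, decrementing usage counts, until one is zero.
        while True:
            idx = self.hand
            self.hand = (self.hand + 1) % len(self.pages)
            self.steps += 1
            if self.usage[idx] == 0:
                return idx
            self.usage[idx] -= 1
```

In this sketch `steps` grows with every miss, and the more buffers carry
a nonzero usage count, the further each sweep has to walk. In a real
system that walking is shared state touched by many backends at once,
which is where the contention comes from; the toy does not model that.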

In short: I think it's pretty hard to transfer this into language that's
a) agreed upon and b) understandable to someone who hasn't discovered
several of the facts for him/herself.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
