Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
Date: 2013-09-13 21:57:30
Message-ID: 20130913215730.GA7437@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-09-13 14:04:55 -0700, Kevin Grittner wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>
> > Absolutely not claiming the contrary. I think it sucks that we
> > couldn't fully figure out what's happening in detail. I'd love to
> > get my hands on a setup where it can be reliably reproduced.
>
> I have seen two completely different causes for symptoms like this,
> and I suspect that these aren't the only two.
>
> (1)  The dirty page avalanche: PostgreSQL hangs on to a large
> number of dirty buffers and then dumps a lot of them at once.  The
> OS does the same.  When PostgreSQL dumps its buffers to the OS it
> pushes the OS over a "tipping point" where it is writing dirty
> buffers too fast for the controller's BBU cache to absorb them.
> Everything freezes until the controller writes and accepts OS
> writes for a lot of data.  This can take several minutes, during
> which time the database seems "frozen".  Cure is some combination
> of these: reduce shared_buffers, make the background writer more
> aggressive, checkpoint more often, make the OS dirty page writing
> more aggressive, add more BBU RAM to the controller.

That should hopefully be diagnosable from other system stats like the
dirty rate.
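
A minimal sketch of how the dirty rate could be watched on Linux, assuming the `Dirty` and `Writeback` fields of /proc/meminfo are a good enough proxy. The function names and the idea of sampling these two fields are illustrative assumptions, not something proposed in this thread.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their integer values (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_summary(text):
    """Pick out the two fields that describe pending write-back (hypothetical helper)."""
    info = parse_meminfo(text)
    return {key: info.get(key) for key in ("Dirty", "Writeback")}


def read_dirty_summary(path="/proc/meminfo"):
    """Read the live values; only meaningful on a Linux system."""
    with open(path) as f:
        return dirty_summary(f.read())
```

Calling `read_dirty_summary()` once per second during a stall and logging the result would show whether `Dirty` climbs steeply just before the freeze, which is what the avalanche scenario predicts.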

> (2)  Transparent huge page support goes haywire on its defrag work.
> Clues on this include very high "system" CPU time during an
> episode, and `perf top` shows more time in kernel spinlock
> functions than anywhere else.  The database doesn't completely lock
> up like with the dirty page avalanche, but it is slow enough that
> users often describe it that way.  So far I have only seen this
> cured by disabling THP support (in spite of some people urging that
> just the defrag be disabled). 

Yes, I have seen that issue a couple of times now as well. I can confirm
that in at least two cases disabling defragmentation alone proved to be
enough to fix the issue.
Annoyingly enough, there are different ways to disable
defragmentation/THP depending on whether you're using the THP backported
by Red Hat or the upstream version...
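
As a rough sketch of that difference: the upstream kernel exposes the settings under /sys/kernel/mm/transparent_hugepage, while the Red Hat backport (RHEL 6) uses /sys/kernel/mm/redhat_transparent_hugepage. The helper names below are made up for illustration; the only thing relied on is that each sysfs file lists its options with the active one in square brackets.

```python
import os

# Candidate sysfs directories: upstream first, then the Red Hat backport.
THP_DIRS = (
    "/sys/kernel/mm/transparent_hugepage",
    "/sys/kernel/mm/redhat_transparent_hugepage",
)


def active_choice(text):
    """Return the bracketed option from e.g. 'always madvise [never]', or None."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def thp_status(dirs=THP_DIRS):
    """Report the active 'enabled' and 'defrag' settings for the first directory found."""
    for d in dirs:
        if os.path.isdir(d):
            status = {"dir": d}
            for knob in ("enabled", "defrag"):
                with open(os.path.join(d, knob)) as f:
                    status[knob] = active_choice(f.read())
            return status
    return None  # no THP support exposed on this system

# To disable only defragmentation, root would write "never" into the
# "defrag" file of whichever directory thp_status() reports.
```

The setting does not survive a reboot, so it would also have to be applied from a boot script or kernel command line.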

> It does make me wonder whether there
> is something we could do in PostgreSQL to interact better with
> THPs.

The best thing I see is to just use explicit hugepages. I previously
sent a prototype for that, which has since been turned into an actual
implementation by Christian Kruse.
A colleague of mine is working on polishing that patch into something
committable.
If you use a large s_b, the memory savings alone (some 100kB instead of
dozens of megabytes per backend) can be worth it, not to mention the
actual performance gains.
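
The per-backend savings come from page tables. A back-of-the-envelope sketch, assuming x86-64 (8-byte page table entries, 4 kB normal pages, 2 MB huge pages), a backend that has touched all of shared_buffers, and a 16 GB shared_buffers chosen purely as an example; only the lowest-level entries are counted:

```python
PTE_BYTES = 8          # size of one page table entry on x86-64 (assumption)
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def page_table_bytes(mapped_bytes, page_bytes, pte_bytes=PTE_BYTES):
    """Bytes of leaf page table entries needed to map mapped_bytes."""
    pages = -(-mapped_bytes // page_bytes)  # ceiling division
    return pages * pte_bytes


shared_buffers = 16 * GIB  # example size, not taken from the thread

small = page_table_bytes(shared_buffers, 4 * KIB)
huge = page_table_bytes(shared_buffers, 2 * MIB)

print(small // MIB, "MiB with 4 kB pages")   # prints: 32 MiB with 4 kB pages
print(huge // KIB, "KiB with 2 MB pages")    # prints: 64 KiB with 2 MB pages
```

Each backend has its own page tables for the shared mapping, so the difference is multiplied by the number of connections.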

Updating the kernel helps as well; newer versions have improved the
efficiency of defragmentation a good bit.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
