Re: Scaling shared buffer eviction

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Scaling shared buffer eviction
Date: 2014-06-09 04:03:11
Message-ID: CAA4eK1Kz6c6ULQN7D03cQDU_E4OO8gcUaNBKnFtKF9pPZsN60A@mail.gmail.com

On Sun, Jun 8, 2014 at 7:21 PM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> > I have improved the patch by making the following changes:
> > a. Improved the bgwriter logic to log xl_running_xacts info and
> > removed the hibernate logic, as the bgwriter will now work only
> > when there is a scarcity of buffers in the free list. The basic
> > idea is that when the number of buffers on the freelist drops below
> > the low threshold, the allocating backend sets the latch and the
> > bgwriter wakes up and begins adding buffers to the freelist until
> > it reaches the high threshold, and then goes back to sleep.
>
> The numbers from your benchmarks are very exciting, but the above
> concerns me. My tuning of the bgwriter in production has generally
> *not* been aimed at keeping pages on the freelist, but toward
> preventing shared_buffers from accumulating a lot of dirty pages,
> which were leading to cascades of writes between caches and thus to
> write stalls. By pushing dirty pages into the (*much* larger) OS
> cache, and letting write combining happen there, where the OS could
> pace based on the total number of dirty pages instead of having
> some of them hidden and then appearing rather suddenly, latency
> spikes were avoided without any noticeable increase in the number
> of OS writes to the RAID controller's cache.
>
> Essentially I was able to tune the bgwriter so that a dirty page
> was always pushed out to the OS cache within three seconds, which
> led to a healthy balance of writes between the checkpoint process
> and the bgwriter.

I think it would be better if the bgwriter did its writes based on
the number of buffers that get dirtied, so as to achieve that balance
of writes; a rough sketch of the idea is below.
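
To make that concrete, here is a minimal, hypothetical sketch of such
a policy, not code from the patch. It assumes a shared counter of
buffers dirtied, here called BuffersDirtiedCount, and a helper
SyncOneDirtyBuffer() that writes one dirty buffer and returns false
when none remain; neither exists under these names today.

    #include <stdbool.h>

    /* Assumed symbols, not existing PostgreSQL code. */
    extern unsigned int BuffersDirtiedCount;  /* buffers dirtied so far */
    extern bool SyncOneDirtyBuffer(void);     /* writes one dirty buffer */

    static unsigned int prev_dirtied = 0;

    static void
    BgWriterDirtyRateCycle(void)
    {
        unsigned int cur_dirtied = BuffersDirtiedCount;
        unsigned int to_write = cur_dirtied - prev_dirtied;

        prev_dirtied = cur_dirtied;

        /*
         * Write roughly as many dirty buffers as were dirtied since
         * the last cycle, so the write rate tracks the dirtying rate
         * and a dirty page reaches the OS cache within a bounded
         * number of cycles.
         */
        while (to_write > 0)
        {
            if (!SyncOneDirtyBuffer())
                break;          /* no dirty buffers left to write */
            to_write--;
        }
    }

For comparison, the existing BgBufferSync() paces itself on a moving
average of recent buffer allocations rather than on the dirtying rate.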

> Backend processes related to user connections still
> performed about 30% of the writes, and this work shows promise
> toward bringing that down, which would be great; but please don't
> eliminate the ability to prevent write stalls in the process.

I agree that for some cases, such as the one you explain, the current
bgwriter logic does satisfy the need; however, there are other cases
where it doesn't help much. One such case I am trying to improve
(easing backend buffer allocations); another may be when there is
constant write activity, for which I am not sure how much the current
logic really helps. Part of the reason for trying to make the bgwriter
respond mainly to ease backend allocations is the previous discussion
on this topic; refer to the link below:
http://www.postgresql.org/message-id/CA+TgmoZ7dvhC4h-ffJmZCff6VWyNfOEAPZ021VxW61uH46R3QA@mail.gmail.com

However, if we want to retain the current property of the bgwriter,
we can do so in one of the following ways:
a. Have separate processes for writing dirty buffers and for moving
buffers to the freelist.
b. In the current bgwriter, separate the two kinds of work based on
the need. The need can be decided based on whether the bgwriter has
been woken due to a shortage of buffers on the free list or due to
BgWriterDelay expiring (see the sketch below).
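
To make option (b) concrete, below is a rough, hypothetical fragment
for the bgwriter main loop, not a finished design. WaitLatch(),
ResetLatch(), BgBufferSync(), BgWriterDelay, and the WL_* flags are
the existing mechanisms; FreelistSize(), MoveOneBufferToFreelist(),
and FREELIST_HIGH_WATERMARK are made-up names standing in for
whatever the patch actually implements.

    for (;;)
    {
        int     rc;

        /*
         * Sleep until a backend signals a freelist shortage or the
         * normal BgWriterDelay timeout expires.
         */
        rc = WaitLatch(&MyProc->procLatch,
                       WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
                       BgWriterDelay /* milliseconds */);

        if (rc & WL_POSTMASTER_DEATH)
            exit(1);

        ResetLatch(&MyProc->procLatch);

        if (rc & WL_LATCH_SET)
        {
            /*
             * Woken for a freelist shortage: refill the freelist up
             * to the high watermark before sleeping again.
             */
            while (FreelistSize() < FREELIST_HIGH_WATERMARK)
                MoveOneBufferToFreelist();
        }
        else
        {
            /*
             * Timed wakeup: do the traditional LRU scan and
             * dirty-buffer writing, preserving the current
             * anti-write-stall behavior.
             */
            BgBufferSync();
        }
    }

A real version would also need to handle both conditions arriving in
the same wakeup, and decide whether freelist refilling should be
capped per cycle.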

Now, as populating the freelist and balancing writes by writing dirty
buffers are two separate responsibilities, I am not sure whether
handling both in one process is a good idea.

I am planning to take some more performance data, part of which will
include write load as well, but I am not sure whether it can show the
need you describe.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
