Re: postgresql latency & bgwriter not doing its job

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postgresql latency & bgwriter not doing its job
Date: 2014-08-27 04:04:05
Message-ID: CAA4eK1LO86acdj6z8Xg0OkLXe-dHqCVGzsfjhUyJ1tTXUFgvkw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 26, 2014 at 12:53 PM, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:
>
> Given the small flow of updates, I do not think that there should be
> any reason to get that big a write contention between WAL & checkpoint.
>
> I tried with "full_page_writes = off" for 500 seconds: same overall
> behavior, 8.5% of transactions are stuck (instead of 10%). However, in
> detail, pg_stat_bgwriter is quite different:
>
> buffers_checkpoint = 13906
> buffers_clean = 20748
> buffers_backend = 472
>
> That seems to suggest that bgwriter did some work for once, but that it
> did not change the result much in the end. This would imply that my
> suggestion to make bgwriter write more would not fix the problem alone.

I think the reason could be that in most cases the bgwriter
passes the sync responsibility to the checkpointer rather
than doing it itself, which causes an I/O storm during
checkpoint; another thing is that it will not even write a
dirty buffer unless the refcount and usage_count of the
buffer are both zero. I see some merit in your point, which
is to make the bgwriter more useful than it is in its
current form. I can see 3 top-level points to think about,
where an improvement in any of them could improve the
current situation:
a. Scanning of the buffer pool to find dirty buffers that
can be flushed.
b. Deciding on the criteria for flushing a buffer.
c. Sync of buffers.
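
As an illustration (a hypothetical monitoring sketch, not
something from this thread), the pg_stat_bgwriter counters
quoted above can be sampled periodically to see whether the
checkpointer, the bgwriter, or the backends end up doing the
writes during a run:

  # Sample pg_stat_bgwriter every 10 seconds (assumes default
  # psql connection settings); buffers_checkpoint, buffers_clean
  # and buffers_backend count pages written by the checkpointer,
  # the bgwriter and ordinary backends respectively.
  while sleep 10; do
      psql -At -c "SELECT now()::time(0), buffers_checkpoint,
                          buffers_clean, buffers_backend
                   FROM pg_stat_bgwriter"
  done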

> With "synchronous_commit = off", the situation is much improved, with
only 0.3% of transactions stuck. Not a surprise. However, I would not
> recommand that as a solution:-)

Yeah, actually that was just to test what the actual problem
is; I will also not recommend such a solution. However, it
at least gives us the indication that due to the I/O done
during checkpoint, backends are not even able to flush their
comparatively small WAL data. How about keeping the WAL
(pg_xlog) on a separate filesystem, perhaps by creating a
symlink or, if possible, with the -X option of initdb?
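
For reference, a minimal sketch of both approaches (not from
this thread; /mnt/wal_disk is a placeholder mount point on the
separate filesystem):

  # Option 1: at cluster creation time, let initdb place the
  # WAL directory on the separate filesystem directly.
  initdb -D /var/lib/pgsql/data -X /mnt/wal_disk/pg_xlog

  # Option 2: for an existing cluster, stop the server, move
  # pg_xlog to the other filesystem, and leave a symlink behind.
  pg_ctl -D /var/lib/pgsql/data stop
  mv /var/lib/pgsql/data/pg_xlog /mnt/wal_disk/pg_xlog
  ln -s /mnt/wal_disk/pg_xlog /var/lib/pgsql/data/pg_xlog
  pg_ctl -D /var/lib/pgsql/data start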

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
