Re: postgresql latency & bgwriter not doing its job

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postgresql latency & bgwriter not doing its job
Date: 2014-08-26 06:12:48
Message-ID: alpine.DEB.2.10.1408260733240.4394@sto
Lists: pgsql-hackers


Hello Josh,

> So I think that you're confusing the roles of bgwriter vs. spread
> checkpoint. What you're experiencing above is pretty common for
> nonspread checkpoints on slow storage (and RAID5 is slow for DB updates,
> no matter how fast the disks are), or for attempts to do spread
> checkpoint on filesystems which don't support it (e.g. Ext3, HFS+). In
> either case, what's happening is that the *OS* is freezing all logical
> and physical IO while it works to write out all of RAM, which makes me
> suspect you're using Ext3 or HFS+.

I'm using ext4 on Debian wheezy with PostgreSQL 9.4b2.

I agree that the OS may be able to help, but this aspect does not
currently work for me at all out of the box. The "all of RAM" is really a
few thousand 8 kB pages written randomly, a few dozen MB in total.
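
As a rough check of the sizes involved, here is a minimal sketch of the
arithmetic. It assumes the default 8 kB block size; the page counts in the
loop are illustrative values, not measurements from this run, and the
function name is made up for the example.

```python
# Assumption: default PostgreSQL block size of 8 kB.
PAGE_SIZE_BYTES = 8 * 1024


def dirty_pages_to_mib(n_pages: int, page_size: int = PAGE_SIZE_BYTES) -> float:
    """Return the total volume of n_pages pages, in MiB."""
    return n_pages * page_size / (1024 * 1024)


if __name__ == "__main__":
    # Illustrative page counts only ("a few thousand" pages).
    for n in (2000, 4000, 6000):
        print(f"{n} pages = {dirty_pages_to_mib(n):.1f} MiB")
```

With these assumptions, a few thousand dirty pages stay in the range of a
few dozen MB, far from "all of RAM".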

Also, if pg needs advanced OS tweaking to handle a small load, ISTM that
it fails at simplicity:-(

As for checkpoint spreading, raising checkpoint_completion_target to 0.9
degrades the situation (20% of transactions are more than 200 ms late
instead of 10%, bgwriter wrote less than 1 page per second, on a 500s
run). So maybe there is a bug here somewhere.
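
For reference, a minimal sketch of how such figures can be derived. It
assumes two readings of the buffers_clean counter from pg_stat_bgwriter
taken at the start and end of a run, plus a list of per-transaction
latencies in ms. The function names are hypothetical, and the 200 ms
threshold is used here as a plain latency cutoff, which is a simplification
of "late" as measured against a schedule.

```python
from typing import Sequence


def bgwriter_pages_per_second(clean_before: int, clean_after: int,
                              elapsed_s: float) -> float:
    """Average pages written by the bgwriter per second over a run.

    clean_before / clean_after are two readings of the buffers_clean
    counter (pg_stat_bgwriter), taken elapsed_s seconds apart.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (clean_after - clean_before) / elapsed_s


def late_fraction(latencies_ms: Sequence[float],
                  threshold_ms: float = 200.0) -> float:
    """Fraction of transactions whose latency exceeds threshold_ms."""
    if not latencies_ms:
        return 0.0
    late = sum(1 for latency in latencies_ms if latency > threshold_ms)
    return late / len(latencies_ms)


if __name__ == "__main__":
    # Hypothetical counter readings over a 500 s run, not measured data.
    print(bgwriter_pages_per_second(1000, 1400, 500.0))
    # Hypothetical latencies in ms.
    print(late_fraction([12.0, 250.0, 8.0, 30.0, 900.0]))
```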

> Making the bgwriter more aggressive adds a significant risk of writing
> the same pages multiple times between checkpoints, so it's not a simple fix.

Hmmm... This must be balanced with the risk of being offline. Not all
people are interested in throughput at the price of latency, so there
could be settings that help latency, even at the price of reducing
throughput (average tps). After that, it is the administrator's choice to
set pg for higher throughput or lower latency.

Note that writing some "least recently used" page multiple times does not
seem to be an issue at all for me under small/medium load, especially as
the system has nothing else to do: if you have nothing else to do, there
is no cost in writing a page, even if you may have to write it again some
time later, and it helps prevent dirty pages from accumulating. So it seems
to me that pg can help; it is not only/merely an OS issue.

--
Fabien.
