Re: Backport of fsync queue compaction

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Backport of fsync queue compaction
Date: 2012-06-21 04:57:20
Message-ID: 4FE2A9B0.9090503@2ndQuadrant.com
Lists: pgsql-hackers

I don't want to take a bunch of time away from the active CF talking
about this; I just wanted to pass along some notes:

-A back branch release just happened a few weeks ago, so I'm happy to
have this dropped until the CF is over.

-Attached is a working backport of this to 8.4, with standard git
comments in the header. I think this one will backport happily with git
cherry-pick. The example is provided mainly to prove that; it's not
intended to be a patch submission.

-It is still possible to get extremely long running sync times with the
improvement applied.

Since I had an 8.4 server manifesting this problem where I could just try
this one change, I did that. Before we had this:

> 2012-06-17 14:48:13 EDT LOG: checkpoint complete: wrote 90 buffers
> (0.1%); 0 transaction log file(s) added, 0 removed, 14 recycled;
> write=26.531 s, sync=4371.513 s, total=4461.058 s

After the compaction code was working (also backported the extra logging
here) I got this instead:

> 2012-06-20 23:10:36 EDT LOG: checkpoint complete: wrote 188 buffers
> (0.1%); 0 transaction log file(s) added, 0 removed, 7 recycled;
> write=31.975 s, sync=3064.270 s, total=3096.263 s; sync files=308,
> longest=482.200 s, average=9.948 s

So the background writer still took a long time due to starvation from
clients, but the backend-side latency impact wasn't nearly as bad. The
peak load average didn't jump into the hundreds; it only got 10 to 20
clients behind on things.
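
For anyone wanting to compare numbers like these across checkpoints, here
is a minimal Python sketch that pulls the timing fields out of the two
"checkpoint complete" messages quoted above. The regular expressions and
the parse_checkpoint name are assumptions made for illustration; they are
not part of the patch or of PostgreSQL, and they only handle the message
format shown here.

```python
import re

# Hypothetical helpers: match the "name=value s" timing fields and the
# "sync files=N" counter in a "checkpoint complete" log message.
TIMING_RE = re.compile(r"(write|sync|total|longest|average)=([0-9.]+) s")
FILES_RE = re.compile(r"sync files=(\d+)")


def parse_checkpoint(line):
    """Return a dict of the timing fields (in seconds) found in one message."""
    fields = {name: float(value) for name, value in TIMING_RE.findall(line)}
    match = FILES_RE.search(line)
    if match:
        # Only present when the extra sync logging has been backported.
        fields["sync_files"] = int(match.group(1))
    return fields


before = parse_checkpoint(
    "2012-06-17 14:48:13 EDT LOG: checkpoint complete: wrote 90 buffers "
    "(0.1%); 0 transaction log file(s) added, 0 removed, 14 recycled; "
    "write=26.531 s, sync=4371.513 s, total=4461.058 s"
)
after = parse_checkpoint(
    "2012-06-20 23:10:36 EDT LOG: checkpoint complete: wrote 188 buffers "
    "(0.1%); 0 transaction log file(s) added, 0 removed, 7 recycled; "
    "write=31.975 s, sync=3064.270 s, total=3096.263 s; sync files=308, "
    "longest=482.200 s, average=9.948 s"
)

# How much shorter the sync phase was in the second checkpoint.
sync_saved = before["sync"] - after["sync"]
print(f"sync phase shorter by {sync_saved:.3f} s "
      f"({100 * sync_saved / before['sync']:.1f}%)")

# Sanity check: average per-file sync time multiplied by the file count
# should land close to the reported sync total.
approx_sync = after["average"] * after["sync_files"]
print(f"average * files = {approx_sync:.1f} s vs sync = {after['sync']:.1f} s")
```

These are two single checkpoints under different load, so the difference
is only a rough indication, not a controlled measurement.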

Anyway, larger discussion around this and related OS tuning is a better
topic for pgsql-performance; I'll raise it there when I've sorted that
out a bit more clearly.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

Attachment Content-Type Size
0003-Try-to-avoid-running-with-a-full-fsync-request-queue.patch text/x-patch 0 bytes
