Re: postgresql latency & bgwriter not doing its job

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postgresql latency & bgwriter not doing its job
Date: 2014-08-27 17:07:13
Message-ID: 20140827170713.GN21544@awork2.anarazel.de
Lists: pgsql-hackers

On 2014-08-27 19:00:12 +0200, Fabien COELHO wrote:
>
> >off:
> >
> >$ pgbench -p 5440 -h /tmp postgres -M prepared -c 16 -j16 -T 120 -R 180 -L 200
> >number of skipped transactions: 1345 (6.246 %)
> >
> >on:
> >
> >$ pgbench -p 5440 -h /tmp postgres -M prepared -c 16 -j16 -T 120 -R 180 -L 200
> >number of skipped transactions: 1 (0.005 %)
>
> >That machine is far from idle right now, so the noise is pretty high.
>
> What is the OS and FS? Could it be XFS?

That's linux v3.17-rc2 + ext4.

> >But rather nice initial results.
>
> Indeed, I can confirm:
>
> I did 5000s 25tps tests:
> - Off: 8002 transactions lost (6.3%)
> - On: 158 transactions "lost" (0.12%).
>
> Although it is still 13 times larger than the 12 (0.01%) lost with my every
> 0.2s CHECKPOINT hack, it is nevertheless much much better than before!
>
> The bad news: under unthrottled pgbench load, the tps is divided by 2 (300
> -> 150, could have been worse), *BUT* it is also much smoother: the tps does
> not go to 0, but stays in the 50-100 range before the next spike.

Yea, I'm not surprised. With a sensible (aka larger) checkpoint_timeout
the performance penalty isn't that big, but it's there. That's why I
think (as mentioned to Heikki nearby) it needs to be combined with
sorting during the checkpoint phase.

> I'm wondering about the order of operations. It seems to me that you sync
> just after giving back a buffer.

Yep. Was just a rather quick patch...

> Maybe it would be better to pipeline it,
> that is something like:
>
> round 0:
> send buffers 0
> sleep?
>
> round N:
> sync buffers N-1
> send buffers N
> sleep?
>
> final N sync:
> sync buffer N

Yes, I think we're going to need to leave a bit more room for write
combining and such here. But I think it's going to be better to issue
flushes for several buffers together - just not after each write(). To
be really beneficial it needs sorted output though.
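
To make the ordering concrete, here is a minimal sketch of the pipelined
scheme quoted above, combined with sorting and batched flushes. It is an
illustration only, not the patch: the (file, block) buffer model, the
checkpoint_pipelined() function and the write/flush/pause callbacks are all
hypothetical stand-ins for the real I/O calls.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical model: a dirty buffer is identified by (file name, block number).
Buffer = Tuple[str, int]


def checkpoint_pipelined(
    dirty: Iterable[Buffer],
    batch_size: int,
    write: Callable[[Buffer], None],
    flush: Callable[[List[Buffer]], None],
    pause: Callable[[], None] = lambda: None,
) -> None:
    """Write sorted batches; flush batch N-1 just before writing batch N."""
    # Sort by file, then block, so each batch covers neighbouring blocks.
    ordered = sorted(dirty)
    previous: List[Buffer] = []
    for start in range(0, len(ordered), batch_size):
        batch = ordered[start:start + batch_size]
        if previous:
            flush(previous)   # round N: sync buffers N-1
        for buf in batch:
            write(buf)        # round N: send buffers N
        pause()               # optional pacing sleep
        previous = batch
    if previous:
        flush(previous)       # final sync of the last batch


# Usage: record the order of operations instead of doing real I/O.
log = []
checkpoint_pipelined(
    [("rel_b", 2), ("rel_a", 7), ("rel_a", 1), ("rel_b", 0), ("rel_a", 3)],
    batch_size=2,
    write=lambda b: log.append(("write", b)),
    flush=lambda bs: log.append(("flush", tuple(bs))),
)
for entry in log:
    print(entry)
```

Each flush covers a whole batch rather than a single write, and every batch
gets one round of delay before it is flushed, which is where the room for
write combining would come from. The batch size is an arbitrary choice here.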

> I have not found how to control the checkpoint pacing interval, if there is
> such a thing. With a 200ms lag limit on pgbench, it would be nice if it is
> less than 200ms.

Not sure what you mean.

> I found this old thread "Add basic checkpoint sync spreading" by Greg Smith
> and Simon Riggs, dating from 2010:
> http://www.postgresql.org/message-id/4CE07548.4030709@2ndquadrant.com
> https://commitfest.postgresql.org/action/patch_view?id=431 which ends up
> "returned with feedback".

I didn't really like the unapplied remainder of what was proposed in
there.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
