Re: Improvement of checkpoint IO scheduler for stable transaction responses

From: didier <did447(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improvement of checkpoint IO scheduler for stable transaction responses
Date: 2013-07-22 01:26:22
Message-ID: CAJRYxu+XT8EA+OMGq1GUXztGZSpkcX1+ZYAXNaNPXv5YyUeKOg@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jul 20, 2013 at 6:28 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:

> On 7/20/13 4:48 AM, didier wrote:
>
>> With your tests, did you try to write the hot buffers first? I.e. buffers
>> with a high refcount, either by sorting them on refcount or at least
>> sweeping the buffer list in reverse?
>>
>
> I never tried that version. After a few rounds of seeing that all changes
> I tried were just rearranging the good and bad cases, I got pretty bored
> with trying new changes in that same style.
>
>
>> By writing the buffers least likely to be recycled to the OS first, the
>> checkpoint may have less work to do at fsync time: hopefully they have
>> been written by the OS background task during the spread and are not
>> re-dirtied by other backends.
>>
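
For concreteness, a minimal sketch of the hot-buffers-first ordering asked about above, in Python for brevity. The Buffer class and its fields are hypothetical stand-ins, not PostgreSQL's actual buffer descriptors, and "refcount" is used loosely here for whatever hotness counter is meant:

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    # Hypothetical model of a shared buffer; not the real BufferDesc.
    buf_id: int
    refcount: int   # assumed hotness counter, higher means hotter
    dirty: bool


def checkpoint_write_order(buffers):
    """Return the dirty buffers, hottest first.

    sorted() is stable, so buffers with equal refcount keep their
    original relative order in the buffer list.
    """
    dirty = [b for b in buffers if b.dirty]
    return sorted(dirty, key=lambda b: -b.refcount)


buffers = [
    Buffer(0, 1, True),
    Buffer(1, 5, True),
    Buffer(2, 3, False),   # clean, so the checkpoint skips it
    Buffer(3, 5, True),
    Buffer(4, 0, True),
]
order = [b.buf_id for b in checkpoint_write_order(buffers)]
print(order)  # [1, 3, 0, 4]
```

The idea is only that hot buffers are handed to the OS early in the spread, giving the kernel the longest window to flush them before the fsync phase.
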
>
> That is the theory. In practice, write caches are now so large that there
> is almost no pressure forcing writes to happen until the fsync calls show up.
> It's easily possible to enter the checkpoint fsync phase only to discover
> there are 4GB of dirty writes ahead of you, ones that have nothing to do
> with the checkpoint's I/O.
>
> Backends are constantly pounding the write cache with new writes in
> situations with checkpoint spikes. The writes and fsync calls made by the
> checkpoint process are only a fraction of the real I/O going on. The volume
> of data being squeezed out by each fsync call is based on total writes to
> that relation since the checkpoint. That's connected to the writes to that
> relation happening during the checkpoint, but the checkpoint writes can
> easily be the minority there.
>
> It is not a coincidence that the next feature I'm working on attempts to
> quantify the total writes to each 1GB relation chunk. That's the most
> promising path forward on the checkpoint problem I've found.
>
>
> --
> Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
>
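
To make the "total writes to each 1GB relation chunk" idea concrete, here is a rough sketch in Python. It is not Greg's patch; the class and method names are hypothetical, and it assumes the default 8kB block size and 1GB segment size:

```python
from collections import Counter

BLOCK_SIZE = 8192                                  # assumed default block size
SEGMENT_SIZE = 1 << 30                             # 1GB relation segment
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE    # 131072 blocks


class SegmentWriteCounter:
    """Hypothetical per-segment write tally, reset at each checkpoint."""

    def __init__(self):
        self.counts = Counter()

    def record_write(self, relation, block_number):
        # Map the block to its 1GB segment and count the write there,
        # whether it came from a backend or from the checkpointer.
        segment = block_number // BLOCKS_PER_SEGMENT
        self.counts[(relation, segment)] += 1

    def busiest(self, n):
        # Segments with the most writes since the last reset; these are
        # the ones whose fsync is likely to flush the most data.
        return self.counts.most_common(n)

    def reset(self):
        self.counts.clear()


counter = SegmentWriteCounter()
for block in (0, 1, 131071):          # all in segment 0
    counter.record_write("pgbench_accounts", block)
counter.record_write("pgbench_accounts", 131072)   # first block of segment 1
print(counter.busiest(1))  # [(('pgbench_accounts', 0), 3)]
```

With such a tally, the checkpointer could at least estimate how much data each fsync is about to squeeze out, instead of only knowing its own writes.
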
