Re: [HACKERS] Sync vs. fsync during checkpoint
- From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
- To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
- Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zeugswetter Andreas SB SD <ZeugswetterA(at)spardat(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Win32 port list <pgsql-hackers-win32(at)postgresql(dot)org>
- Subject: Re: [HACKERS] Sync vs. fsync during checkpoint
- Date: Mon, 09 Feb 2004 09:33:09 -0500
- Message-id: <40279A25(dot)6020600(at)Yahoo(dot)com>
Bruce Momjian wrote:
Jan Wieck wrote:
Tom Lane wrote:
> "Zeugswetter Andreas SB SD" <ZeugswetterA(at)spardat(dot)at> writes:
>> So Imho the target should be to have not much IO open for the checkpoint,
>> so the fsync is fast enough, even if serial.
>
> The best we can do is push out dirty pages with write() via the bgwriter
> and hope that the kernel will see fit to write them before checkpoint
> time arrives. I am not sure if that hope has basis in fact or if it's
> just wishful thinking. Most likely, if it does have basis in fact it's
> because there is a standard syncer daemon forcing a sync() every thirty
> seconds.
Looking at the response time charts I made to show how vacuum delay
behaves, it seems that at least on Linux there is hope that this is the
case. Those charts use just a regular 5 minute checkpoint with enough
checkpoint segments for that, and no other sync effort at all.
The system has a hard time handling a larger-scale test DB, so it is
definitely well saturated with IO. The charts are here:
http://developer.postgresql.org/~wieck/vacuum_cost/
>
> That means that instead of an I/O storm every checkpoint interval,
> we get a smaller I/O storm every 30 seconds. Not sure this is a big
> improvement. Jan already found out that issuing very frequent sync()s
> isn't a win.
In none of those charts can I see any checkpoint-caused IO storm any
more. Charts I'm currently doing for 7.4.1 show extremely clear spikes
at checkpoints. If someone is interested in those as well, I will put
them up.
So, Jan, are you basically saying that the background writer has solved
the checkpoint I/O flood problem, and we just need to deal with changing
sync to multiple fsync's at checkpoint?
ISTM that the background writer at least has the ability to lower the
impact of a checkpoint significantly enough that one might not care
about it any more. "Has the ability" means that it needs to be adjusted
to the actual DB usage. The charts I produced were not done with the
default settings, but rather after making the bgwriter a bit more
aggressive against dirty pages.
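
To make the tuning point concrete, here is a toy model, not PostgreSQL
code. It assumes a constant number of pages dirtied per tick and a
background writer that hands at most a fixed number of pages per tick
to write(); the function and parameter names are hypothetical. It only
counts pages still dirty in shared buffers when the checkpoint arrives
and says nothing about when the kernel actually puts written pages on
disk.

```python
def dirty_pages_at_checkpoint(ticks, dirtied_per_tick, bgwriter_pages_per_tick):
    """Return how many pages are still dirty when the checkpoint starts.

    Toy model (assumption): every tick the workload dirties a fixed number
    of pages, then the background writer writes out at most
    bgwriter_pages_per_tick of the currently dirty pages.
    """
    dirty = 0
    for _ in range(ticks):
        dirty += dirtied_per_tick
        dirty -= min(dirty, bgwriter_pages_per_tick)
    return dirty


if __name__ == "__main__":
    # 300 ticks stands in for a 5 minute checkpoint interval at one tick
    # per second; the page counts are made-up illustration values.
    for rate in (0, 80, 100):
        left = dirty_pages_at_checkpoint(300, 100, rate)
        print(f"bgwriter {rate:3d} pages/tick -> {left} dirty pages at checkpoint")
```

In this model the checkpoint backlog shrinks to nothing only once the
writer keeps pace with the rate at which pages are dirtied, which is
why the setting has to follow the actual DB usage.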
The whole sync() vs. fsync() discussion is in my opinion nonsense at
this point. Without the ability to limit the number of files to a
reasonable number, by employing tablespaces in the form of larger
container files, the risk of forcing excessive head movement is simply
too high.
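
For readers who want to see the two strategies side by side, the
following is a minimal sketch, not the backend's checkpoint code. It
writes pages with plain write(), then either issues one fsync() per
file, serially, or a single system-wide sync(). The file names, the
helper functions and the use of a temporary directory are assumptions
made for illustration; 8192 is PostgreSQL's default block size.

```python
import os
import tempfile

PAGE_SIZE = 8192  # PostgreSQL's default block size


def write_dirty_pages(directory, n_files, pages_per_file):
    """Write pages with plain write(); they may still sit in the kernel cache."""
    paths = []
    for i in range(n_files):
        path = os.path.join(directory, f"relation_{i}")  # hypothetical file name
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        try:
            for _ in range(pages_per_file):
                os.write(fd, b"\0" * PAGE_SIZE)
        finally:
            os.close(fd)
        paths.append(path)
    return paths


def checkpoint_with_fsync(paths):
    """Force each file to disk individually: one fsync() per file, serially."""
    flushed = 0
    for path in paths:
        fd = os.open(path, os.O_RDWR)
        try:
            os.fsync(fd)
            flushed += 1
        finally:
            os.close(fd)
    return flushed


def checkpoint_with_sync():
    """Ask the kernel to flush all dirty buffers system-wide (Unix only)."""
    if hasattr(os, "sync"):
        os.sync()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        files = write_dirty_pages(d, n_files=4, pages_per_file=2)
        print(checkpoint_with_fsync(files), "files fsynced")
        checkpoint_with_sync()
```

The number of fsync() calls grows with the number of files touched
since the last checkpoint, which is the quantity that larger container
files would keep bounded.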
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #