Re: initdb and fsync

From: Noah Misch <noah(at)leadboat(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: initdb and fsync
Date: 2012-02-05 22:56:57
Message-ID: 20120205225657.GA2911@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 05, 2012 at 10:53:20AM -0800, Jeff Davis wrote:
> On Sat, 2012-02-04 at 20:18 -0500, Noah Misch wrote:
> > If we add fsync calls to the initdb process, they should cover the entire data
> > directory tree. This patch syncs files that initdb.c writes, but we ought to
> > also sync files that bootstrap-mode backends had written.
>
> It doesn't make sense for initdb to take responsibility to sync files
> created by the backend. If there are important files that the backend
> creates, it should be the backend's responsibility to fsync them (and
> their parent directory, if needed). And if they are unimportant to the
> backend, then there is no reason for initdb to fsync them.

I meant primarily to illustrate the need to be comprehensive, not comment on
which executable should fsync a particular file. Bootstrap-mode backends do
not sync anything during an initdb run on my system. With your patch, we'll
fsync a small handful of files and leave nearly everything else vulnerable.

That being said, having each backend fsync its own writes will mean syncing
certain files several times within a single initdb run. If the penalty from
that proves high enough, we may do well to instead have initdb.c sync
everything just once.

> > initdb should do these syncs by default and offer an option to disable them.
>
> For test frameworks that run initdb often, that makes sense.
>
> But for developers, it doesn't make sense to spend 0.5s typing an option
> that saves you 0.3s. So, we'd need some more convenient way to choose
> the no-fsync option, like an environment variable that developers can
> set. Or maybe developers don't care about 0.3s?

Developers have shell aliases/functions/scripts and command history. I
wouldn't object to having an environment variable control it, but I would not
personally find that more convenient than a command-line switch.

Thanks,
nm

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-02-05 23:14:33 Re: Report: race conditions in WAL replay routines
Previous Message Tom Lane 2012-02-05 22:23:18 Re: Report: race conditions in WAL replay routines