Re: fallocate / posix_fallocate for new WAL file creation (etc...)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <greg(at)2ndQuadrant(dot)com>, Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Date: 2013-05-29 14:42:45
Message-ID: 20130529144245.GC3955@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-05-29 10:36:07 -0400, Stephen Frost wrote:
> * Peter Eisentraut (peter_e(at)gmx(dot)net) wrote:
> > On 5/28/13 11:36 AM, Greg Smith wrote:
> > > Outside of the run for performance testing, I think it would be good at
> > > this point to validate that there is really a 16MB file full of zeroes
> > > resulting from these operations. I am not really concerned that
> > > posix_fallocate might be slower in some cases; that seems unlikely. I
> > > am concerned that it might result in a file that isn't structurally the
> > > same as the 16MB of zero writes implementation used now.
> >
> > I see nothing in the posix_fallocate() man pages that says that the
> > allocated space is filled with any kind of data or zeroes. It will
> > likely be garbage data, but that should be fine for a new WAL file.
>
> I *really* hope that the Linux kernel, and other, folks are smart enough
> to realize that they can't just re-use random blocks from an I/O device
> without cleaning it first.

FWIW, POSIX's description of posix_fallocate() doesn't actually say
*anything* about reading. The only guarantee it makes is:
"If posix_fallocate() returns successfully, subsequent writes to the
specified file data shall not fail due to the lack of free space on the
file system storage media.".

http://pubs.opengroup.org/onlinepubs/009696799/functions/posix_fallocate.html

So we don't even know whether we can read. I think that means we need to
zero the file anyway...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
