Re: fallocate / posix_fallocate for new WAL file creation (etc...)

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Greg Smith <greg(at)2ndQuadrant(dot)com>, Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Date: 2013-05-29 15:19:33
Message-ID: 51A61C85.60102@gmx.net
Lists: pgsql-hackers

On 5/29/13 10:42 AM, Andres Freund wrote:
> On 2013-05-29 10:36:07 -0400, Stephen Frost wrote:
>> I *really* hope that the Linux kernel, and other, folks are smart enough
>> to realize that they can't just re-use random blocks from an I/O device
>> without cleaning it first.
>
> FWIW, posix' description about posix_fallocate() doesn't actually say
> *anything* about reading. The guarantee it makes is:
> "If posix_fallocate() returns successfully, subsequent writes to the
> specified file data shall not fail due to the lack of free space on the
> file system storage media.".
>
> http://pubs.opengroup.org/onlinepubs/009696799/functions/posix_fallocate.html
>
> So we don't even know whether we can read. I think that means we need to
> zero the file anyway...

We could use Linux fallocate(), which does guarantee that the newly
allocated range reads back as zeroes. Or we could use posix_fallocate()
and overwrite the first few bytes, enough for a subsequent reader to
detect that it shouldn't read any further.
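
For illustration only, here is a rough sketch of the first option. It is not
taken from any patch in this thread; the helper names preallocate_zeroed()
and reads_back_zero() are made up for this example. It assumes Linux with
glibc, and it falls back to writing zeroes by hand when the file system does
not support fallocate().

```c
#define _GNU_SOURCE             /* exposes Linux fallocate() in <fcntl.h> */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: make the first len bytes of fd allocated and
 * readable as zeroes.  Returns 0 on success, -1 on failure (errno set).
 */
static int
preallocate_zeroed(int fd, off_t len)
{
    char    zeros[8192];
    off_t   done = 0;

    assert(len > 0);

    /* Mode 0: allocate the range and extend the file size if needed. */
    if (fallocate(fd, 0, 0, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP && errno != ENOSYS)
        return -1;

    /* Fallback for file systems without fallocate(): write zeroes. */
    memset(zeros, 0, sizeof(zeros));
    while (done < len)
    {
        off_t   left = len - done;
        size_t  chunk = left < (off_t) sizeof(zeros) ? (size_t) left : sizeof(zeros);
        ssize_t n = pwrite(fd, zeros, chunk, done);

        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            return -1;
        done += n;
    }
    return 0;
}

/*
 * Hypothetical check: returns 1 if the first len bytes of fd are all
 * zero, 0 if a nonzero byte is found, -1 on read error or short file.
 */
static int
reads_back_zero(int fd, off_t len)
{
    char    buf[8192];
    off_t   pos = 0;

    while (pos < len)
    {
        off_t   left = len - pos;
        size_t  want = left < (off_t) sizeof(buf) ? (size_t) left : sizeof(buf);
        ssize_t n = pread(fd, buf, want, pos);

        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            return -1;
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] != 0)
                return 0;
        pos += n;
    }
    return 1;
}
```

A caller would open the new file, call preallocate_zeroed(fd, len), and
treat a -1 return like any other file creation failure. The explicit-write
fallback is exactly the cost we are trying to avoid, so whether this is
worth it depends on how often fallocate() is actually available.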

But all of this is getting very complicated for such a marginal improvement.
