Re: fallocate / posix_fallocate for new WAL file creation (etc...)

From: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Date: 2013-05-15 21:46:33
Message-ID: CAKuK5J1wgybQs_YQms60+8NYsuA5A8sACvhSAhVtPFihjmdGhA@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 15, 2013 at 4:34 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Hi,
>
> On 2013-05-15 16:26:15 -0500, Jon Nelson wrote:
>> >> I have written up a patch to use posix_fallocate in new WAL file
>> >> creation, including configuration by way of a GUC variable, but I've
>> >> not contributed to the PostgreSQL project before. Therefore, I'm
>> >> fairly certain the patch is not formatted properly and does not
>> >> conform to the appropriate style guides. Currently, the patch is
>> >> based on 9.2, and is quite small in size - 3.6KiB.
>>
>> I have re-based and reformatted the code, and basic testing shows a
>> fairly significant reduction in WAL-file creation time.
>> I ran 'make test' and did additional local testing without issue.
>> Therefore, I am attaching the patch. I will try to add it to the
>> commitfest page.
>
> Some very quick comments, without thinking about this:

Thank you for the kind feedback.

> * needs a configure check for posix_fallocate. The current version will
> e.g. fail to compile on Windows or many other non-Linux systems. Check
> how it's done for posix_fadvise.

I will address this as soon as I am able.
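
A minimal sketch of what a guarded call could look like once such a
check exists. This is not the patch itself. HAVE_POSIX_FALLOCATE is an
assumed macro name modelled on the existing posix_fadvise check, and
allocate_file and zero_fill_file are hypothetical helpers. When the
function is missing, or reports that the file does not support the
operation, the sketch falls back to writing zeros.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Assumption: a real build would have configure define
 * HAVE_POSIX_FALLOCATE. It is set by hand for Linux here only so that
 * the sketch compiles on its own.
 */
#if defined(__linux__) && !defined(HAVE_POSIX_FALLOCATE)
#define HAVE_POSIX_FALLOCATE 1
#endif

#define ZERO_BLOCK_SIZE 8192

/* Traditional method: extend the file by writing blocks of zeros. */
int
zero_fill_file(int fd, off_t len)
{
    char    buf[ZERO_BLOCK_SIZE];
    off_t   written = 0;

    memset(buf, 0, sizeof(buf));
    if (lseek(fd, 0, SEEK_SET) < 0)
        return -1;

    while (written < len)
    {
        size_t  chunk = sizeof(buf);
        ssize_t rc;

        if ((off_t) chunk > len - written)
            chunk = (size_t) (len - written);

        rc = write(fd, buf, chunk);
        if (rc < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted, retry the same chunk */
            return -1;
        }
        if (rc == 0)
        {
            errno = ENOSPC;     /* no progress: assume out of space */
            return -1;
        }
        written += rc;
    }
    return 0;
}

/* Returns 0 on success, -1 on failure with errno set. */
int
allocate_file(int fd, off_t len)
{
    assert(len > 0);

#ifdef HAVE_POSIX_FALLOCATE
    {
        /* posix_fallocate returns the error code; it does not set errno. */
        int rc = posix_fallocate(fd, 0, len);

        if (rc == 0)
            return 0;
        if (rc != EINVAL && rc != EOPNOTSUPP)
        {
            errno = rc;
            return -1;
        }
        /* Operation not supported for this file: fall through. */
    }
#endif
    return zero_fill_file(fd, len);
}
```

Either path leaves the file at the requested length, so a caller would
not need to know which one was taken.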

> * Is WAL file creation performance actually relevant? Is the performance
> of a system running on fallocate()d WAL files any different?

In my limited testing, I noticed a drop of approx. 100ms in creation
time per WAL file. I do not have a good idea for how to really stress
WAL-file creation other than calling pg_start_backup and
pg_stop_backup over and over (with archiving enabled).

However, a file allocated with fallocate is (supposed to be) less
fragmented than one created by the traditional means.

> * According to the man page posix_fallocate doesn't set errno but rather
> returns the error code.

That's true. I originally wrote the patch using fallocate(2), which
returns -1 and sets errno. What would be appropriate here? Should I
switch on the return value and handle the six (6) or so relevant
error codes?
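
One possible alternative to switching on each code, sketched under the
assumption that the surrounding error reporting is errno-based: copy
the return value into errno and reuse the normal reporting path. The
names fallocate_with_errno and report_fallocate_failure are
hypothetical, and the sketch assumes a platform that provides
posix_fallocate.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical wrapper: converts posix_fallocate's "return the error
 * code" convention into the "-1 and errno" convention used by
 * fallocate(2) and most other system calls.
 */
int
fallocate_with_errno(int fd, off_t offset, off_t len)
{
    int rc = posix_fallocate(fd, offset, len);

    if (rc != 0)
    {
        errno = rc;
        return -1;
    }
    return 0;
}

/*
 * Hypothetical reporting helper: one message covers every error code,
 * because strerror() supplies the code-specific text.
 */
void
report_fallocate_failure(const char *path)
{
    assert(path != NULL);
    fprintf(stderr, "could not allocate space for file \"%s\": %s\n",
            path, strerror(errno));
}
```

With this shape the caller only tests for -1, and the individual error
codes need no separate branches unless one of them calls for special
handling.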

> * I wonder whether we ever want to actually disable this? Afair the libc
> contains emulation for posix_fadvise if the filesystem doesn't support
> it.

I know that glibc does, but I don't know about other libc implementations.

--
Jon
