Re: fallocate / posix_fallocate for new WAL file creation (etc...)

From: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Date: 2013-05-25 18:55:09
Message-ID: CAKuK5J3UPsP64G1f3GSiQyx7W2229yffjPZ4_ctGqvLh2khL+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 16, 2013 at 7:05 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> On 5/16/13 9:16 AM, Jon Nelson wrote:
>>
>> Am I doing this the right way? Should I be posting the full patch each
>> time, or incremental patches?
>
>
> There are guidelines for getting your patch in the right format at
> https://wiki.postgresql.org/wiki/Working_with_Git#Context_diffs_with_Git
> that would improve this one. You have some formatting issues with tab
> spacing at lines 120 through 133 in your v3 patch. And it looks like there
> was a formatting change on line 146 that is making the diff larger than it
> needs to be.

I've corrected the formatting change (end-of-line whitespace was
stripped) on line 146.
The other whitespace changes are - I think - due to newly-indented
code due to a new code block.
Included please find a v4 patch which uses context diffs per the above url.

> The biggest thing missing from this submission is information about what
> performance testing you did. Ideally performance patches are submitted with
> enough information for a reviewer to duplicate the same test the author did,
> as well as hard before/after performance numbers from your test system. It
> often turns tricky to duplicate a performance gain, and being able to run
> the same test used for initial development eliminates a lot of the problems.

This has been a bit of a struggle. While it's true that WAL file
creation doesn't happen with great frequency, and while it's also true
that - with strace and other tests - it can be proven that
fallocate(16MB) is much quicker than writing it zeroes by hand,
proving that in the larger context of a running install has been
challenging.

Attached you'll find a small test script (t.sh) which creates a new
cluster in 'foo', changes some config values, starts the cluster, and
then times how long it takes pgbench to prepare a database. I've used
"wal_level = hot_standby" in the hopes that this generates the largest
number of WAL files (and I set the number of such files to 1024). The
hardware is an AMD 9150e with a 2-disk software RAID1 (SATA disks) on
kernel 3.9.2 and ext4 (x86_64, openSUSE 12.3). The test results are
not that surprising. The longer the test (the larger the scale factor)
the less of a difference using posix_fallocate makes. With a scale
factor of 100, I see an average of 10-11% reduction in the time taken
to initialize the database. With 300, it's about 5.5% and with 900,
it's between 0 and 1.2%. I will be doing more testing but this is what
I started with. I'm very open to suggestions.

> Second bit of nitpicking. There are already some GUC values that appear or
> disappear based on compile time options. They're all debugging related
> things though. I would prefer not to see this one go away when it's
> implementation isn't available. That's going to break any scripts that SHOW
> the setting to see if it's turned on or not as a first problem. I think the
> right model to follow here is the IFDEF setup used for
> effective_io_concurrency. I wouldn't worry about this too much though.
> Having a wal_use_fallocate GUC is good for testing. But if it works out
> well, when it's ready for commit I don't see why anyone would want it turned
> off on platforms where it works. There are already too many performance
> tweaking GUCs. Something has to be very likely to be changed from the
> default before its worth adding one for it.

Ack. I've revised the patch to always have the GUC (for now), default
to false, and if configure can't find posix_fallocate (or the user
disables it by way of pg_config_manual.h) then it remains a GUC that
simply can't be changed.

I'll also be re-running the tests.

--
Jon

Attachment Content-Type Size
image/png 7.6 KB
fallocate-v4.patch application/octet-stream 7.2 KB
t.sh application/x-sh 952 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2013-05-25 20:44:35 Re: Planning incompatibilities for Postgres 10.0
Previous Message Simon Riggs 2013-05-25 17:57:23 Re: Planning incompatibilities for Postgres 10.0