Poor performance of btrfs with Postgresql

Lists: pgsql-general
From: Toby Corkindale <toby(dot)corkindale(at)strategicdata(dot)com(dot)au>
To: luv-main <luv-main(at)luv(dot)asn(dot)au>, pgsql-general(at)postgresql(dot)org
Subject: Poor performance of btrfs with Postgresql
Date: 2011-04-21 06:22:24
Message-ID: 4DAFCD20.4020608@strategicdata.com.au

I've done some testing of PostgreSQL on different filesystems, and with
different filesystem mount options.

I found that xfs and ext4 performed similarly, with ext4 just a few
percent faster, and that adjusting the mount options gave only small
improvements, except for the barrier options (which come with a hefty
warning).

I also tested btrfs, and was disappointed to see it performed
*dreadfully* - even with the recommended options for database loads.

The best result I could get out of ext4 on the test machine was 2392 TPS,
but btrfs gave me just 69! This is appalling performance. (And that was
with nodatacow and noatime set.)

I'm curious to know whether anyone can spot anything wrong with my testing.
I note that the speed improvement from datacow to nodatacow was only
small - can I be sure it was taking effect? (cat /proc/mounts reported
that it had.)
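One way to double-check which options a mount is actually using is to
parse /proc/mounts programmatically rather than reading it by eye. The
Python sketch below is only an illustration: the function name
mount_options, the device name and the mount point /mnt/pgtest are
hypothetical and were not part of the original test. It reports the same
information as cat /proc/mounts, so it confirms what the kernel lists for
the mount, nothing more.

```python
def mount_options(mounts_text, mount_point):
    """Return the set of options listed for mount_point, or None if absent.

    mounts_text is the content of /proc/mounts: one mount per line, with
    whitespace-separated fields (device, mount point, fs type, options, ...).
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # A later entry for the same mount point overrides an earlier one.
            found = set(fields[3].split(","))
    return found


# Hypothetical sample line; on a real system read open("/proc/mounts").read().
SAMPLE = "/dev/mapper/vg-pgtest /mnt/pgtest btrfs rw,noatime,nodatacow 0 0\n"

opts = mount_options(SAMPLE, "/mnt/pgtest")
print("nodatacow" in opts, "noatime" in opts)
```

On a live machine, pass the real contents of /proc/mounts and the mount
point that holds the PostgreSQL data directory.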

The details of how I was running the test, and all the results, are here:
http://blog.dryft.net/2011/04/effects-of-filesystems-and-mount.html

I wouldn't run btrfs in production systems at the moment anyway, but I
am curious about the current performance.
(Tested on Ubuntu Server - Maverick - Kernel 2.6.35-28)

Cheers,
Toby


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Toby Corkindale <toby(dot)corkindale(at)strategicdata(dot)com(dot)au>
Cc: luv-main <luv-main(at)luv(dot)asn(dot)au>, pgsql-general(at)postgresql(dot)org
Subject: Re: Poor performance of btrfs with Postgresql
Date: 2011-04-21 07:28:10
Message-ID: BANLkTinb8-PZ3X9--nEWBqv8paT1=2m2Qg@mail.gmail.com

On Thu, Apr 21, 2011 at 2:22 AM, Toby Corkindale
<toby(dot)corkindale(at)strategicdata(dot)com(dot)au> wrote:
> I've done some testing of PostgreSQL on different filesystems, and with
> different filesystem mount options.
>
> I found that xfs and ext4 both performed similarly, with ext4 just a few
> percent faster; and I found that adjusting the mount options only gave small
> improvements, except for the barrier options. (Which come with a hefty
> warning)
>
> I also tested btrfs, and was disappointed to see it performed *dreadfully* -
> even with the recommended options for database loads.
>
> Best TPS I could get out of ext4 on the test machine was 2392 TPS, but btrfs
> gave me just 69! This is appalling performance. (And that was with nodatacow
> and noatime set)
>
> I'm curious to know if anyone can spot anything wrong with my testing?
> I note that the speed improvement from datacow to nodatacow was only small -
> can I be sure it was taking effect? (Although cat /proc/mounts reported it
> had)
>
> The details of how I was running the test, and all the results, are here:
> http://blog.dryft.net/2011/04/effects-of-filesystems-and-mount.html
>
> I wouldn't run btrfs in production systems at the moment anyway, but I am
> curious about the current performance.
> (Tested on Ubuntu Server - Maverick - Kernel 2.6.35-28)

Your nobarrier options are not interesting -- hardware sync is not
being flushed. The real numbers are in the 230 range. Not sure why
btrfs is doing so badly -- maybe try comparing on a single-disk volume
vs. RAID 0?

merlin


From: Toby Corkindale <toby(dot)corkindale(at)strategicdata(dot)com(dot)au>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: luv-main <luv-main(at)luv(dot)asn(dot)au>, pgsql-general(at)postgresql(dot)org
Subject: Re: Poor performance of btrfs with Postgresql
Date: 2011-04-21 07:58:18
Message-ID: 4DAFE39A.8090203@strategicdata.com.au

On 21/04/11 17:28, Merlin Moncure wrote:
> On Thu, Apr 21, 2011 at 2:22 AM, Toby Corkindale
> <toby(dot)corkindale(at)strategicdata(dot)com(dot)au> wrote:
>> I've done some testing of PostgreSQL on different filesystems, and with
>> different filesystem mount options.
>>
>> I found that xfs and ext4 both performed similarly, with ext4 just a few
>> percent faster; and I found that adjusting the mount options only gave small
>> improvements, except for the barrier options. (Which come with a hefty
>> warning)
>>
>> I also tested btrfs, and was disappointed to see it performed *dreadfully* -
>> even with the recommended options for database loads.
>>
>> Best TPS I could get out of ext4 on the test machine was 2392 TPS, but btrfs
>> gave me just 69! This is appalling performance. (And that was with nodatacow
>> and noatime set)
>>
>> I'm curious to know if anyone can spot anything wrong with my testing?
>> I note that the speed improvement from datacow to nodatacow was only small -
>> can I be sure it was taking effect? (Although cat /proc/mounts reported it
>> had)
>>
>> The details of how I was running the test, and all the results, are here:
>> http://blog.dryft.net/2011/04/effects-of-filesystems-and-mount.html
>>
>> I wouldn't run btrfs in production systems at the moment anyway, but I am
>> curious about the current performance.
>> (Tested on Ubuntu Server - Maverick - Kernel 2.6.35-28)
>
> your nobarrier options are not interesting -- hardware sync is not
> being flushed. the real numbers are in the 230 range. not sure why
> btrfs is doing so badly -- maybe try comparing on single disk volume
> vs raid 0?

Note that some documentation recommends disabling barriers IFF you have
battery-backed write-cache hardware, which is often true on higher-end
hardware, so the measured performance is interesting to know.

Quoted from the "mount" man page:
Write barriers enforce proper on-disk ordering of journal commits,
making volatile disk write caches safe to use, at some performance
penalty. If your disks are battery-backed in one way or
another, disabling barriers may safely improve performance.

Cheers,
Toby


From: "Henry C(dot)" <henka(at)cityweb(dot)co(dot)za>
To: "Toby Corkindale" <toby(dot)corkindale(at)strategicdata(dot)com(dot)au>
Cc: "luv-main" <luv-main(at)luv(dot)asn(dot)au>, pgsql-general(at)postgresql(dot)org
Subject: Re: Poor performance of btrfs with Postgresql
Date: 2011-04-21 10:16:04
Message-ID: d06ed907fa47ee31bb11ffe83805952c.squirrel@support.metroweb.co.za

> I've done some testing of PostgreSQL on different filesystems, and with
> different filesystem mount options.

Since Pg is already "journalling", why bother duplicating the effort (and
paying the performance penalty, whatever that penalty may be) for no real
gain (except maybe a redundant sense of safety)? I.e., use a
non-journalling, battle-tested fs like ext2.

Regards
Henry


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-general(at)postgresql(dot)org
Cc: "Henry C(dot)" <henka(at)cityweb(dot)co(dot)za>, "Toby Corkindale" <toby(dot)corkindale(at)strategicdata(dot)com(dot)au>, "luv-main" <luv-main(at)luv(dot)asn(dot)au>
Subject: Re: Poor performance of btrfs with Postgresql
Date: 2011-04-21 13:03:58
Message-ID: 201104211503.59149.andres@anarazel.de

On Thursday, April 21, 2011 12:16:04 PM Henry C. wrote:
> > I've done some testing of PostgreSQL on different filesystems, and with
> > different filesystem mount options.
>
> Since Pg is already "journalling", why bother duplicating (and pay the
> performance penalty, whatever that penalty may be) the effort for no real
> gain (except maybe a redundant sense of safety)? ie, use a
> non-journalling battle-tested fs like ext2.

Don't. The fsck on reboot will eat way too much time.

Using metadata-only journaling is ok though. In my opinion the problem with
btrfs is more the overhead of COW, but that's an impression from several kernel
versions ago, so...

Andres


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Poor performance of btrfs with Postgresql
Date: 2011-04-21 18:07:24
Message-ID: 4DB0725C.5070601@2ndQuadrant.com

On 04/21/2011 06:16 AM, Henry C. wrote:
> Since Pg is already "journalling", why bother duplicating (and pay the
> performance penalty, whatever that penalty may be) the effort for no real
> gain (except maybe a redundant sense of safety)? ie, use a
> non-journalling battle-tested fs like ext2.
>

The first time your server is down and unreachable over the network
after a crash, because fsck ran to recover, failed to complete
automatically, and now requires manual intervention before the system
will finish booting, you'll never make that mistake again. On real
database workloads, there's really minimal improvement to gain for that
risk--and sometimes actually a drop in performance--using ext2 over a
properly configured ext3. If you want to loosen the filesystem journal
requirements on a PostgreSQL-only volume, use "data=writeback" on ext3.
And I'd still expect ext4/XFS to beat any ext2/ext3 combination you can
come up with, performance-wise.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Poor performance of btrfs with Postgresql
Date: 2011-04-21 20:18:15
Message-ID: 4DB09107.2030502@2ndQuadrant.com

On 04/21/2011 02:22 AM, Toby Corkindale wrote:
> I also tested btrfs, and was disappointed to see it performed
> *dreadfully* - even with the recommended options for database loads.
>
> Best TPS I could get out of ext4 on the test machine was 2392 TPS, but
> btrfs gave me just 69! This is appalling performance. (And that was
> with nodatacow and noatime set)

I don't run database performance tests until I've tested the performance
of the system doing fsync calls, what I call its raw commit rate.
That's how fast a single committing process will be able to execute
individual database INSERT statements, for example. Whether barriers
are turned on has the biggest impact on that, and from
what you're describing it sounds like the main issue here is that you
weren't able to get btrfs+nobarrier performing as expected.
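
As a rough illustration of what a raw commit rate measures, the Python
sketch below rewrites one block and calls fsync in a loop, then reports
fsyncs per second. It is a crude stand-in under stated assumptions, not
the sysbench procedure: the name fsync_rate, the 8 kB block size and the
one-second duration are all arbitrary choices made for this example.

```python
import os
import tempfile
import time


def fsync_rate(directory, seconds=1.0, block=b"x" * 8192):
    """Rewrite one block and fsync it repeatedly; return fsyncs per second.

    The scratch file is created inside `directory`, so point that at the
    filesystem you want to measure.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        count = 0
        start = time.monotonic()
        while time.monotonic() - start < seconds:
            os.lseek(fd, 0, os.SEEK_SET)  # overwrite the same block each time
            os.write(fd, block)
            os.fsync(fd)                  # ask the OS to flush it to storage
            count += 1
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return count / elapsed


if __name__ == "__main__":
    print("%.0f fsyncs/sec" % fsync_rate(tempfile.gettempdir()))
```

Running it once against each filesystem and mount-option combination
gives a quick sanity check before starting a full pgbench run.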

If you grab
http://projects.2ndquadrant.it/sites/default/files/bottom-up-benchmarking.pdf
then page 26 will show you how to measure fsync rate directly using
sysbench. Other slides cover how to get sysbench working right; you'll
need to get a development snapshot to compile on your Ubuntu system.

General fsync issues around btrfs still seem plentiful.
Installing packages with dpkg sometimes runs into them (I haven't been
following exactly which versions of Ubuntu do and don't fsync), so there
are bug reports like
https://bugs.launchpad.net/ubuntu/+source/dpkg/+bug/570805 and
https://bugs.launchpad.net/ubuntu/+source/dpkg/+bug/607632

One interesting thing from there is an idea I'd never thought of: you
can link in an alternate system library that just ignores fsync if you
want to test turning it off above the filesystem level. Someone has
released a package to do just that, libeatmydata:
http://www.flamingspork.com/projects/libeatmydata/

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


From: "mark" <dvlhntr(at)gmail(dot)com>
To: "'Toby Corkindale'" <toby(dot)corkindale(at)strategicdata(dot)com(dot)au>, "'luv-main'" <luv-main(at)luv(dot)asn(dot)au>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Poor performance of btrfs with Postgresql
Date: 2011-04-22 02:39:05
Message-ID: 02b601cc0096$739809e0$5ac81da0$@com

> -----Original Message-----
> From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-
> owner(at)postgresql(dot)org] On Behalf Of Toby Corkindale
> Sent: Thursday, April 21, 2011 12:22 AM
> To: luv-main; pgsql-general(at)postgresql(dot)org
> Subject: [GENERAL] Poor performance of btrfs with Postgresql
>
> I've done some testing of PostgreSQL on different filesystems, and with
> different filesystem mount options.
>

{snip}

>
> I'm curious to know if anyone can spot anything wrong with my testing?


{snip}

> (Tested on Ubuntu Server - Maverick - Kernel 2.6.35-28)

Don't take this the wrong way - I applaud you asking for feedback. BTW,
have you seen Greg Smith's PG 9.0 High Performance book? It's got some
chapters dedicated to benchmarking.

Do you have a battery-backed write cache and a 'real' hardware RAID card?
Not sure why you're testing with RAID 0, but that is just me.

You also did not provide enough other details for it to be of interest to
many other people as a good data point. If you left all else at the defaults,
then you might just mention that.

Did you play with readahead?
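
For reference, on Linux the current readahead setting of a block device
is exposed in sysfs as /sys/block/<device>/queue/read_ahead_kb. The
Python sketch below reads it; the function name read_ahead_kb and the
sysfs_root parameter (there only so the function can be tried against a
scratch directory) are assumptions made for this example.

```python
import os


def read_ahead_kb(device, sysfs_root="/sys"):
    """Return the readahead setting, in KiB, of a block device such as 'sda'."""
    path = os.path.join(sysfs_root, "block", device, "queue", "read_ahead_kb")
    with open(path) as f:
        return int(f.read().strip())
```

For software RAID or LVM the device name would be something like md0 or
dm-0 rather than a physical disk; noting the value alongside benchmark
results makes them easier to compare.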

XFS mount options I have used a time or two... for some of our gear at work:

rw,noatime,nodiratime,logbufs=8,inode64,allocsize=16m

How was the RAID configured? Did you do stripe/block alignment? It might not
make a noticeable difference, but if one is serious maybe it is a good habit
to get into. I haven't done as much tuning work as I should with XFS, but a
primer can be found at:
http://oss.sgi.com/projects/xfs/training/xfs_slides_04_mkfs.pdf

Getting benchmarks with PG 9 would also be interesting because of the changes
to pgbench between 8.4 and 9.0, although at only about 230 TPS I don't know how
much of a difference you will see, since the changes only really show up when
you can sustain a much higher TPS rate.

Knowing the PG config would also be interesting, but with so few disks and
the OS, xlogs, and data all being on the same disks .... well yeah it's not a
Superdome, but it would still be worth noting on your blog for posterity's sake.

Right now I wish I had a lot of time to dig into different XFS setups on
some of our production-matching gear - but other projects have me too busy
and I am having trouble getting our QA people to loan me gear for it.

Heck I haven't tested ext4 at all to speak of - so shame on me for that.

To loosely quote someone else I saw posting to a different thread a while
back "I would walk through fire for a 10% performance gain". IMO through
proper testing and benchmarking you can make sure you are not giving up 10%
(or more) performance where you don't have to - no matter what hardware you
are running.

-Mark



From: Toby Corkindale <toby(dot)corkindale(at)strategicdata(dot)com(dot)au>
To: mark <dvlhntr(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Poor performance of btrfs with Postgresql
Date: 2011-04-27 05:49:15
Message-ID: 4DB7AE5B.7050802@strategicdata.com.au

On 22/04/11 12:39, mark wrote:
>> (Tested on Ubuntu Server - Maverick - Kernel 2.6.35-28)
>
>
> Don't take this the wrong way - I applaud you asking for feedback. BTW ->
> Have you seen Greg Smiths PG 9.0 high performance book ? it's got some
> chapters dedicated to benchmarking.

I do have the book, actually; I wasn't referring to it for these quick
tests though.

> Do you have battery backed write cache and a 'real' hardware raid card?
> Not sure why your testing with raid 0, but that is just me.

In production, yes. On a development machine, no. (Hence also the RAID 0
-- this machine doesn't need to be highly reliable, and I am more
interested in higher performance.)

> You also did not provide enough other details for it to be of interest to
> many other people as a good data point. If you left all else at the defaults
> then might just mention that.
>
> Did you play with readahead ?

No, but that's a good suggestion.
Have you? How much difference has it made?

[snip]

> How was the raid configured ? did you do stripe/block alignment ? might not
> make a noticeable difference but if one is serious maybe it is a good habit
> to get into. I haven't done as much tuning work as I should with xfs but a
> primer can be found at :
> http://oss.sgi.com/projects/xfs/training/xfs_slides_04_mkfs.pdf

Linux software RAID; stripe/blocks were aligned correctly for lvm and at
least ext4; unsure about XFS, and I've blown that away by now so can't
check. :/

> Getting benches with pg 9 would also be interested because of the changes to
> pgbench between 8.4 and 9.0, although at only about 230 tps I don't know how
> much a difference you will see, since the changes only really show up when
> you can sustain at a much higher tps rate.

Well, closer to 2400 TPS actually, including the runs with barriers
disabled.

I'll re-run the tests in May - by then Ubuntu Server 11.04 will be out,
which comes with a newer kernel that supposedly improves btrfs
performance a bit (and ext4 slightly), and I'll also use PG 9.0.

> Knowing the PG config, would also be interesting, but with so few disks and
> OS, xlogs, and data all being on the same disks .... well yeah it's not a
> superdome, but still would be worth noting on your blog for posterity sake.

Yeah; I know it's not a supercomputer setup, but I found it interesting
to note that btrfs was such a poor contender -- that was the main point
of my results. Also it's interesting to note that disabling barriers
provides such a massive increase in performance.
(But with serious caveats if you are to do so safely)

> Right now I wish I had a lot of time to dig into different XFS setups on
> some of our production matching gear - but other projects have me too busy
> and I am having trouble getting our QA people loan me gear for it.
>
> Heck I haven't tested ext4 at all to speak of - so shame on me for that.

It seems worthwhile - it consistently ran slightly faster than XFS.

> To loosely quote someone else I saw posting to a different thread a while
> back "I would walk through fire for a 10% performance gain". IMO through
> proper testing and benchmarking you can make sure you are not giving up 10%
> (or more) performance where you don't have to - no matter what hardware you
> are running.

I'm more worried about giving up 80% of my performance, as demonstrated
by using sub-optimal filesystems, or sub-optimal options to the optimal
filesystems!

Toby