Perf regression in 2.6.32 (Ubuntu 10.04 LTS)

Lists: pgsql-hackers
From: Domas Mituzas <midom(dot)lists(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-09-12 20:31:27
Message-ID: 9EFE53ED-3F4E-4FE7-89F4-DCF47E9F4BAF@gmail.com

Hello folks,

I've been playing around a lot with sysbench today, and observed that the 2.6.32 kernel supplied by Ubuntu has a perf regression with PG (which does not affect MySQL), compared to the 2.6.28 builds I have.
What I observed can be seen in a paste at http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is 2.6.32 - 2.6.32-24-server).

Machines are two-socket quad-core Opteron 2356s.

oprofile output can be seen at http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w - system has >20% of idle cpu, which is somewhere in the top symbol :)

Domas


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Domas Mituzas <midom(dot)lists(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-09-13 09:29:00
Message-ID: 4C8DEEDC.40105@enterprisedb.com

On 12/09/10 23:31, Domas Mituzas wrote:
> I've been playing around today a lot with sysbench, and observed that 2.6.32 kernel supplied by Ubuntu is having perf regression with PG (which does not affect MySQL), compared to 2.6.28 builds I have.
> What I observed can be seen in a paste at http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is 2.6.32 - 2.6.32-24-server).
>
> Machines are two socket quad-opterons 2356s.
>
> oprofile output can be seen at http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w - system has>20% of idle cpu, which is somewhere in the top symbol :)

Can you run oprofile on the older kernel, so that we can compare and see
where the time is spent?

Looks like over 7% of the time is spent in s_lock, which suggests some
change in behavior in context switching or something like that, but
let's see what the old profile looks like.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
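
The old-versus-new profile comparison requested above can be sketched as a small script. This is only an illustration, assuming each data line of a saved symbol report has the form "<samples> <percent> <image> <symbol>"; the function names parse_profile and compare_profiles are hypothetical, and symbol names containing spaces are not handled.

```python
from typing import Dict, List, Tuple


def parse_profile(text: str) -> Dict[str, float]:
    """Map symbol name -> percent of samples.

    Assumed line format: <samples> <percent> <image> <symbol>.
    Header and blank lines that do not match are skipped.
    """
    result: Dict[str, float] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            int(fields[0])              # sample count must be an integer
            percent = float(fields[1])  # share of all samples
        except ValueError:
            continue                    # not a data line
        symbol = fields[-1]
        result[symbol] = result.get(symbol, 0.0) + percent
    return result


def compare_profiles(
    old: Dict[str, float], new: Dict[str, float], top: int = 10
) -> List[Tuple[str, float, float, float]]:
    """Return (symbol, old %, new %, delta) rows, largest absolute change first."""
    rows = []
    for symbol in set(old) | set(new):
        o = old.get(symbol, 0.0)
        n = new.get(symbol, 0.0)
        rows.append((symbol, o, n, n - o))
    # Sort by size of the change; break ties by name for stable output.
    rows.sort(key=lambda r: (-abs(r[3]), r[0]))
    return rows[:top]
```

Feeding it the saved reports from the 2.6.28 and 2.6.32 machines would put symbols such as s_lock at the top if their share of samples changed the most.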


From: Domas Mituzas <midom(dot)lists(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-09-13 11:49:48
Message-ID: DE4CA5B3-194C-4228-A8FA-12057C3A36E4@gmail.com

Hello,

> Can you run oprofile on the older kernel, so that we can compare and see where the time is spent?
> Looks like over 7% of the time is spent in s_lock, which suggests some change in behavior in context switching or something like that, but let's see what the old profile looks like.

I grabbed the 2.6.28.2 box as a loaner from the prod boxes I had around, so it may take a while to do that again.
I will see if I can get some Nehalem loaners (or run these tests in another environment) to compare on more modern hardware.

Domas


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Domas Mituzas <midom(dot)lists(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-09-13 16:05:30
Message-ID: 4C8E4BCA.3040003@2ndquadrant.com

Domas Mituzas wrote:
> I've been playing around today a lot with sysbench, and observed that 2.6.32 kernel supplied by Ubuntu is having perf regression with PG (which does not affect MySQL), compared to 2.6.28 builds I have.
> What I observed can be seen in a paste at http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is 2.6.32 - 2.6.32-24-server).
>
> Machines are two socket quad-opterons 2356s.
>
> oprofile output can be seen at http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w - system has >20% of idle cpu, which is somewhere in the top symbol :)
>

Are you using the same filesystem setup on both setups? And regardless,
what is that filesystem? We know that between 2.6.28 and 2.6.32 the
kernel improved how it handles fsync requests in a good way from a
reliability perspective (to fix bugs that could cause data loss before),
particularly on ext4, so it's possible the regression you're seeing is
just the expense of handling things properly.

If you already have sysbench on there, I'd suggest comparing the two
systems by seeing how fast each can execute fsync requests:

sysbench --test=fileio --file-fsync-freq=1 --file-num=1 \
  --file-total-size=16384 --file-test-mode=rndwr run | grep "Requests/sec"

That should help distinguish whether this regression might be coming from
the already known changes in that area, or if it's instead from something
that's impacting CPU efficiency.

Also, it's easy to see a performance change of this size just from the
database files being on a different part of the disk if you didn't
control for that. Disks are almost twice as fast at their beginning
as at their end nowadays.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us
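
Where sysbench is not installed, the fsync-rate comparison suggested above can be approximated with a short script. This is a rough sketch and not a replacement for the sysbench test: it appends sequentially rather than writing randomly, so its numbers are only comparable between machines running the same script. The function name fsync_rate and its defaults are assumptions; the 8192-byte block matches PostgreSQL's default page size.

```python
import os
import tempfile
import time


def fsync_rate(directory: str, requests: int = 200, block_size: int = 8192) -> float:
    """Write `requests` blocks, calling fsync after each; return fsyncs per second.

    The scratch file is created in `directory`, so point it at the
    filesystem that holds the database files.
    """
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(requests):
            os.write(fd, block)
            os.fsync(fd)  # force the block to stable storage before continuing
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return requests / elapsed if elapsed > 0 else float("inf")
```

Running fsync_rate() against the same directory on both kernels gives one number per machine. A large gap would point toward the fsync handling changes, while similar rates would point back toward CPU or locking behavior.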


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Domas Mituzas <midom(dot)lists(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-09-13 16:27:30
Message-ID: 4C8E50F2.3040907@kaltenbrunner.cc

On 09/13/2010 06:05 PM, Greg Smith wrote:
> Domas Mituzas wrote:
>> I've been playing around today a lot with sysbench, and observed that
>> 2.6.32 kernel supplied by Ubuntu is having perf regression with PG
>> (which does not affect MySQL), compared to 2.6.28 builds I have.
>> What I observed can be seen in a paste at
>> http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is
>> 2.6.32 - 2.6.32-24-server).
>> Machines are two socket quad-opterons 2356s.
>> oprofile output can be seen at
>> http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w - system has >20% of idle
>> cpu, which is somewhere in the top symbol :)
>
> Are you using the same filesystem setup on both setups? And regardless,
> what is that filesystem? We know that between 2.6.28 and 2.6.32 the
> kernel improved how it handles fsync requests in a good way from a
> reliability perspective (to fix bugs that could cause data loss before),
> particularly on ext4, so it's possible the regression you're seeing is
> just the expense of handling things properly.
>
> If you already have sysbench on there, I'd suggest comparing the two
> systems by seeing how fast each can execute fsync requests:
>
> sysbench --test=fileio --file-fsync-freq=1 --file-num=1
> --file-total-size=16384 --file-test-mode=rndwr run | grep "Requests/sec"
>
> To help distinguish whether this regression might be coming from the
> already known changes in that area, or if it's instead from something
> that's impacting CPU efficiency.
>
> Also, it's easy to see a performance change of this size just from the
> database files being on a different part of the disk if you didn't
> control for that. Disks are almost twice as fast at their beginning than
> their end nowadays.

Well, the main point here is that Domas is doing a pure read-only test on
a rather small workload, so it should fit entirely in memory...
From some very quick testing here as well, it rather seems that for
some reason the CPU scheduler is not actually giving us all the
available CPU on 2.6.32, or we are having some sort of locking issue that
is more exposed on this kernel.

Stefan
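
One way to quantify how much CPU goes unused during a run is to sample the aggregate "cpu" line of /proc/stat before and after the benchmark. The sketch below assumes the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names cpu_times and idle_fraction are hypothetical.

```python
from typing import List


def cpu_times(stat_text: str) -> List[int]:
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":  # skips per-core lines such as 'cpu0'
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def idle_fraction(before: str, after: str) -> float:
    """Fraction of CPU time spent idle between two /proc/stat snapshots."""
    b = cpu_times(before)
    a = cpu_times(after)
    # Use only user..steal; the guest counters are already included in user time.
    deltas = [y - x for x, y in zip(b, a)][:8]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("no time elapsed between samples")
    return deltas[3] / total  # index 3 is the idle counter
```

Reading /proc/stat once at the start and once at the end of a sysbench run on each kernel, then comparing the two idle fractions, would show whether 2.6.32 really leaves more CPU idle under the same load. Note that iowait is counted separately from idle here.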


From: Thom Brown <thom(at)linux(dot)com>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Domas Mituzas <midom(dot)lists(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-09-13 16:32:31
Message-ID: AANLkTinU=o7DWH+h-iMRp22FgH+2fEOmHfVH+Qzj=s+s@mail.gmail.com

On 13 September 2010 17:27, Stefan Kaltenbrunner
<stefan(at)kaltenbrunner(dot)cc> wrote:
> On 09/13/2010 06:05 PM, Greg Smith wrote:
>>
>> Domas Mituzas wrote:
>>>
>>> I've been playing around today a lot with sysbench, and observed that
>>> 2.6.32 kernel supplied by Ubuntu is having perf regression with PG
>>> (which does not affect MySQL), compared to 2.6.28 builds I have.
>>> What I observed can be seen in a paste at
>>> http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is
>>> 2.6.32 - 2.6.32-24-server).
>>> Machines are two socket quad-opterons 2356s.
>>> oprofile output can be seen at
>>> http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w - system has >20% of idle
>>> cpu, which is somewhere in the top symbol :)
>>
>> Are you using the same filesystem setup on both setups? And regardless,
>> what is that filesystem? We know that between 2.6.28 and 2.6.32 the
>> kernel improved how it handles fsync requests in a good way from a
>> reliability perspective (to fix bugs that could cause data loss before),
>> particularly on ext4, so it's possible the regression you're seeing is
>> just the expense of handling things properly.
>>
>> If you already have sysbench on there, I'd suggest comparing the two
>> systems by seeing how fast each can execute fsync requests:
>>
>> sysbench --test=fileio --file-fsync-freq=1 --file-num=1
>> --file-total-size=16384 --file-test-mode=rndwr run | grep "Requests/sec"
>>
>> To help distinguish whether this regression might be coming from the
>> already known changes in that area, or if it's instead from something
>> that's impacting CPU efficiency.
>>
>> Also, it's easy to see a performance change of this size just from the
>> database files being on a different part of the disk if you didn't
>> control for that. Disks are almost twice as fast at their beginning than
>> their end nowadays.
>
> well the main point here is that domas is doing a pure read-only test on a
> rather small workload so it should entirely fit in memory...
> From some very quick testing here as well it rathers seems that for some
> reason the CPU scheduler is not actually scheduling us all the available CPU
> on 2.6.32 or we are having some sort of locking issue that is more exposed
> on this kernel.

I thought sysbench was designed for MySQL benchmarks. How new is the
PostgreSQL driver? Is it stable yet?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Thom Brown <thom(at)linux(dot)com>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Domas Mituzas <midom(dot)lists(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-09-13 16:43:30
Message-ID: 4C8E54B2.1000009@2ndquadrant.com

Thom Brown wrote:
> I thought sysbench was designed for MySQL benchmarks. How new is the
> PostgreSQL driver? Is it stable yet?
>

It's been out there for years; the FreeBSD 7.0 development used it
extensively on MySQL and PostgreSQL to track kernel performance on both
databases back in 2007:
http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf

I don't think "stable" applies here just based on code age though, given
how infrequent updates to the sysbench code are and how little QA is put
into them. They pushed out two updates in 2009, 0.4.11 and 0.4.12, but
all they did for me was break basic compilation on multiple platforms.
I still use 0.4.10 as the last version that seems to work without
makefile surgery on both RedHat and Ubuntu.

The last time I tried it, the read-only OLTP implementation worked fine,
but the one that wrote instead was prone to deadlocks in PostgreSQL.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Domas Mituzas <midom(dot)lists(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-09-13 17:20:55
Message-ID: 4C8E5D77.3050204@kaltenbrunner.cc

On 09/13/2010 06:43 PM, Greg Smith wrote:
> Thom Brown wrote:
>> I thought sysbench was designed for MySQL benchmarks. How new is the
>> PostgreSQL driver? Is it stable yet?
>
> It's been out there for years; the FreeBSD 7.0 development used it
> extensively on MySQL and PostgreSQL to track kernel performance on both
> databases back in 2007:
> http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf
>
> I don't think "stable" applies here just based on code age though, given
> how infrequent updates to the sysbench code are and how little QA is put
> into them. They pushed out two updates in 2009, 0.4.11 and 0.4.12, but
> all they did for me was break basic compilation on multiple platforms. I
> still use 0.4.10 as the last version that seems to work without makefile
> surgery on both RedHat and Ubuntu.
>
> The last time I tried it, the read-only OLTP implementation worked fine,
> but the one that wrote instead was prone to deadlocks in PostgreSQL.

Yeah, the read-only part works quite well (the other ones not so much), and
it was much faster than pgbench in older PG releases. I have not yet looked
at whether the new threaded implementation in 9.0 fixes that issue.

Stefan


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Domas Mituzas <midom(dot)lists(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-09-27 15:28:13
Message-ID: AANLkTimPCHCb56Tdid_h9v6jZtCw5dn-1Sjz_r_=5+C2@mail.gmail.com

On Mon, Sep 13, 2010 at 12:05 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> Domas Mituzas wrote:
>>
>> I've been playing around today a lot with sysbench, and observed that
>> 2.6.32 kernel supplied by Ubuntu is having perf regression with PG (which
>> does not affect MySQL), compared to 2.6.28 builds I have.
>> What I observed can be seen in a paste at
>> http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is 2.6.32 -
>> 2.6.32-24-server).
>> Machines are two socket quad-opterons 2356s.
>> oprofile output can be seen at http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w -
>> system has >20% of idle cpu, which is somewhere in the top symbol :)
>>
>
> Are you using the same filesystem setup on both setups?  And regardless,
> what is that filesystem?  We know that between 2.6.28 and 2.6.32 the kernel
> improved how it handles fsync requests in a good way from a reliability
> perspective (to fix bugs that could cause data loss before), particularly on
> ext4, so it's possible the regression you're seeing is just the expense of
> handling things properly.
>
> If you already have sysbench on there, I'd suggest comparing the two systems
> by seeing how fast each can execute fsync requests:
>
> sysbench --test=fileio --file-fsync-freq=1 --file-num=1
> --file-total-size=16384 --file-test-mode=rndwr run | grep "Requests/sec"
>
> To help distinguish whether this regression might be coming from the already
> known changes in that area, or if it's instead from something that's
> impacting CPU efficiency.
>
> Also, it's easy to see a performance change of this size just from the
> database files being on a different part of the disk if you didn't control
> for that.  Disks are almost twice as fast at their beginning than their end
> nowadays.

Greg, have you run into any other evidence suggesting a problem with 2.6.32?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-09-28 03:37:09
Message-ID: 4CA162E5.6010000@catalyst.net.nz

On 28/09/10 04:28, Robert Haas wrote:
> On Mon, Sep 13, 2010 at 12:05 PM, Greg Smith<greg(at)2ndquadrant(dot)com> wrote:
>
>> Domas Mituzas wrote:
>>
>>> I've been playing around today a lot with sysbench, and observed that
>>> 2.6.32 kernel supplied by Ubuntu is having perf regression with PG (which
>>> does not affect MySQL), compared to 2.6.28 builds I have.
>>> What I observed can be seen in a paste at
>>> http://p.defau.lt/?8_GQV82Pz3_SDZbNOdP93Q (db12 is 2.6.28, db20 is 2.6.32 -
>>> 2.6.32-24-server).
>>> Machines are two socket quad-opterons 2356s.
>>> oprofile output can be seen at http://p.defau.lt/?OIR1vDFK4cze_fmBTQbV9w -
>>> system has>20% of idle cpu, which is somewhere in the top symbol :)
>>>
>>>
>> Are you using the same filesystem setup on both setups? And regardless,
>> what is that filesystem? We know that between 2.6.28 and 2.6.32 the kernel
>> improved how it handles fsync requests in a good way from a reliability
>> perspective (to fix bugs that could cause data loss before), particularly on
>> ext4, so it's possible the regression you're seeing is just the expense of
>> handling things properly.
>>
>> If you already have sysbench on there, I'd suggest comparing the two systems
>> by seeing how fast each can execute fsync requests:
>>
>> sysbench --test=fileio --file-fsync-freq=1 --file-num=1
>> --file-total-size=16384 --file-test-mode=rndwr run | grep "Requests/sec"
>>
>> To help distinguish whether this regression might be coming from the already
>> known changes in that area, or if it's instead from something that's
>> impacting CPU efficiency.
>>
>> Also, it's easy to see a performance change of this size just from the
>> database files being on a different part of the disk if you didn't control
>> for that. Disks are almost twice as fast at their beginning than their end
>> nowadays.
>>
> Greg, have you run into any other evidence suggesting a problem with 2.6.32?
>
>

Not Greg (sorry), but this might be worth a look:

http://www.spinics.net/lists/linux-ext4/msg20299.html

regards

Mark


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-09-28 03:59:35
Message-ID: AANLkTikTnZrzumZYShTO_o7Vy8UpLucJ=7nWZTNfdL6X@mail.gmail.com

On Mon, Sep 27, 2010 at 11:37 PM, Mark Kirkwood
<mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> wrote:
> Greg, have you run into any other evidence suggesting a problem with 2.6.32?
>
> Not Greg (sorry), but this might be worth a look:
>
> http://www.spinics.net/lists/linux-ext4/msg20299.html

Oh, interesting. But why wouldn't that also affect MySQL?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-09-28 04:16:41
Message-ID: 4CA16C29.40405@catalyst.net.nz

On 28/09/10 16:59, Robert Haas wrote:
> On Mon, Sep 27, 2010 at 11:37 PM, Mark Kirkwood
> <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> wrote:
>
>> Greg, have you run into any other evidence suggesting a problem with 2.6.32?
>>
>> Not Greg (sorry), but this might be worth a look:
>>
>> http://www.spinics.net/lists/linux-ext4/msg20299.html
>>
> Oh, interesting. But why wouldn't that also affect MySQL?
>
>

Yeah, I wondered that myself. Perhaps if sysbench is using MyISAM tables,
there is probably no fsync activity at all for a read-only
workload. It would be interesting to see if MySQL suffers a hit with sysbench
configured to use InnoDB storage...


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Domas Mituzas <midom(dot)lists(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Perf regression in 2.6.32 (Ubuntu 10.04 LTS)
Date: 2010-10-07 06:44:01
Message-ID: 4CAD6C31.5010809@2ndquadrant.com

Robert Haas wrote:
> Greg, have you run into any other evidence suggesting a problem with 2.6.32?
>

I haven't actually checked myself yet. Right now the only distribution
shipping 2.6.32 usefully is Ubuntu 10.04, which I can't recommend anyone
use on a server because their release schedules are way too aggressive
to ever deliver stable versions anymore. So until either RHEL6 or
Debian Squeeze ships, very late this year or early next, the
performance of 2.6.32 is irrelevant to me. And by then I'm hoping that
the early adopters have squashed more of the obvious bugs here. 2.6.32
is 11 months old at this point, which makes it still a bleeding edge
kernel in my book.

--
Greg Smith, 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us