Re: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance

From: Niels Kristian Schjødt <nielskristian(at)autouncle(dot)com>
To: <sthomas(at)optionshouse(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance
Date: 2012-12-05 21:45:04
Message-ID: 593C6191-C6EE-419D-BE98-A153662D676E@autouncle.com
Lists: pgsql-performance

While I can't say I have tried the 3.4 kernel yet, I can say that I am running 3.2 too, and there may be a connection to the strange CPU behavior I have had in the past (which, as you know, you have been kind enough to try to help me solve). I will definitely try out 3.4 or 3.6 within the coming days and report back on the topic.


On 05/12/2012 at 19:28, Shaun Thomas <sthomas(at)optionshouse(dot)com> wrote:

> Hey guys,
>
> This isn't a question, but a kind of summary over a ton of investigation
> I've been doing since a recent "upgrade". Anyone else out there with
> "big iron" might want to confirm this, but it seems pretty reproducible.
> This seems to affect the latest 3.2 mainline and by extension, any
> platform using it. My tests are restricted to Ubuntu 12.04, but it may
> apply elsewhere.
>
> Comparing the latest official 3.2 kernel to the latest official 3.4
> kernel (both Ubuntu), there are some rather striking differences. I'll
> start with some pgbench tests.
>
> * This test is 800 read-only clients, with 2 controlling threads on a
>   55GB database (scaling factor of 3600) for 3 minutes.
>   * With 3.4:
>     * Max TPS was 68933.
>     * CPU was between 50 and 55% idle.
>     * Load average was between 10 and 15.
>   * With 3.2:
>     * Max TPS was 17583. A total loss of 75% performance.
>     * CPU was between 12 and 25% idle.
>     * Load average was between 10 and 60---effectively random.
> * Next, we checked minimal write tests. This time, with only two
>   clients. All other metrics are the same.
>   * With 3.4:
>     * Max TPS was 4548.
>     * CPU was between 88 and 92% idle.
>     * Load average was between 1.7 and 2.5.
>   * With 3.2:
>     * Max TPS was 4639.
>     * CPU was between 88 and 92% idle.
>     * Load average was between 3 and 4.
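
For anyone who wants to repeat the two tests described above, the following is a rough sketch of the pgbench invocations they imply. The exact commands were not posted, so the database name (pgbench_test) is hypothetical, and the use of 2 threads and 180 seconds for the write test is an assumption based on "all other metrics are the same". The commands are only printed, not executed.

```shell
# Hypothetical database name; the original post does not name one.
db="pgbench_test"

# One-time initialization at scaling factor 3600 (about 55GB per the post).
init_cmd="pgbench -i -s 3600 ${db}"
# Read-only test: select-only mode (-S), 800 clients, 2 threads, 180 seconds.
read_cmd="pgbench -S -c 800 -j 2 -T 180 ${db}"
# Minimal write test: default read/write script with only 2 clients
# (assumed to keep the same thread count and duration).
write_cmd="pgbench -c 2 -j 2 -T 180 ${db}"

# Print the commands rather than running them, so they can be reviewed first.
printf '%s\n' "$init_cmd" "$read_cmd" "$write_cmd"
```

Running the printed commands against a spare machine, once per kernel, should give numbers comparable to the ones quoted.
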
>
> Overall, performance was _much_ worse in 3.2 by almost every metric
> except for very low contention activity. More CPU for fewer transactions,
> and wildly inaccurate load reporting. The 3.2 kernel in its current
> state should be considered detrimental and potentially malicious under
> high task contention.
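
As a quick check of the figures quoted above, the relative change in max TPS from 3.4 to 3.2 can be worked out directly. This is only a sketch; the helper name pct_change is made up for illustration, and the inputs are the max TPS values from the quoted results.

```shell
# pct_change OLD NEW: percentage change from OLD to NEW, one decimal place.
# (Hypothetical helper, not part of the original post.)
pct_change() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

# Max TPS going from the 3.4 kernel to the 3.2 kernel.
read_only_change=$(pct_change 68933 17583)   # 800 read-only clients
write_change=$(pct_change 4548 4639)         # 2 read/write clients

echo "read-only: ${read_only_change}%  write: ${write_change}%"
```

The read-only result is about -74.5%, in line with the "75%" figure, while the two-client write test differs by only about 2%.
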
>
> I'll admit not letting the tests run for more than 10 iterations, but I
> didn't really need more than that. Even one iteration is enough to see
> this in action. At least every Ubuntu 3.2 kernel since 3.2.0-31 exhibits
> this, but I haven't tested further back. I've also examined both
> official Ubuntu 3.2 and Ubuntu mainline kernels as obtained from here:
>
> http://kernel.ubuntu.com/~kernel-ppa/mainline
>
> The 3.2.34 mainline also has these problems. For reference, I tested the
> 3.4.20 Quantal release on Precise because the Precise 3.4 kernel hasn't
> been maintained.
>
> Again, anyone running 12.04 LTS, take a good hard look at your systems.
> Hopefully you have a spare machine to test with. I'm frankly appalled
> this thing is in an LTS release.
>
> I'll also note that all kernels show client threads inflating load
> reports to some extent. In a pgbench for-loop (run, sleep 1, repeat),
> load will sometimes jump to some very high number between iterations,
> but on 3.4 it settles down again. On 3.2, it just jumps randomly. I
> tested that with this script:
>
> nLoop=0
>
> while true; do
>     # Reprint the column header every 20 samples.
>     if [ $((nLoop % 20)) -eq 0 ]; then
>         echo -e "Stat Time\t\tSleep\tRun\tLoad Avg"
>     fi
>
>     stattime=$(date +"%Y-%m-%d %H:%M:%S")
>     # Count threads in uninterruptible sleep (D) and runnable (R) states.
>     sleep=$(ps -emo stat | grep -c 'D')
>     run=$(ps -emo stat | grep -c 'R')
>     # The first field of /proc/loadavg is the 1-minute load average.
>     loadavg=$(cut -d ' ' -f 1 /proc/loadavg)
>
>     echo -e "${stattime}\t${sleep}\t${run}\t${loadavg}"
>     sleep 1
>
>     nLoop=$((nLoop + 1))
> done
>
> The jumps look like this:
>
> Stat Time            Sleep  Run  Load Avg
> 2012-12-05 12:23:13  0      16   7.66
> 2012-12-05 12:23:14  0      12   7.66
> 2012-12-05 12:23:15  0      7    7.66
> 2012-12-05 12:23:16  0      17   7.66
> 2012-12-05 12:23:17  0      1    24.51
> 2012-12-05 12:23:18  0      2    24.51
>
> It's much harder to trigger on 3.4, but still happens.
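
To pick such jumps out of a longer capture, something along these lines might help. It is only a sketch: it assumes the tab-separated output format of the monitoring script quoted above, and both the function name detect_jumps and the threshold of 5 are made up for illustration.

```shell
# detect_jumps THRESHOLD: read tab-separated monitor output on stdin and
# report samples whose load average moved by at least THRESHOLD since the
# previous sample. (Hypothetical helper; the threshold is arbitrary.)
detect_jumps() {
    awk -F'\t' -v thresh="$1" '
        /^Stat Time/ { next }                      # skip repeated headers
        seen && ($4 - prev >= thresh || prev - $4 >= thresh) {
            printf "%s\t%s -> %s\n", $1, prev, $4
        }
        { prev = $4; seen = 1 }
    '
}

# Three samples taken from the output quoted above.
sample=$(printf '%s\t0\t17\t7.66\n%s\t0\t1\t24.51\n%s\t0\t2\t24.51\n' \
    "2012-12-05 12:23:16" "2012-12-05 12:23:17" "2012-12-05 12:23:18")

jumps=$(printf '%s\n' "$sample" | detect_jumps 5)
echo "$jumps"
```

In practice the monitor output would be redirected to a file and piped through detect_jumps afterwards, which makes it easier to compare how often jumps occur on each kernel.
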
>
> If anyone has tested against 3.6 or 3.7, I'd love to hear your input.
> Inconsistent load reports are one thing... strangled performance and
> inflated CPU usage are quite another.
>
> --
> Shaun Thomas
> OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
> 312-444-8534
> sthomas(at)optionshouse(dot)com
