Re: basic pgbench runs with various performance-related patches

Lists: pgsql-hackers
From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: basic pgbench runs with various performance-related patches
Date: 2012-01-23 13:53:01
Message-ID: CA+TgmoboYJurJEOB22Wp9RECMSEYGNyHDVFv5yisvERqFw=6dw@mail.gmail.com

There was finally some time available on Nate Boley's server, which he
has been kind enough to make highly available for performance testing
throughout this cycle, and I got a chance to run some benchmarks
against a bunch of the performance-related patches in the current
CommitFest. Specifically, I did my usual pgbench tests: 3 runs at
scale factor 100, with various client counts. I realize that this is
not the only or even most interesting thing to test, but I felt it
would be useful to have this information as a baseline before
proceeding to more complicated testing. I have another set of tests
running now with a significantly different configuration that will
hopefully provide some useful feedback on some of the things this test
fails to capture, and will post the results of the tests (and the
details of the test configuration) as soon as those results are in.

For the most part, I only tested each patch individually, but in one
case I also tested two patches together (buffreelistlock-reduction-v1
with freelist-ok-v2). Results are the median of three five-minute
test runs, with one exception: buffreelistlock-reduction-v1 crapped
out during one of the test runs with the following errors, so I've
shown the results for both of the successful runs (though I'm not sure
how relevant the numbers are given the errors, as I expect there is a
bug here somewhere):

log.ws.buffreelistlock-reduction-v1.1.100.300:ERROR: could not read
block 0 in file "base/20024/11780": read only 0 of 8192 bytes
log.ws.buffreelistlock-reduction-v1.1.100.300:CONTEXT: automatic
analyze of table "rhaas.public.pgbench_branches"
log.ws.buffreelistlock-reduction-v1.1.100.300:ERROR: could not read
block 0 in file "base/20024/11780": read only 0 of 8192 bytes
log.ws.buffreelistlock-reduction-v1.1.100.300:CONTEXT: automatic
analyze of table "rhaas.public.pgbench_tellers"
log.ws.buffreelistlock-reduction-v1.1.100.300:ERROR: could not read
block 0 in file "base/20024/11780": read only 0 of 8192 bytes
log.ws.buffreelistlock-reduction-v1.1.100.300:CONTEXT: automatic
analyze of table "rhaas.pg_catalog.pg_database"
log.ws.buffreelistlock-reduction-v1.1.100.300:ERROR: could not read
block 0 in file "base/20024/11780": read only 0 of 8192 bytes
log.ws.buffreelistlock-reduction-v1.1.100.300:STATEMENT: vacuum
analyze pgbench_branches
log.ws.buffreelistlock-reduction-v1.1.100.300:ERROR: could not read
block 0 in file "base/20024/11780": read only 0 of 8192 bytes
log.ws.buffreelistlock-reduction-v1.1.100.300:STATEMENT: select
count(*) from pgbench_branches

Just for grins, I ran the same set of tests against REL9_1_STABLE, and
the results of those tests are also included below. It's worth
grinning about: on this test, at 32 clients, 9.2devel (as of commit
4f42b546fd87a80be30c53a0f2c897acb826ad52, on which all of these tests
are based) is 25% faster on permanent tables, 109% faster on unlogged
tables, and 474% faster on a SELECT-only test.
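
As a check on those three figures, here is a minimal sketch that recomputes them from the 32-client master and REL9_1_STABLE numbers in the results further down. The function name and the dictionary are illustrative only and are not part of the test harness.

```python
def percent_faster(new_tps, old_tps):
    """How much faster new_tps is than old_tps, in percent."""
    return (new_tps / old_tps - 1.0) * 100.0


# (master, REL9_1_STABLE) throughput at 32 clients, copied from the results.
RESULTS_32_CLIENTS = {
    "permanent tables": (11920.691220, 9547.804833),
    "unlogged tables": (21479.647280, 10267.087265),
    "SELECT-only": (218778.460422, 38113.938792),
}

for test, (master, rel91) in RESULTS_32_CLIENTS.items():
    pct = percent_faster(master, rel91)
    print(f"{test}: master is {pct:.0f}% faster than 9.1")
```

Note that the direction of the comparison matters: the tables express 9.1 relative to master (for example -19.9% on permanent tables), while the figures above express master relative to 9.1 (25% for the same pair of numbers).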

Here's the test configuration:

shared_buffers = 8GB
maintenance_work_mem = 1GB
synchronous_commit = off
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_writer_delay = 20ms
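
The settings above are server-side only; the exact pgbench command lines are not given in this post. The following is a sketch of what they might look like for scale factor 100 and 300-second runs. The helper names, the database name, the use of -n, and setting -j equal to the client count are all assumptions, not the actual scripts.

```python
def pgbench_init_args(scale=100, unlogged=False, dbname="pgbench"):
    """Command line to initialize the pgbench tables (dbname is hypothetical)."""
    args = ["pgbench", "-i", "-s", str(scale)]
    if unlogged:
        # Create the pgbench tables as unlogged tables.
        args.append("--unlogged-tables")
    args.append(dbname)
    return args


def pgbench_run_args(clients, seconds=300, select_only=False, dbname="pgbench"):
    """Command line for one timed run at the given client count."""
    # Assumption: one worker thread per client, and no vacuum before the run.
    args = ["pgbench", "-n", "-c", str(clients), "-j", str(clients),
            "-T", str(seconds)]
    if select_only:
        args.append("-S")  # built-in SELECT-only script
    args.append(dbname)
    return args


for clients in (1, 8, 16, 24, 32, 80):
    print(" ".join(pgbench_run_args(clients)))
```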

And here are the results. For everything against master, I've also
included the percentage speedup or slowdown vs. the same test run
against master. Many of these numbers are likely not statistically
significant, though some clearly are.
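
For reference, a sketch of how each reported number and its percentage can be derived: the median of the runs for one configuration, then the signed change relative to master's median. The function names are illustrative; the inputs in the example are medians taken from the permanent-table results at 32 clients.

```python
from statistics import median


def median_tps(runs):
    """Median throughput over the repeated runs of one configuration."""
    return median(runs)


def change_vs_master(tps, master_tps):
    """Signed percentage change relative to master, formatted like the tables."""
    return f"{(tps / master_tps - 1.0) * 100.0:+.1f}%"


master_32 = 11920.691220
print(change_vs_master(17214.720336, master_32))  # xloginsert-scale-6
print(change_vs_master(9547.804833, master_32))   # REL9_1_STABLE
```

These two calls print +44.4% and -19.9%, matching the corresponding rows. A change that rounds to zero from below is printed as -0.0%, which is why that value appears in the results.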

** pgbench, permanent tables, scale factor 100, 300 s **
clients  branch  tps (median of 3 runs)  change vs. master
1 master 686.038059
8 master 4425.744449
16 master 7808.389490
24 master 13276.472813
32 master 11920.691220
80 master 12560.803169
1 REL9_1_STABLE 627.879523 -8.5%
8 REL9_1_STABLE 4188.731855 -5.4%
16 REL9_1_STABLE 7433.309556 -4.8%
24 REL9_1_STABLE 10496.411773 -20.9%
32 REL9_1_STABLE 9547.804833 -19.9%
80 REL9_1_STABLE 7197.655050 -42.7%
1 background-clean-slru-v2 629.518668 -8.2%
8 background-clean-slru-v2 4794.662182 +8.3%
16 background-clean-slru-v2 8062.151120 +3.2%
24 background-clean-slru-v2 13275.834722 -0.0%
32 background-clean-slru-v2 12024.410625 +0.9%
80 background-clean-slru-v2 12113.589954 -3.6%
1 buffreelistlock-reduction-v1 512.828482 -25.2%
8 buffreelistlock-reduction-v1 4765.576805 +7.7%
16 buffreelistlock-reduction-v1 8030.477792 +2.8%
24 buffreelistlock-reduction-v1 13118.481248 -1.2%
32 buffreelistlock-reduction-v1 11895.847998 -0.2%
80 buffreelistlock-reduction-v1 12015.291045 -4.3%
1 buffreelistlock-reduction-v1-freelist-ok-v2 621.960997 -9.3%
8 buffreelistlock-reduction-v1-freelist-ok-v2 4650.200642 +5.1%
16 buffreelistlock-reduction-v1-freelist-ok-v2 7999.167629 +2.4%
24 buffreelistlock-reduction-v1-freelist-ok-v2 13070.123153 -1.6%
32 buffreelistlock-reduction-v1-freelist-ok-v2 11808.986473 -0.9%
80 buffreelistlock-reduction-v1-freelist-ok-v2 12136.960028 -3.4%
1 freelist-ok-v2 629.832419 -8.2%
8 freelist-ok-v2 4800.267011 +8.5%
16 freelist-ok-v2 8018.571815 +2.7%
24 freelist-ok-v2 13122.167158 -1.2%
32 freelist-ok-v2 12004.261737 +0.7%
80 freelist-ok-v2 12188.211067 -3.0%
1 group-commit-2012-01-21 614.425851 -10.4%
8 group-commit-2012-01-21 4705.129896 +6.3%
16 group-commit-2012-01-21 7962.131701 +2.0%
24 group-commit-2012-01-21 13074.939290 -1.5%
32 group-commit-2012-01-21 12458.962510 +4.5%
80 group-commit-2012-01-21 12907.062908 +2.8%
1 removebufmgrfreelist-v1 624.232337 -9.0%
8 removebufmgrfreelist-v1 4787.757828 +8.2%
16 removebufmgrfreelist-v1 7987.562255 +2.3%
24 removebufmgrfreelist-v1 13185.179180 -0.7%
32 removebufmgrfreelist-v1 11988.099057 +0.6%
80 removebufmgrfreelist-v1 11998.675541 -4.5%
1 xloginsert-scale-6 615.631353 -10.3%
8 xloginsert-scale-6 4717.698532 +6.6%
16 xloginsert-scale-6 8118.873611 +4.0%
24 xloginsert-scale-6 14017.789384 +5.6%
32 xloginsert-scale-6 17214.720336 +44.4%
80 xloginsert-scale-6 16803.463204 +33.8%

** pgbench, unlogged tables, scale factor 100, 300 s **
clients  branch  tps (median of 3 runs)  change vs. master
1 master 677.610878
8 master 5028.697280
16 master 8335.044876
24 master 15210.853801
32 master 21479.647280
80 master 21290.549767
1 REL9_1_STABLE 666.931288 -1.6%
8 REL9_1_STABLE 4534.211018 -9.8%
16 REL9_1_STABLE 7844.550171 -5.9%
24 REL9_1_STABLE 11825.330626 -22.3%
32 REL9_1_STABLE 10267.087265 -52.2%
80 REL9_1_STABLE 7376.673339 -65.4%
1 background-clean-slru-v2 671.505881 -0.9%
8 background-clean-slru-v2 5104.108071 +1.5%
16 background-clean-slru-v2 8451.940663 +1.4%
24 background-clean-slru-v2 15527.042960 +2.1%
32 background-clean-slru-v2 21613.149203 +0.6%
80 background-clean-slru-v2 20790.135768 -2.4%
1 buffreelistlock-reduction-v1 675.186982 -0.4%
8 buffreelistlock-reduction-v1 5089.185745 +1.2%
16 buffreelistlock-reduction-v1 8456.887468 +1.5%
24 buffreelistlock-reduction-v1 15539.905486 +2.2%
32 buffreelistlock-reduction-v1 21562.413227 +0.4%
80 buffreelistlock-reduction-v1 21122.885930 -0.8%
1 buffreelistlock-reduction-v1-freelist-ok-v2 667.265247 -1.5%
8 buffreelistlock-reduction-v1-freelist-ok-v2 5085.813672 +1.1%
16 buffreelistlock-reduction-v1-freelist-ok-v2 8320.059951 -0.2%
24 buffreelistlock-reduction-v1-freelist-ok-v2 15685.366152 +3.1%
32 buffreelistlock-reduction-v1-freelist-ok-v2 21565.811574 +0.4%
80 buffreelistlock-reduction-v1-freelist-ok-v2 20945.756221 -1.6%
1 freelist-ok-v2 680.578723 +0.4%
8 freelist-ok-v2 4680.063074 -6.9%
16 freelist-ok-v2 8414.815514 +1.0%
24 freelist-ok-v2 15655.998340 +2.9%
32 freelist-ok-v2 21423.826249 -0.3%
80 freelist-ok-v2 21149.608334 -0.7%
1 group-commit-2012-01-21 666.329625 -1.7%
8 group-commit-2012-01-21 4940.074794 -1.8%
16 group-commit-2012-01-21 8293.787275 -0.5%
24 group-commit-2012-01-21 15370.196487 +1.0%
32 group-commit-2012-01-21 21652.117344 +0.8%
80 group-commit-2012-01-21 21154.700111 -0.6%
1 removebufmgrfreelist-v1 672.889249 -0.7%
8 removebufmgrfreelist-v1 5135.192248 +2.1%
16 removebufmgrfreelist-v1 8487.267114 +1.8%
24 removebufmgrfreelist-v1 15561.649674 +2.3%
32 removebufmgrfreelist-v1 21526.256680 +0.2%
80 removebufmgrfreelist-v1 21439.081729 +0.7%
1 xloginsert-scale-6 663.599217 -2.1%
8 xloginsert-scale-6 4928.240201 -2.0%
16 xloginsert-scale-6 8345.715047 +0.1%
24 xloginsert-scale-6 15314.188610 +0.7%
32 xloginsert-scale-6 21382.161572 -0.5%
80 xloginsert-scale-6 20555.003740 -3.5%

** pgbench, SELECT-only, scale factor 100, 300 s **
clients  branch  tps (median of 3 runs)  change vs. master
1 master 4474.415026
8 master 33852.480081
16 master 63367.390439
24 master 103869.975640
32 master 218778.460422
80 master 221926.129900
1 REL9_1_STABLE 4377.493967 -2.2%
8 REL9_1_STABLE 27006.472299 -20.2%
16 REL9_1_STABLE 44503.077293 -29.8%
24 REL9_1_STABLE 42646.367806 -58.9%
32 REL9_1_STABLE 38113.938792 -82.6%
80 REL9_1_STABLE 37158.548724 -83.3%
1 background-clean-slru-v2 4448.990827 -0.6%
8 background-clean-slru-v2 32954.904564 -2.7%
16 background-clean-slru-v2 62163.189691 -1.9%
24 background-clean-slru-v2 104054.424938 +0.2%
32 background-clean-slru-v2 219188.777491 +0.2%
80 background-clean-slru-v2 225528.290724 +1.6%
1 buffreelistlock-reduction-v1 ** 4441.150432 4448.333138 (two successful runs)
8 buffreelistlock-reduction-v1 34063.227940 +0.6%
16 buffreelistlock-reduction-v1 63506.409797 +0.2%
24 buffreelistlock-reduction-v1 104399.970382 +0.5%
32 buffreelistlock-reduction-v1 216559.933170 -1.0%
80 buffreelistlock-reduction-v1 222285.411884 +0.2%
1 buffreelistlock-reduction-v1-freelist-ok-v2 4440.850402 -0.8%
8 buffreelistlock-reduction-v1-freelist-ok-v2 33818.438901 -0.1%
16 buffreelistlock-reduction-v1-freelist-ok-v2 62024.613901 -2.1%
24 buffreelistlock-reduction-v1-freelist-ok-v2 107318.457734 +3.3%
32 buffreelistlock-reduction-v1-freelist-ok-v2 218993.937402 +0.1%
80 buffreelistlock-reduction-v1-freelist-ok-v2 224804.303649 +1.3%
1 freelist-ok-v2 4448.520427 -0.6%
8 freelist-ok-v2 32987.340692 -2.6%
16 freelist-ok-v2 63427.003052 +0.1%
24 freelist-ok-v2 105891.677170 +1.9%
32 freelist-ok-v2 224901.447195 +2.8%
80 freelist-ok-v2 226073.792525 +1.9%
1 group-commit-2012-01-21 4355.726544 -2.7%
8 group-commit-2012-01-21 33000.320589 -2.5%
16 group-commit-2012-01-21 61813.842365 -2.5%
24 group-commit-2012-01-21 104561.991949 +0.7%
32 group-commit-2012-01-21 215981.557010 -1.3%
80 group-commit-2012-01-21 222421.484864 +0.2%
1 removebufmgrfreelist-v1 4465.215178 -0.2%
8 removebufmgrfreelist-v1 34339.075796 +1.4%
16 removebufmgrfreelist-v1 64186.808150 +1.3%
24 removebufmgrfreelist-v1 105002.934233 +1.1%
32 removebufmgrfreelist-v1 220531.094226 +0.8%
80 removebufmgrfreelist-v1 227728.566369 +2.6%
1 xloginsert-scale-6 4347.609435 -2.8%
8 xloginsert-scale-6 33494.005898 -1.1%
16 xloginsert-scale-6 63033.771029 -0.5%
24 xloginsert-scale-6 104033.236840 +0.2%
32 xloginsert-scale-6 221178.054981 +1.1%
80 xloginsert-scale-6 223804.483593 +0.8%

I also went through the logs of all the test runs, looking for errors
or warnings. Other than the one hard error mentioned above, the only
thing I found was:

WARNING: corrupted statistics file "pg_stat_tmp/pgstat.stat"

...which happened *a lot*. Especially on 9.1. Across all test runs,
here is the total number of occurrences of this message by branch:

5 background-clean-slru-v2
5 master
9 xloginsert-scale-6
11 freelist-ok-v2
13 group-commit-2012-01-21
15 buffreelistlock-reduction-v1
17 buffreelistlock-reduction-v1-freelist-ok-v2
24 removebufmgrfreelist-v1
1509 REL9_1_STABLE

Of the 1509 occurrences of this message that occurred on the
REL9_1_STABLE branch, 503 were produced in the 1 client configuration
and 1004 in the 80 client configuration. I have no explanation for
why those particular numbers of clients should be more problematic
than 8, 16, 24, or 32 - it may be that the system randomly gets into
some kind of a bad state that causes it to spew many copies of this
message, and that just happened to occur on those test runs but not
the others. I don't know. But it feels like there's probably a bug
here somewhere.
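
A tally like the one above can be produced from grep-style "filename:message" lines. The sketch below assumes the log names follow the pattern visible in the error output earlier, log.<test>.<branch>.<clients>.<scale>.<seconds>; that reading of the name, and the function names, are assumptions.

```python
from collections import Counter

WARNING_TEXT = "corrupted statistics file"


def parse_log_name(name):
    """Split an assumed log.<test>.<branch>.<clients>.<scale>.<seconds> name."""
    _prefix, _test, rest = name.split(".", 2)
    branch, clients, _scale, _seconds = rest.rsplit(".", 3)
    return branch, int(clients)


def count_warnings(grep_lines):
    """Count warning lines per branch and per (branch, client count)."""
    per_branch = Counter()
    per_config = Counter()
    for line in grep_lines:
        name, _, message = line.partition(":")
        if WARNING_TEXT not in message:
            continue
        branch, clients = parse_log_name(name)
        per_branch[branch] += 1
        per_config[(branch, clients)] += 1
    return per_branch, per_config
```

Feeding it the output of a grep for WARNING across all log files would give the per-branch totals from per_branch and the per-client-count breakdown from per_config.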

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: basic pgbench runs with various performance-related patches
Date: 2012-01-23 14:31:58
Message-ID: CA+U5nM+Yw3=jSDw3vYg_ZVcFVa9ra5U5qB=dr1x3pAto4RQcyQ@mail.gmail.com

On Mon, Jan 23, 2012 at 1:53 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Results are the median of three five-minute test runs

> checkpoint_timeout = 15min

Test duration is important for tests that don't relate to pure
contention reduction, which is every patch apart from XLogInsert.
We've discussed that before, so not sure what value you assign to
these results. Very little, is my view, so I'm a little disappointed
to see this post and the associated comments.

I'm very happy to see that your personal work has resulted in gains
and these results are valid tests of that work, IMHO. If you only
measure throughput you're only measuring half of what users care
about. We've not yet seen any tests that confirm that other important
issues have not been made worse.

Before commenting on individual patches, it's clear that the tests
you've run aren't even designed to highlight the BufFreelistLock
contention that is present in different configs, so that alone is
sufficient to throw most of this away.

On particular patches....

* background-clean-slru-v2 related very directly to reducing the
response time spikes you showed us in your last set of results. Why
not repeat those same tests??

* removebufmgrfreelist-v1 related to the impact of dropping
tables/index/databases, so given the variability of the results, that
at least shows it has no effect in the general case.

> And here are the results.  For everything against master, I've also
> included the percentage speedup or slowdown vs. the same test run
> against master.  Many of these numbers are likely not statistically
> significant, though some clearly are.

> with one exception: buffreelistlock-reduction-v1 crapped
> out during one of the test runs with the following errors

That patch comes with the proviso, stated in comments:
"We didn't get the lock, but read the value anyway on the assumption
that reading this value is atomic."
So we seem to have proved that reading it without the lock isn't safe.

The remaining patch you tested was withdrawn and not submitted to the CF.

Sigh.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: basic pgbench runs with various performance-related patches
Date: 2012-01-23 15:09:06
Message-ID: CA+TgmoZ9_UPAO33_+8eM-JF33KK_LNw4gmA4bi-2hgvp-a+Ydw@mail.gmail.com

On Mon, Jan 23, 2012 at 9:31 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Test duration is important for tests that don't relate to pure
> contention reduction, which is every patch apart from XLogInsert.

Yes, I know. I already said that I was working on more tests to
address other use cases.

> I'm very happy to see that your personal work has resulted in gains
> and these results are valid tests of that work, IMHO. If you only
> measure throughput you're only measuring half of what users care
> about. We've not yet seen any tests that confirm that other important
> issues have not been made worse.

I personally think throughput is awfully important, but clearly
latency matters as well, and that is why *even as we speak* I am
running more tests. If there are other issues with which you are
concerned besides latency and throughput, please say what they are.

> On particular patches....
>
> * background-clean-slru-v2 related very directly to reducing the
> response time spikes you showed us in your last set of results. Why
> not repeat those same tests??

I'm working on it. Actually, I'm attempting to improve my previous
test configuration by making some alterations per some of your
previous suggestions. I plan to post the results of those tests once
I have run them.

> * removebufmgrfreelist-v1 related to the impact of dropping
> tables/index/databases, so given the variability of the results, that
> at least shows it has no effect in the general case.

I think it needs some tests with a larger scale factor before drawing
any general conclusions, since this test, as you mentioned above,
doesn't involve much buffer eviction. As it turns out, I am working
on running such tests.

> That patch comes with the proviso, stated in comments:
> "We didn't get the lock, but read the value anyway on the assumption
> that reading this value is atomic."
> So we seem to have proved that reading it without the lock isn't safe.

I am not sure what's going on with that patch, but clearly something
isn't working right. I don't know whether it's that or something
else, but it does look like there's a bug.

> The remaining patch you tested was withdrawn and not submitted to the CF.

Oh. Which one was that? I thought all of these were in play.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: basic pgbench runs with various performance-related patches
Date: 2012-01-23 15:35:12
Message-ID: CA+U5nMKX4+vY=B4gXYtw-a+HH8XKkzcHY1YhKSuCAOuBMWp3+A@mail.gmail.com

On Mon, Jan 23, 2012 at 3:09 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> I'm working on it.

Good, thanks for the update.

>> The remaining patch you tested was withdrawn and not submitted to the CF.
>
> Oh.  Which one was that?  I thought all of these were in play.

freelist_ok was a prototype for testing/discussion, which contained an
arguable heuristic. I guess that means it's also "in play", but I
wasn't thinking we'd be able to assemble clear evidence for 9.2.

The other patches have clearer and specific roles without heuristics
(mostly), so are at least viable for 9.2, though still requiring
agreement.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: basic pgbench runs with various performance-related patches
Date: 2012-01-23 15:49:16
Message-ID: CA+TgmoZ34cmJ2GuzXLdvgzxSzXAZ6b4EYtC4_MWfSs=GTDrk9A@mail.gmail.com

On Mon, Jan 23, 2012 at 10:35 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> freelist_ok was a prototype for testing/discussion, which contained an
> arguable heuristic. I guess that means it's also "in play", but I
> wasn't thinking we'd be able to assemble clear evidence for 9.2.

OK, that one is still in the test runs I am doing right now, but I
will drop it from future batches to save time and energy that can be
better spent on things we have a chance of getting done for 9.2.

> The other patches have clearer and specific roles without heuristics
> (mostly), so are at least viable for 9.2, though still requiring
> agreement.

I think we must also drop removebufmgrfreelist-v1 from consideration,
unless you want to go over it some more and try to figure out a fix
for whatever caused it to crap out on these tests. IIUC, that
corresponds to this CommitFest entry:

https://commitfest.postgresql.org/action/patch_view?id=744

Whatever is wrong must be something that happens pretty darn
infrequently, since it only happened on one test run out of 54, which
also means that if you do want to pursue that one we'll have to go
over it pretty darn carefully to make sure that we've fixed that issue
and don't have any others. I have to admit my personal preference is
for postponing that one to 9.3 anyway, since there are some related
issues I'd like to experiment with. But let me know how you'd like to
proceed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: basic pgbench runs with various performance-related patches
Date: 2012-01-24 00:52:51
Message-ID: CA+U5nMJdmWDB9UapWU5hvQUtHxsHZLA3f+k86VYHUaGZ8B-zVg@mail.gmail.com

On Mon, Jan 23, 2012 at 3:49 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

>> The other patches have clearer and specific roles without heuristics
>> (mostly), so are at least viable for 9.2, though still requiring
>> agreement.
>
> I think we must also drop removebufmgrfreelist-v1 from consideration,
...

I think you misidentify the patch. Earlier you said that
"buffreelistlock-reduction-v1 crapped out" and I already said that the
assumption in the code clearly doesn't hold, implying the patch was
dropped.

The removebufmgrfreelist patch and its alternate are still valid, with
applicability to special cases.

I've written another patch to assist with testing/assessment of the
problems, attached.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment: freelist_wait_stats.v2.patch (text/x-patch, 5.4 KB)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: basic pgbench runs with various performance-related patches
Date: 2012-01-24 01:40:58
Message-ID: CA+Tgmoa+6wfk5knt2aLApSVcw6h6X7EZPsfmkV-LO5o0ac-skg@mail.gmail.com

On Mon, Jan 23, 2012 at 7:52 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Mon, Jan 23, 2012 at 3:49 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>>> The other patches have clearer and specific roles without heuristics
>>> (mostly), so are at least viable for 9.2, though still requiring
>>> agreement.
>>
>> I think we must also drop removebufmgrfreelist-v1 from consideration,
> ...
>
> I think you misidentify the patch. Earlier you said it that
> "buffreelistlock-reduction-v1 crapped
> out"  and I already said that the assumption in the code clearly
> doesn't hold, implying the patch was dropped.

Argh. I am clearly having a senior moment here, a few years early.
So is it correct to say that both of the patches associated with the
message attached to the following CommitFest entry are now off the
table for 9.2?

https://commitfest.postgresql.org/action/patch_view?id=743

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: robertmhaas(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: basic pgbench runs with various performance-related patches
Date: 2012-01-24 06:26:15
Message-ID: 20120124.152615.179743663452846570.t-ishii@sraoss.co.jp

> ** pgbench, permanent tables, scale factor 100, 300 s **
> 1 group-commit-2012-01-21 614.425851 -10.4%
> 8 group-commit-2012-01-21 4705.129896 +6.3%
> 16 group-commit-2012-01-21 7962.131701 +2.0%
> 24 group-commit-2012-01-21 13074.939290 -1.5%
> 32 group-commit-2012-01-21 12458.962510 +4.5%
> 80 group-commit-2012-01-21 12907.062908 +2.8%

Interesting. Comparing with this:
http://archives.postgresql.org/pgsql-hackers/2012-01/msg00804.php
you achieved a very small improvement. Can you think of any reason
for the difference?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: basic pgbench runs with various performance-related patches
Date: 2012-01-24 06:59:57
Message-ID: CAEYLb_VxFQGM2acVmBabaMpWNt6CP8jRDo0nRtXoVM48OLHSdg@mail.gmail.com

On 24 January 2012 06:26, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>> ** pgbench, permanent tables, scale factor 100, 300 s **
>> 1 group-commit-2012-01-21 614.425851 -10.4%
>> 8 group-commit-2012-01-21 4705.129896 +6.3%
>> 16 group-commit-2012-01-21 7962.131701 +2.0%
>> 24 group-commit-2012-01-21 13074.939290 -1.5%
>> 32 group-commit-2012-01-21 12458.962510 +4.5%
>> 80 group-commit-2012-01-21 12907.062908 +2.8%
>
> Interesting. Comparing with this:
> http://archives.postgresql.org/pgsql-hackers/2012-01/msg00804.php
> you achieved very small enhancement. Do you think of any reason which
> makes the difference?

Presumably this system has a battery-backed cache, whereas my numbers
were obtained on my laptop.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: basic pgbench runs with various performance-related patches
Date: 2012-01-24 13:58:58
Message-ID: CA+TgmobCCm9svXYcOwFerSJmaNCjePrHYjm=ACxFWVYQ9fkxXA@mail.gmail.com

On Tue, Jan 24, 2012 at 1:26 AM, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>> ** pgbench, permanent tables, scale factor 100, 300 s **
>> 1 group-commit-2012-01-21 614.425851 -10.4%
>> 8 group-commit-2012-01-21 4705.129896 +6.3%
>> 16 group-commit-2012-01-21 7962.131701 +2.0%
>> 24 group-commit-2012-01-21 13074.939290 -1.5%
>> 32 group-commit-2012-01-21 12458.962510 +4.5%
>> 80 group-commit-2012-01-21 12907.062908 +2.8%
>
> Interesting. Comparing with this:
> http://archives.postgresql.org/pgsql-hackers/2012-01/msg00804.php
> you achieved very small enhancement. Do you think of any reason which
> makes the difference?

My test was run with synchronous_commit=off, so I didn't expect the
group commit patch to have much of an impact. I included it mostly to
see whether by chance it helped anyway (since it also helps other WAL
flushes, not just commits) or whether it caused any regression.

One somewhat odd thing about these numbers is that, on permanent
tables, all of the patches seemed to show regressions vs. master in
single-client throughput. That's a slightly difficult result to
believe, though, so it's probably a testing artifact of some kind.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: robertmhaas(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: basic pgbench runs with various performance-related patches
Date: 2012-01-25 00:24:50
Message-ID: 20120125.092450.128144703244102426.t-ishii@sraoss.co.jp

> My test was run with synchronous_commit=off, so I didn't expect the
> group commit patch to have much of an impact. I included it mostly to
> see whether by chance it helped anyway (since it also helps other WAL
> flushes, not just commits) or whether it caused any regression.

Oh, I see.

> One somewhat odd thing about these numbers is that, on permanent
> tables, all of the patches seemed to show regressions vs. master in
> single-client throughput. That's a slightly difficult result to
> believe, though, so it's probably a testing artifact of some kind.

Maybe kernel cache effect?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: basic pgbench runs with various performance-related patches
Date: 2012-02-04 16:59:42
Message-ID: 4F2D63FE.30200@2ndQuadrant.com

On 01/24/2012 08:58 AM, Robert Haas wrote:
> One somewhat odd thing about these numbers is that, on permanent
> tables, all of the patches seemed to show regressions vs. master in
> single-client throughput. That's a slightly difficult result to
> believe, though, so it's probably a testing artifact of some kind.

It looks like you may have run the ones against master first, then the
ones applying various patches. The one test artifact I have to be very
careful to avoid in that situation is that later files on the physical
disk are slower than earlier ones. There's a >30% difference between
the fastest part of a regular hard drive, the logical beginning, and its
end. Multiple test runs tend to creep forward onto later sections of
disk, and be biased toward the earlier run in that case. To eliminate
that bias when it gets bad, I normally either a) run each test 3 times,
interleaved, or b) rebuild the filesystem in between each initdb.
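
One possible reading of "interleaved" in option a) is sketched below: every branch gets its first run before any branch gets its second, so no branch has all of its runs clustered at the start or end of the session. The function name and the loop ordering are assumptions about how such a schedule could be built, not a description of any existing test script.

```python
def interleaved_schedule(branches, client_counts, runs=3):
    """Return (run, clients, branch) tuples with branches interleaved per run."""
    schedule = []
    for run in range(1, runs + 1):
        for clients in client_counts:
            for branch in branches:
                schedule.append((run, clients, branch))
    return schedule


for run, clients, branch in interleaved_schedule(
        ["master", "xloginsert-scale-6"], [1, 32]):
    print(f"run {run}: {branch} with {clients} client(s)")
```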

I'm not sure that's the problem you're running into, but it's the only
one I've been hit by that matches the suspicious part of your results.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: basic pgbench runs with various performance-related patches
Date: 2012-02-05 14:31:54
Message-ID: CA+TgmoYF1rsRGH1OE7BJ_=t0MRZ3pCo7Dtvnkd4489ONs3aWNQ@mail.gmail.com

On Sat, Feb 4, 2012 at 11:59 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> On 01/24/2012 08:58 AM, Robert Haas wrote:
>>
>> One somewhat odd thing about these numbers is that, on permanent
>> tables, all of the patches seemed to show regressions vs. master in
>> single-client throughput.  That's a slightly difficult result to
>> believe, though, so it's probably a testing artifact of some kind.
>
> It looks like you may have run the ones against master first, then the ones
> applying various patches.  The one test artifact I have to be very careful
> to avoid in that situation is that later files on the physical disk are
> slower than earlier ones.  There's a >30% difference between the fastest
> part of a regular hard drive, the logical beginning, and its end.  Multiple
> test runs tend to creep forward onto later sections of disk, and be biased
> toward the earlier run in that case.  To eliminate that bias when it gets
> bad, I normally either a) run each test 3 times, interleaved, or b) rebuild
> the filesystem in between each initdb.
>
> I'm not sure that's the problem you're running into, but it's the only one
> I've been hit by that matches the suspicious part of your results.

I don't think that's it, because tests on various branches were
interleaved; moreover, I don't believe master was the first one in the
rotation. I think I had them in alphabetical order by branch name,
actually.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company