Re: testing ProcArrayLock patches

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: testing ProcArrayLock patches
Date: 2011-11-20 16:33:47
Message-ID: 4EC8D78B02000025000432BE@gw.wicourts.gov
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Hmm. There's obviously something that's different in your
> environment or configuration from what I tested, but I don't know
> what it is. The fact that your scale factor is larger than
> shared_buffers might matter; or Intel vs. AMD. Or maybe you're
> running with synchronous_commit=on?

Yes, I had synchronous_commit = on for these runs. Here are the
settings:

cat >> $PGDATA/postgresql.conf <<EOM;
max_connections = 200
max_pred_locks_per_transaction = 256
shared_buffers = 10GB
maintenance_work_mem = 1GB
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_writer_delay = 20ms
seq_page_cost = 0.1
random_page_cost = 0.1
cpu_tuple_cost = 0.05
effective_cache_size = 40GB
default_transaction_isolation = '$iso'
EOM

Is there any chance that having pg_xlog on a separate RAID 10 set of
drives with its own BBU controller would explain anything? I mean,
I always knew that was a good idea for a big, heavily-loaded box,
but I remember being surprised at how *big* a difference that made
when a box accidentally went into production without moving the
pg_xlog directory there.
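
For reference, the relocation I'm describing is just the usual
symlink trick, along these lines (a sketch only; /mnt/wal is a
made-up mount point for the separate array):

# stop the server before moving the WAL directory
pg_ctl -D $PGDATA stop
# /mnt/wal is hypothetical -- the separate RAID 10 with its own BBU
mv $PGDATA/pg_xlog /mnt/wal/pg_xlog
ln -s /mnt/wal/pg_xlog $PGDATA/pg_xlog
pg_ctl -D $PGDATA start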

There is one other thing which might matter: I didn't use the -n
pgbench option, and in the sample you showed, you were using it.
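
Just to spell out the difference: -n tells pgbench to skip the
vacuum it otherwise does before the run, so (client count, duration,
and database name here are placeholders, not my actual values):

# without -n, pgbench vacuums the tables before measuring
pgbench -c 64 -j 64 -T 300 pgbench
# with -n, that up-front vacuum is skipped
pgbench -n -c 64 -j 64 -T 300 pgbench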

Here is the median of five from the latest runs. On these
read/write tests there was very little spread within each set of
five samples, with no extreme outliers like I had on the SELECT-only
tests. In the first position, s means simple protocol and p means
prepared protocol; in the second position, m means master and f
means with the flexlock patch.
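
So the labels map to a matrix of runs roughly like this (a sketch of
the driving loop; run length, database name, and the switching
between master and patched binaries are all simplified away):

for proto in s p; do
  case $proto in
    s) mode=simple ;;
    p) mode=prepared ;;
  esac
  for build in m f; do   # m = master, f = flexlock patch
    for c in 1 2 4 8 16 32 64 80 96 128; do
      # each combination was run five times; medians shown below
      pgbench -M $mode -c $c -j $c -T 300 pgbench \
        > ${proto}${build}${c}.out
    done
  done
done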

sm1 tps = 1092.269228 (including connections establishing)
sf1 tps = 1090.511552 (including connections establishing)
sm2 tps = 2171.867100 (including connections establishing)
sf2 tps = 2158.609189 (including connections establishing)
sm4 tps = 4278.541453 (including connections establishing)
sf4 tps = 4269.921594 (including connections establishing)
sm8 tps = 8472.257182 (including connections establishing)
sf8 tps = 8476.150588 (including connections establishing)
sm16 tps = 15905.074160 (including connections establishing)
sf16 tps = 15937.372689 (including connections establishing)
sm32 tps = 22331.817413 (including connections establishing)
sf32 tps = 22861.258757 (including connections establishing)
sm64 tps = 26388.391614 (including connections establishing)
sf64 tps = 26529.152361 (including connections establishing)
sm80 tps = 25617.651194 (including connections establishing)
sf80 tps = 26560.541237 (including connections establishing)
sm96 tps = 24105.455175 (including connections establishing)
sf96 tps = 26569.244384 (including connections establishing)
sm128 tps = 21467.530210 (including connections establishing)
sf128 tps = 25883.023093 (including connections establishing)

pm1 tps = 1629.265970 (including connections establishing)
pf1 tps = 1619.024905 (including connections establishing)
pm2 tps = 3164.061963 (including connections establishing)
pf2 tps = 3137.469377 (including connections establishing)
pm4 tps = 6114.787505 (including connections establishing)
pf4 tps = 6061.750200 (including connections establishing)
pm8 tps = 11884.534375 (including connections establishing)
pf8 tps = 11870.670086 (including connections establishing)
pm16 tps = 20575.737107 (including connections establishing)
pf16 tps = 20437.648809 (including connections establishing)
pm32 tps = 27664.381103 (including connections establishing)
pf32 tps = 28046.846479 (including connections establishing)
pm64 tps = 26764.294547 (including connections establishing)
pf64 tps = 26631.589294 (including connections establishing)
pm80 tps = 27716.198263 (including connections establishing)
pf80 tps = 28393.642871 (including connections establishing)
pm96 tps = 26616.076293 (including connections establishing)
pf96 tps = 28055.921427 (including connections establishing)
pm128 tps = 23282.912620 (including connections establishing)
pf128 tps = 23072.766829 (including connections establishing)

Note that on this 32-core box, performance on the read/write pgbench
peaks at 64 clients, but without a lot of variance between 32 and
96 clients. And with the patch, performance still hasn't fallen
off too badly at 128 clients. This is good news in terms of not
having to sweat connection pool sizing quite as much as in earlier
releases.

Next I will get the profile for the SELECT-only runs. It seems to
make sense to profile at the peak performance level, which was 64
clients.
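
Something along these lines, assuming perf is usable on this box
(oprofile would do just as well; the sampling window and run
parameters are arbitrary):

# kick off a 64-client SELECT-only run in the background
pgbench -n -S -M prepared -c 64 -j 64 -T 300 pgbench &
# sample the whole machine with call graphs for 60 seconds
perf record -a -g -- sleep 60
perf report
wait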

I can run one more set of tests tonight before I have to give the
box back to the guy who's putting it into production. It sounds
like a set like the above, except with synchronous_commit = off,
might be desirable?
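
If so, that would just be one more line appended in the same style
as the settings above (sketch):

cat >> $PGDATA/postgresql.conf <<EOM;
synchronous_commit = off
EOM
# synchronous_commit doesn't require a restart; a reload is enough
pg_ctl -D $PGDATA reload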

-Kevin
