Re: MVCC catalog access

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MVCC catalog access
Date: 2013-06-20 13:45:26
Message-ID: CA+Tgmob5krXFM2VcdxUi11K2wbbL9Z6=WDSYhhTyXP=Tbt6+JA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 17, 2013 at 8:12 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> So, the biggest issue with the patch seems to be performance worries. I
> tried to create a worst case scenario:
> postgres (patched and HEAD) running with:
> -c shared_buffers=4GB \
> -c max_connections=2000 \
> -c maintenance_work_mem=2GB \
> -c checkpoint_segments=300 \
> -c wal_buffers=64MB \
> -c synchronous_commit=off \
> -c autovacuum=off \
> -p 5440
>
> With one background pgbench running:
> pgbench -p 5440 -h /tmp -f /tmp/readonly-busy.sql -c 1000 -j 10 -T 100 postgres
> readonly-busy.sql:
> BEGIN;
> SELECT txid_current();
> SELECT pg_sleep(0.0001);
> COMMIT;
>
> I measured the performance of one other pgbench:
> pgbench -h /tmp -p 5440 postgres -T 10 -c 100 -j 100 -n -f /tmp/simplequery.sql -C
> simplequery.sql:
> SELECT * FROM af1, af2 WHERE af1.x = af2.x;
> tables:
> create table af1 (x) as select g from generate_series(1,4) g;
> create table af2 (x) as select g from generate_series(4,7) g;
>
> With that setup one can create quite noticeable overhead for the MVCC
> patch (best of 5):
>
> master-optimize:
> tps = 1261.629474 (including connections establishing)
> tps = 15121.648834 (excluding connections establishing)
>
> dev-optimize:
> tps = 773.719637 (including connections establishing)
> tps = 2804.239979 (excluding connections establishing)
>
> Most of the time in both the patched and unpatched builds is by far
> spent in GetSnapshotData. I think the reason this shows a far higher
> overhead than what you previously measured is that a) in your test the
> other backends were idle, while in mine they actually modify PGXACT,
> which causes noticeable cacheline bouncing, and b) I have a higher
> number of connections & max_connections.
>
> A quick test shows that even with max_connections=600, 400 background
> clients, and 100 foreground pgbench clients, there's noticeable overhead:
> master-optimize:
> tps = 2221.226711 (including connections establishing)
> tps = 31203.259472 (excluding connections establishing)
> dev-optimize:
> tps = 1629.734352 (including connections establishing)
> tps = 4754.449726 (excluding connections establishing)
>
> Now I grant that's a somewhat harsh test for postgres, but I don't
> think it's entirely unreasonable and the performance impact is quite
> stark.

It's not entirely unreasonable, but it *is* mostly unreasonable. I
mean, nobody is going to run 1000 connections in the background that
do nothing but thrash PGXACT on a real system. I just can't get
concerned about that. What I am concerned about is that there may be
other, more realistic workloads that show similar regressions. But I
don't know how to find out whether that's actually the case. On the
IBM POWER box where I tested this, it's not even GetSnapshotData()
that kills you; it's the system CPU scheduler.

The thing about this particular test is that it's artificial -
normally, any operation that wants to modify PGXACT will spend a lot
more time fighting over WALInsertLock and maybe waiting for disk I/O
than is the case here. Of course, with Heikki's WAL scaling patch and
perhaps other optimizations we expect that other overhead to go down,
which might make the problems here more visible; and some of Heikki's
existing testing has shown significant contention around ProcArrayLock
as things stand. But I'm still on the fence about whether this is
really a valid test.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
