Re: MVCC catalog access

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MVCC catalog access
Date: 2013-06-28 02:33:38
Message-ID: CA+TgmoaQW7jo50=VEij=ssTNpQ-qm1CrAqpVkSEULb1UsT7bXg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 20, 2013 at 11:13 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> I actually wasn't thinking of adding it at that level. More like just
> not fetching a new snapshot in systable_* if a) the global ddl counter
> hasn't been increased b) our transactions hasn't performed any
> ddl. Instead use the one used the last time we fetched one for that
> purpose.

All right, so here's a patch that does something along those lines.
We have to always take a new snapshot when somebody scans a catalog
that has no syscache, because there won't be any invalidation messages
to work off of in that case. The only catalog in that category that's
accessed during backend startup (which is mostly what your awful test
case is banging on) is pg_db_role_setting. We could add a syscache
for that catalog or somehow force invalidation messages to be sent
despite the lack of a syscache, but what I chose to do instead is
refactor things slightly so that we use the same snapshot for all four
scans of pg_db_role_setting, instead of taking a new one each time. I
think that's unimpeachable on correctness grounds; it's no different
than if we'd finished all four scans in the time it took us to finish
the first one, and then gotten put to sleep by the OS scheduler for as
long as it took us to scan the other three. Point being that there's
no interlock there.

Anyway, with that change, plus the general mechanism of the patch,
each backend takes just two MVCC scans during startup. The first
catalog access takes an MVCC snapshot for obvious reasons, and then
there's one additional snapshot for the access to pg_db_role_setting
for the reasons stated above. Everything else piggybacks on those
snapshots, unless of course an invalidation intervenes.

In my testing, this did not completely eliminate the performance hit
on your test case, but it got it down to 3-4%. Best of five runs,
with max_connections=2000 and 1000 connections running
readonly-busy.sql:

(patched)
tps = 183.224651 (including connections establishing)
tps = 1091.813178 (excluding connections establishing)
(unpatched)
tps = 190.598045 (including connections establishing)
tps = 1129.422537 (excluding connections establishing)

The difference is 3-4%, which is quite a lot less than what you
measured before, although on different hardware, so results may vary.

Now, I'm not sure this fix actually helps the other test scenarios
very much; for example, it's not going to help the cases that pound on
pg_depend, because that doesn't have a system cache either. As with
pg_db_role_setting, we could optimize this mechanism by sending
invalidation messages for pg_depend changes, but I'm not sure it's
worth it.

I'm also attaching a fixed version of pg_cxn.c; the last version had a few bugs.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
mvcc-catalog-access-v4.patch application/octet-stream 104.6 KB
pg_cxn.c text/x-csrc 2.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2013-06-28 03:54:01 Re: Documentation/help for materialized and recursive views
Previous Message Bruce Momjian 2013-06-28 02:11:38 Re: updated emacs configuration