Re: MVCC catalog access

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MVCC catalog access
Date: 2013-06-06 09:30:19
Message-ID: 20130606093019.GI28067@alap2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Robert,

Took a quick look through the patch to understand what your current
revision is actually doing and to facilitate thinking about possible
pain points.

Here are the notes I made during my reading:

On 2013-06-03 14:57:12 -0400, Robert Haas wrote:
> +++ b/src/backend/catalog/catalog.c
> @@ -232,6 +232,10 @@ IsReservedName(const char *name)
> * know if it's shared. Fortunately, the set of shared relations is
> * fairly static, so a hand-maintained list of their OIDs isn't completely
> * impractical.
> + *
> + * XXX: Now that we have MVCC catalog access, the reasoning above is no longer
> + * true. Are there other good reasons to hard-code this, or should we revisit
> + * that decision?
> */

We could just the function by looking in the shared
relmapper. Everything that can be mapped via it is shared.

> --- a/src/backend/commands/cluster.c
> +++ b/src/backend/commands/cluster.c
> @@ -480,6 +480,11 @@ check_index_is_clusterable(Relation OldHeap, Oid indexOid, bool recheck, LOCKMOD
> * against concurrent SnapshotNow scans of pg_index. Therefore this is unsafe
> * to execute with less than full exclusive lock on the parent table;
> * otherwise concurrent executions of RelationGetIndexList could miss indexes.
> + *
> + * XXX: Now that we have MVCC catalog access, SnapshotNow scans of pg_index
> + * shouldn't be common enough to worry about. The above comment needs
> + * to be updated, and it may be possible to simplify the logic here in other
> + * ways also.
> */

You're right, the comment needs to be changed, but I don't think the
effect can. A non-inplace upgrade changes the xmin of the row which is
relevant for indcheckxmin.
(In fact, isn't this update possibly causing problems like delaying the
use of such an index already)

> --- a/src/backend/commands/tablecmds.c
> +++ b/src/backend/commands/tablecmds.c
> @@ -2738,7 +2738,7 @@ AlterTableGetLockLevel(List *cmds)
> * multiple DDL operations occur in a stream against frequently accessed
> * tables.
> *
> - * 1. Catalog tables are read using SnapshotNow, which has a race bug that
> + * 1. Catalog tables were read using SnapshotNow, which has a race bug that

Heh.

> --- a/src/backend/utils/time/snapmgr.c
> +++ b/src/backend/utils/time/snapmgr.c
> @@ -207,6 +207,19 @@ GetLatestSnapshot(void)
> /*
> + * GetInstantSnapshot
> + * Get a snapshot that is up-to-date as of the current instant,
> + * but don't set the transaction snapshot.
> + */
> +Snapshot
> +GetInstantSnapshot(void)
> +{
> + SecondarySnapshot = GetSnapshotData(&SecondarySnapshotData);
> +
> + return SecondarySnapshot;
> +}

Hm. Looks like this should also change the description of SecondarySnapshot:

/*
* CurrentSnapshot points to the only snapshot taken in transaction-snapshot
* mode, and to the latest one taken in a read-committed transaction.
* SecondarySnapshot is a snapshot that's always up-to-date as of the current
* instant, even in transaction-snapshot mode. It should only be used for
* special-purpose code (say, RI checking.)
*
and
/*
* Checking SecondarySnapshot is probably useless here, but it seems
* better to be sure.
*/

Also looks like BuildEventTriggerCache() in evtcache.c should use
GetInstanSnapshot() now.

I actually wonder if we shouldn't just abolish GetLatestSnapshot(). None
of the callers seem to rely on it's behaviour from a quick look and it
seems rather confusing to have both.

> --- a/src/bin/pg_dump/pg_dump.c
> +++ b/src/bin/pg_dump/pg_dump.c
> @@ -14,13 +14,13 @@
> * Note that pg_dump runs in a transaction-snapshot mode transaction,
> * so it sees a consistent snapshot of the database including system
> * catalogs. However, it relies in part on various specialized backend
> - * functions like pg_get_indexdef(), and those things tend to run on
> - * SnapshotNow time, ie they look at the currently committed state. So
> - * it is possible to get 'cache lookup failed' error if someone
> - * performs DDL changes while a dump is happening. The window for this
> - * sort of thing is from the acquisition of the transaction snapshot to
> - * getSchemaData() (when pg_dump acquires AccessShareLock on every
> - * table it intends to dump). It isn't very large, but it can happen.
> + * functions like pg_get_indexdef(), and those things tend to look at
> + * the currently committed state. So it is possible to get 'cache
> + * lookup failed' error if someone performs DDL changes while a dump is
> + * happening. The window for this sort of thing is from the acquisition
> + * of the transaction snapshot to getSchemaData() (when pg_dump acquires
> + * AccessShareLock on every table it intends to dump). It isn't very large,
> + * but it can happen.

I think we need another term for what SnapshotNow used to express
here... Imo this description got less clear with this change.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitriy Igrishin 2013-06-06 09:39:24 Re: About large objects asynchronous and non-blocking support
Previous Message Joshua D. Drake 2013-06-06 08:42:42 Re: Redesigning checkpoint_segments