Re: Support for REINDEX CONCURRENTLY

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for REINDEX CONCURRENTLY
Date: 2012-12-19 02:24:55
Message-ID: 20121219022455.GE7666@awork2.anarazel.de
Lists: pgsql-hackers

On 2012-12-17 11:44:00 +0900, Michael Paquier wrote:
> Thanks for all your comments.
> The new version (v5) of this patch fixes the error you found when
> reindexing indexes being referenced in foreign keys.
> The fix is done with switchIndexConstraintOnForeignKey:pg_constraint.c, in
> charge of scanning pg_constraint for foreign keys that refer the parent
> relation (confrelid) of the index being swapped and then switch conindid to
> the new index if the old index was referenced.
> This API also takes care of switching the dependency between the foreign
> key and the old index by calling changeDependencyFor.
> I also added a regression test for this purpose.

Ok. Are there no other dependencies on indexes? I don't know of any
right now, but I have the feeling there were some other cases.

> On Tue, Dec 11, 2012 at 12:28 AM, Andres Freund <andres(at)2ndquadrant(dot)com>wrote:
>
> > Some review comments:
> >
> > * Some of the added !is_reindex in index_create don't seem safe to
> > me.
> >
> This is added to control concurrent index relation for toast indexes. If we
> do not add an additional flag for that it will not be possible to reindex
> concurrently a toast index.

I think some of them were added for cases that didn't seem to be related
to that. I'll recheck in the current version.

> > * Why do we now support reindexing exclusion constraints?
> >
> CREATE INDEX CONCURRENTLY is not supported for exclusive constraints but I
> played around with exclusion constraints with my patch and did not
> particularly see any problems in supporting them as for example index_build
> performs a second scan of the heap when running so it looks enough solid
> for that. Is it because the structure of REINDEX CONCURRENTLY patch is
> different? Honestly I think no so is there something I am not aware of?

I think I asked because you had added an && !is_reindex to one of the
checks.

If I recall correctly the reason why concurrent index builds couldn't
support exclusion constraints - namely that we cannot use them to
check for new row versions while the index is in the ready && !valid
state - that shouldn't be a problem when we have a valid version of an
old index around, because that one enforces everything. It would maybe
need an appropriate if (!isvalid) in the exclusion constraint code, but
that should be it.
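
Roughly something like the following, inside whatever function does the
actual exclusion check (fragment only; the exact spot and the return
convention are assumptions on my part):

    /*
     * While the new index is still marked !indisvalid the old, valid
     * index is the one enforcing the constraint, so the check against
     * the not-yet-valid copy can be skipped.
     */
    if (!indexRelation->rd_index->indisvalid)
        return true;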

> * REINDEX DATABASE .. CONCURRENTLY doesn't work, a variant that does the
> > concurrent reindexing for user-tables and non-concurrent for system
> > tables would be very useful. E.g. for the upgrade from 9.1.5->9.1.6...
> >
> OK. I thought that this was out of scope for the time being. I haven't done
> anything about that yet. Supporting that will not be complicated as
> ReindexRelationsConcurrently (new API) is more flexible now, the only thing
> needed is to gather the list of relations that need to be reindexed.

Imo that so greatly reduces the usability of this patch that you should
treat it as in scope ;). Especially since, as you say, it really
shouldn't be that much work given all the groundwork that's already
built.
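
Gathering the list could look roughly like this (untested sketch, the
function name is made up; system catalogs would keep going through the
plain, non-concurrent reindex path):

#include "postgres.h"
#include "access/heapam.h"
#include "access/htup_details.h"
#include "catalog/catalog.h"
#include "catalog/pg_class.h"
#include "nodes/pg_list.h"
#include "utils/tqual.h"

/* Collect the OIDs of all plain user tables in the current database */
static List *
get_tables_for_concurrent_reindex(void)
{
    List       *result = NIL;
    Relation    classRel = heap_open(RelationRelationId, AccessShareLock);
    HeapScanDesc scan = heap_beginscan(classRel, SnapshotNow, 0, NULL);
    HeapTuple   tuple;

    while (HeapTupleIsValid(tuple = heap_getnext(scan, ForwardScanDirection)))
    {
        Form_pg_class classForm = (Form_pg_class) GETSTRUCT(tuple);

        /* skip system catalogs, they get the non-concurrent treatment */
        if (classForm->relkind == RELKIND_RELATION &&
            !IsSystemClass(classForm))
            result = lappend_oid(result, HeapTupleGetOid(tuple));
    }

    heap_endscan(scan);
    heap_close(classRel, AccessShareLock);

    return result;
}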

> > * would be nice (but thats probably a step #2 thing) to do the
> > individual steps of concurrent reindex over multiple relations to
> > avoid too much overall waiting for other transactions.
> >
> I think I did that by now using one transaction per index for each
> operation except the drop phase...

Without yet having read the new version, I think that's not what I
meant. There currently is a wait for concurrent transactions to end
after most of the phases for every relation, right? If you have a busy
database with somewhat long-running transactions, that's going to slow
everything down quite a bit with waiting. I wondered whether it would
make sense to do PHASE 1 for all indexes in all relations, then wait
once, then PHASE 2...
That obviously has some space and index-maintenance overhead issues,
but it's probably sensible anyway in many cases.
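
I.e. restructure the control flow roughly like this (sketch only, the
per-phase helper names are illustrative, not necessarily what the patch
calls them):

    /* run one phase over every index of every relation first ... */
    foreach(lc, indexIds)
        index_concurrent_build(lfirst_oid(lc));

    /* ... then wait a single time for all involved parent tables ... */
    foreach(lc, lockTags)
        WaitForVirtualLocks(*(LOCKTAG *) lfirst(lc), ShareLock);

    /* ... before moving on to the next phase for all of them */
    foreach(lc, indexIds)
        index_concurrent_validate(lfirst_oid(lc));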

> > * PHASE 6 should acquire exlusive locks on the indexes
> >
> The necessary lock is taken when calling index_drop through
> performMultipleDeletion. Do you think it is not enough and that i should
> add an Exclusive lock inside index_concurrent_drop?

It seems safer to acquire it earlier; otherwise the likelihood of
deadlocks is slightly higher because you are increasing the lock
severity. And it shouldn't cause any disadvantages, so ...
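
I.e. something like this before building the deletion list (sketch):

    /*
     * Grab the exclusive locks on the dead indexes up front instead of
     * letting the deletion machinery escalate later on; the locks are
     * held until end of transaction anyway.
     */
    foreach(lc, indexIds)
        LockRelationOid(lfirst_oid(lc), AccessExclusiveLock);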

Starts to look really nice now!

Isn't the following block content that's mostly available somewhere
else already?

> + <refsect2 id="SQL-REINDEX-CONCURRENTLY">
> + <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
> +
> + <indexterm zone="SQL-REINDEX-CONCURRENTLY">
> + <primary>index</primary>
> + <secondary>rebuilding concurrently</secondary>
> + </indexterm>
> +
> + <para>
> + Rebuilding an index can interfere with regular operation of a database.
> + Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
> + against writes and performs the entire index build with a single scan of the
> + table. Other transactions can still read the table, but if they try to
> + insert, update, or delete rows in the table they will block until the
> + index rebuild is finished. This could have a severe effect if the system is
> + a live production database. Very large tables can take many hours to be
> + indexed, and even for smaller tables, an index rebuild can lock out writers
> + for periods that are unacceptably long for a production system.
> + </para>
...
> + <para>
> + Regular index builds permit other regular index builds on the
> + same table to occur in parallel, but only one concurrent index build
> + can occur on a table at a time. In both cases, no other types of schema
> + modification on the table are allowed meanwhile. Another difference
> + is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
> + command can be performed within a transaction block, but
> + <command>REINDEX CONCURRENTLY</> cannot. <command>REINDEX DATABASE</> is
> + by default not allowed to run inside a transaction block, so in this case
> + <command>CONCURRENTLY</> is not supported.
> + </para>
> +

> - if (concurrent && is_exclusion)
> + if (concurrent && is_exclusion && !is_reindex)
> ereport(ERROR,
> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> errmsg_internal("concurrent index creation for exclusion constraints is not supported")));

This is what I referred to above wrt reindex and CONCURRENTLY. We
shouldn't pass concurrent = true if we don't deem it to be safe for
exclusion constraints.

> +/*
> + * index_concurrent_drop
> + *
> + * Drop a list of indexes as the last step of a concurrent process. Deletion is
> + * done through performDeletion or dependencies of the index are not dropped.
> + * At this point all the indexes are already considered as invalid and dead so
> + * they can be dropped without using any concurrent options.
> + */
> +void
> +index_concurrent_drop(List *indexIds)
> +{
> + ListCell *lc;
> + ObjectAddresses *objects = new_object_addresses();
> +
> + Assert(indexIds != NIL);
> +
> + /* Scan the list of indexes and build object list for normal indexes */
> + foreach(lc, indexIds)
> + {
> + Oid indexOid = lfirst_oid(lc);
> + Oid constraintOid = get_index_constraint(indexOid);
> + ObjectAddress object;
> +
> + /* Register constraint or index for drop */
> + if (OidIsValid(constraintOid))
> + {
> + object.classId = ConstraintRelationId;
> + object.objectId = constraintOid;
> + }
> + else
> + {
> + object.classId = RelationRelationId;
> + object.objectId = indexOid;
> + }
> +
> + object.objectSubId = 0;
> +
> + /* Add object to list */
> + add_exact_object_address(&object, objects);
> + }
> +
> + /* Perform deletion for normal and toast indexes */
> + performMultipleDeletions(objects,
> + DROP_RESTRICT,
> + 0);
> +}

Just for a warm and fuzzy feeling I think it would be a good idea to
recheck that the indexes are !indislive here.
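
Something along these lines inside the foreach() would do (sketch,
going through the syscache):

    HeapTuple   indexTuple;
    Form_pg_index indexForm;

    indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid));
    if (!HeapTupleIsValid(indexTuple))
        elog(ERROR, "cache lookup failed for index %u", indexOid);
    indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
    Assert(!indexForm->indislive);
    ReleaseSysCache(indexTuple);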

> diff --git a/src/backend/catalog/pg_constraint.c b/src/backend/catalog/pg_constraint.c
> index 5e8c6da..55c092d 100644
> +
> +/*
> + * switchIndexConstraintOnForeignKey
> + *
> + * Switch foreign keys references for a given index to a new index created
> + * concurrently. This process is used when swapping indexes for a concurrent
> + * process. All the constraints that are not referenced externally like primary
> + * keys or unique indexes should be switched using the structure of index.c for
> + * concurrent index creation and drop.
> + * This function takes care of also switching the dependencies of the foreign
> + * key from the old index to the new index in pg_depend.
> + *
> + * In order to complete this process, the following process is done:
> + * 1) Scan pg_constraint and extract the list of foreign keys that refer to the
> + * parent relation of the index being swapped as conrelid.
> + * 2) Check in this list the foreign keys that use the old index as reference
> + * here with conindid
> + * 3) Update field conindid to the new index Oid on all the foreign keys
> + * 4) Switch dependencies of the foreign key to the new index
> + */
> +void
> +switchIndexConstraintOnForeignKey(Oid parentOid,
> + Oid oldIndexOid,
> + Oid newIndexOid)
> +{
> + ScanKeyData skey[1];
> + SysScanDesc conscan;
> + Relation conRel;
> + HeapTuple htup;
> +
> + /*
> + * Search pg_constraint for the foreign key constraints associated
> + * with the index by scanning using conrelid.
> + */
> + ScanKeyInit(&skey[0],
> + Anum_pg_constraint_confrelid,
> + BTEqualStrategyNumber, F_OIDEQ,
> + ObjectIdGetDatum(parentOid));
> +
> + conRel = heap_open(ConstraintRelationId, AccessShareLock);
> + conscan = systable_beginscan(conRel, ConstraintForeignRelidIndexId,
> + true, SnapshotNow, 1, skey);
> +
> + while (HeapTupleIsValid(htup = systable_getnext(conscan)))
> + {
> + Form_pg_constraint contuple = (Form_pg_constraint) GETSTRUCT(htup);
> +
> + /* Check if a foreign constraint uses the index being swapped */
> + if (contuple->contype == CONSTRAINT_FOREIGN &&
> + contuple->confrelid == parentOid &&
> + contuple->conindid == oldIndexOid)
> + {
> + /* Found an index, so update its pg_constraint entry */
> + contuple->conindid = newIndexOid;
> + /* And write it back in place */
> + heap_inplace_update(conRel, htup);

I am pretty doubtful that using heap_inplace_update is the correct
thing to do here. What if we fail later? Even if there's some
justification for it being safe, it deserves a big comment.

The other cases where heap_inplace_update is used in the context of
CONCURRENTLY are pretty careful about where to do it and have special
state flags indicating that this has been done...
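
If a plain transactional update actually is fine here - which I'd
prefer - it would be something like the following (sketch; note that
pg_constraint would then need to be opened with RowExclusiveLock
instead of AccessShareLock):

    HeapTuple   copyTuple = heap_copytuple(htup);
    Form_pg_constraint copy_con = (Form_pg_constraint) GETSTRUCT(copyTuple);

    copy_con->conindid = newIndexOid;
    simple_heap_update(conRel, &copyTuple->t_self, copyTuple);
    CatalogUpdateIndexes(conRel, copyTuple);
    heap_freetuple(copyTuple);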

>
> +bool
> +ReindexRelationsConcurrently(List *relationIds)
> +{
> + foreach(lc, relationIds)
> + {
> + Oid relationOid = lfirst_oid(lc);
> +
> + switch (get_rel_relkind(relationOid))
> + {
> + case RELKIND_RELATION:
> + {
> + /*
> + * In the case of a relation, find all its indexes
> + * including toast indexes.
> + */
> + Relation heapRelation = heap_open(relationOid,
> + ShareUpdateExclusiveLock);
> +
> + /* Relation on which is based index cannot be shared */
> + if (heapRelation->rd_rel->relisshared)
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("concurrent reindex is not supported for shared relations")));
> +
> + /* Add all the valid indexes of relation to list */
> + foreach(lc2, RelationGetIndexList(heapRelation))
> + {
> + Oid cellOid = lfirst_oid(lc2);
> + Relation indexRelation = index_open(cellOid,
> + ShareUpdateExclusiveLock);
> +
> + if (!indexRelation->rd_index->indisvalid)
> + ereport(WARNING,
> + (errcode(ERRCODE_INDEX_CORRUPTED),
> + errmsg("cannot reindex concurrently invalid index \"%s.%s\", bypassing",
> + get_namespace_name(get_rel_namespace(cellOid)),
> + get_rel_name(cellOid))));
> + else
> + indexIds = list_append_unique_oid(indexIds,
> + cellOid);
> +
> + index_close(indexRelation, ShareUpdateExclusiveLock);
> + }

Why are we releasing the locks here if we are going to reindex the
relations? They might change in between. I think we should take an
appropriate lock here, including the locks on the parent relations.
Yes, it's slightly more duplicative code, and not acquiring locks
multiple times is somewhat complicated, but I think it's required.

I think you should also explicitly do the above in a transaction...
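
I.e. keep what you just locked (sketch):

    /* inside the loop over RelationGetIndexList(): */
    index_close(indexRelation, NoLock); /* retains ShareUpdateExclusiveLock */

    /* and at the end of the RELKIND_RELATION case: */
    heap_close(heapRelation, NoLock);   /* keep the parent's lock as well */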

> + /*
> + * Phase 2 of REINDEX CONCURRENTLY
> + *
> + * Build concurrent indexes in a separate transaction for each index to
> + * avoid having open transactions for an unnecessary long time. We also
> + * need to wait until no running transactions could have the parent table
> + * of index open. A concurrent build is done for each concurrent
> + * index that will replace the old indexes.
> + */
> +
> + /* Get the first element of concurrent index list */
> + lc2 = list_head(concurrentIndexIds);
> +
> + foreach(lc, indexIds)
> + {
> + Relation indexRel;
> + Oid indOid = lfirst_oid(lc);
> + Oid concurrentOid = lfirst_oid(lc2);
> + Oid relOid;
> + bool primary;
> + LOCKTAG *heapLockTag = NULL;
> + ListCell *cell;
> +
> + /* Move to next concurrent item */
> + lc2 = lnext(lc2);
> +
> + /* Start new transaction for this index concurrent build */
> + StartTransactionCommand();
> +
> + /* Get the parent relation Oid */
> + relOid = IndexGetRelation(indOid, false);
> +
> + /*
> + * Find the locktag of parent table for this index, we need to wait for
> + * locks on it.
> + */
> + foreach(cell, lockTags)
> + {
> + LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
> + if (relOid == localTag->locktag_field2)
> + heapLockTag = localTag;
> + }
> +
> + Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
> + WaitForVirtualLocks(*heapLockTag, ShareLock);

Why do we have to do the WaitForVirtualLocks here? Shouldn't we do this
once for all relations after each phase? Otherwise the waiting time will
really start to hit when you do this on a somewhat busy server.

> + /*
> + * Invalidate the relcache for the table, so that after this commit all
> + * sessions will refresh any cached plans taht might reference the index.
> + */
> + CacheInvalidateRelcacheByRelid(relOid);

I am not sure whether I suggested adding a
CacheInvalidateRelcacheByRelid here, but afaics it's not required yet -
the new index isn't valid at this point, so there is no need for
replanning.

> + indexRel = index_open(indOid, ShareUpdateExclusiveLock);

I wonder whether we should directly open it exclusively here, given
it's going to be opened exclusively in a bit anyway. Not that that will
really reduce the deadlock likelihood, since we already hold the
ShareUpdateExclusiveLock in session mode ...

> + /*
> + * Phase 5 of REINDEX CONCURRENTLY
> + *
> + * The old indexes need to be marked as not ready. We need also to wait for
> + * transactions that might use them. Each operation is performed with a
> + * separate transaction.
> + */
> +
> + /* Mark the old indexes as not ready */
> + foreach(lc, indexIds)
> + {
> + LOCKTAG *heapLockTag;
> + Oid indOid = lfirst_oid(lc);
> + Oid relOid;
> +
> + StartTransactionCommand();
> + relOid = IndexGetRelation(indOid, false);
> +
> + /*
> + * Find the locktag of parent table for this index, we need to wait for
> + * locks on it.
> + */
> + foreach(lc2, lockTags)
> + {
> + LOCKTAG *localTag = (LOCKTAG *) lfirst(lc2);
> + if (relOid == localTag->locktag_field2)
> + heapLockTag = localTag;
> + }
> +
> + Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
> +
> + /* Finish the index invalidation and set it as dead */
> + index_concurrent_set_dead(indOid, relOid, *heapLockTag);
> +
> + /* Commit this transaction to make the update visible. */
> + CommitTransactionCommand();
> + }

No waiting here?

> + StartTransactionCommand();
> +
> + /* Get fresh snapshot for next step */
> + PushActiveSnapshot(GetTransactionSnapshot());
> +
> + /*
> + * Phase 6 of REINDEX CONCURRENTLY
> + *
> + * Drop the old indexes. This needs to be done through performDeletion
> + * or related dependencies will not be dropped for the old indexes. The
> + * internal mechanism of DROP INDEX CONCURRENTLY is not used as here the
> + * indexes are already considered as dead and invalid, so they will not
> + * be used by other backends.
> + */
> + index_concurrent_drop(indexIds);
> +
> + /*
> + * Last thing to do is release the session-level lock on the parent table
> + * and the indexes of table.
> + */
> + foreach(lc, relationLocks)
> + {
> + LockRelId lockRel = * (LockRelId *) lfirst(lc);
> + UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
> + }
> +
> + /* We can do away with our snapshot */
> + PopActiveSnapshot();

I think I would do the drop in individual transactions as well.
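
Roughly (sketch, assuming index_concurrent_drop() is happy with a
one-element list):

    foreach(lc, indexIds)
    {
        StartTransactionCommand();
        index_concurrent_drop(list_make1_oid(lfirst_oid(lc)));
        CommitTransactionCommand();
    }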

More at another time, shouldn't have started doing this now...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
