Re: REINDEX CONCURRENTLY 2.0

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REINDEX CONCURRENTLY 2.0
Date: 2014-10-30 21:44:49
Message-ID: 5452B151.60206@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 10/30/14, 3:19 AM, Michael Paquier wrote:
> Thanks for your input, Jim!
>
> On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>> wrote:
> > Patch applies against current HEAD and builds, but I'm getting 37 failed
> > tests (mostly parallel, but also misc and WITH; results attached). Is that
> > expected?
> This is caused by the recent commit 7b1c2a0 (that I actually participated in reviewing :p) because of a missing inclusion of ruleutils.h in index.c.

> > The "mark the concurrent index" bit is rather confusing; it sounds like it's
> > referring to the new index instead of the old. Now that I've read the code I
> > understand what's going on here between the concurrent index *entry* and the
> > filenode swap, but I don't think the docs make this sufficiently clear to
> > users.
> >
> > How about something like this:
> >
> > The following steps occur in a concurrent index build, each in a separate
> > transaction. Note that if there are multiple indexes to be rebuilt then each
> > step loops through all the indexes we're rebuilding, using a separate
> > transaction for each one.
> >
> > 1. [blah]
> Definitely a good idea! I took your text and made it more precise, listing the actions done for each step, the pg_index flags switched, using <orderedlist> to make the list of steps described in a separate paragraph more exhaustive for the user. At the same time I reworked the docs removing a part that was somewhat duplicated about dealing with the constraints having invalid index entries and how to drop them.

Awesome! Just a few items here:

+ Then a second pass is performed to add tuples that were added while
+ the first pass build was running. One the validation of the index

s/One the/Once the/

> > + * index_concurrent_create
> > + *
> > + * Create an index based on the given one that will be used for concurrent
> > + * operations. The index is inserted into catalogs and needs to be built
> > later
> > + * on. This is called during concurrent index processing. The heap relation
> > + * on which is based the index needs to be closed by the caller.
> >
> > Last bit presumably should be "on which the index is based".
> What about "Create a concurrent index based on the definition of the one provided by caller"?

That's good too, but my comment was on the last sentence, not the first.

> > + /* Build the list of column names, necessary for index_create */
> > Instead of all this work wouldn't it be easier to create a version of
> > index_create/ConstructTupleDescriptor that will use the IndexInfo for the
> > old index? ISTM index_concurrent_create() is doing a heck of a lot of work
> > to marshal data into one form just to have it get marshaled yet again. Worst
> > case, if we do have to play this game, there should be a stand-alone
> > function to get the columns/expressions for an existing index; you're
> > duplicating a lot of code from pg_get_indexdef_worker().
> Yes, this definitely sucks and the approach creating a function to get all the column names is not productive as well. Then let's define an additional argument in index_create to pass a potential TupleDesc instead of this whole wart. I noticed as well that we need to actually reset attcacheoff to be able to use a fresh version of the tuple descriptor of the old index. I added a small API for this purpose in tupdesc.h called ResetTupleDescCache. Would it make sense instead to extend CreateTupleDescCopyConstr or CreateTupleDescCopy with a boolean flag?

Perhaps there'd be other places that would want to reset the stats, so I lean slightly that direction.

The comment at the beginning of index_create needs to be modified to mention tupdesc and especially that setting tupdesc over-rides indexColNames.

> > index_concurrent_swap(): Perhaps it'd be better to create
> > index_concurrent_swap_setup() and index_concurrent_swap_cleanup() and
> > refactor the duplicated code out... the actual function would then become:
> This sentence is not finished :) IMO, index_concurrent_swap looks good as is, taking as arguments the index and its concurrent entry, and swapping their relfilenode after taking AccessExclusiveLock that will be hold until the end of this transaction.

Fair enough.

> > ReindexRelationConcurrently()
> >
> > + * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
> > + * either an index or a table. If a table is specified, each step of
> > REINDEX
> > + * CONCURRENTLY is done in parallel with all the table's indexes as well as
> > + * its dependent toast indexes.
> > This comment is a bit misleading; we're not actually doing anything in
> > parallel, right? AFAICT index_concurrent_build is going to block while each
> > index is built the first time.
> Yes, parallel may be misleading. What is meant here is that each step of the process is done one by one on all the valid indexes a table may have.

New comment looks good.

> > + * relkind is an index, this index itself will be rebuilt. The locks
> > taken
> > + * parent relations and involved indexes are kept until this
> > transaction
> > + * is committed to protect against schema changes that might occur
> > until
> > + * the session lock is taken on each relation.
> >
> > This comment is a bit unclear to me... at minimum I think it should be "* on
> > parent relations" instead of "* parent relations", but I think it needs to
> > elaborate on why/when we're also taking session level locks.
> Hum, done as follows:
> @@ -896,9 +896,11 @@ ReindexRelationConcurrently(Oid relationOid)
> * If the relkind of given relation Oid is a table, all its valid indexes
> * will be rebuilt, including its associated toast table indexes. If
> * relkind is an index, this index itself will be rebuilt. The locks taken
> - * parent relations and involved indexes are kept until this transaction
> + * on parent relations and involved indexes are kept until this transaction
> * is committed to protect against schema changes that might occur until
> - * the session lock is taken on each relation.
> + * the session lock is taken on each relation, session lock used to
> + * similarly protect from any schema change that could happen within the
> + * multiple transactions that are used during this process.
> */

Cool.

> OK, what about that then:
> /*
> - * This case is currently not supported, but there's no way to ask for it
> - * in the grammar anyway, so it can't happen.
> + * This case is currently only supported during a concurrent index
> + * rebuild, but there is no way to ask for it in the grammar otherwise
> + * anyway. If support for exclusion constraints is added in the future,
> + * the check similar to this one in check_exclusion_constraint should as
> + * well be changed accordingly.
>
> Updated patch is attached.

Works for me.

Keep in mind I'm not super familiar with the guts of index creation, so it'd be good for someone else to look at that bit (especially index_concurrent_create and ReindexRelationConcurrently).
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2014-10-30 21:45:20 Re: REINDEX CONCURRENTLY 2.0
Previous Message Tomas Vondra 2014-10-30 21:44:27 SET TABLESPACE ... NOWAIT