Re: CLUSTER not lose indexes

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)atentus(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: CLUSTER not lose indexes
Date: 2002-07-05 05:06:38
Message-ID: 200207050506.g6556cZ06097@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Alvaro Herrera wrote:
> Hackers:
>
> I've modified commands/cluster.c so that it recreates the indexes on the
> table after clustering the table. I attach the patch.
>
> There are (of course) things I don't understand. For example, whether
> (or when) I should use CommandCounterIncrement() after each
> index_create, or if I should call setRelhasindex() only once (and not
> once per index); or whether I need to acquire some lock on the indexes.
>
> I tested it with one table and several indexes. Truth is I don't know
> how to test for concurrency, or if it's worth the trouble.
>
> The purpose of this experiment (and, I hope, more to follow) is to
> familiarize myself with the guts of PostgreSQL, so I can work on my CS
> thesis with it. If you can point me my misconceptions I'd be happy to
> try again (and again, and...)

I think Tom was suggesting that you may want to continue work on CLUSTER
and make use of relfilenode. After the cluster, you can just update
pg_class.relfilenode with the new file name (random oid generated at
build time) and as soon as you commit, all backends will start using the
new file and you can delete the old one.

The particular case we would like to improve is this:

/* Destroy old heap (along with its index) and rename new. */
heap_drop_with_catalog(OIDOldHeap, allowSystemTableMods);

CommandCounterIncrement();

renamerel(OIDNewHeap, oldrelation->relname);

In this code, we delete the old relation, then rename the new one. It
would be good to have this all happen in one update of
pg_class.relfilenode; that way it is an atomic operation.

So, create a heap (in the temp namespace so it is deleted on crash),
copy the old heap into the new file in cluster order, and when you are
done, point the old pg_class relfilenode at the new clustered heap
filename, then point the new cluster heap pg_class at the old heap file,
and then drop the cluster heap file; that will remove the _old_ file
(I believe on commit) and you are ready to go.

So, you are basically creating a new heap, but at finish, the new heap's
pg_class and the old heap's file go away. I thought about doing it
without creating the pg_class entry for the new heap, but the code
really wants to have a heap it can manipulate.

Same with index rebuilding, I think. The indexes are built on the new
heap, and the relfilenode swap works just the same.

Let us know if you want to pursue that and we can give you additional
assistance. I would like to see CLUSTER really polished up. More
people are using it now and it really needs someone to focus on it.

Glad to see you working on CLUSTER. Welcome aboard.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2002-07-05 05:34:38 Re: CLUSTER not lose indexes
Previous Message Justin Clift 2002-07-05 04:13:10 Re: Should next release by 8.0 (Was: Re: [GENERAL] I am being

Browse pgsql-patches by date

  From Date Subject
Next Message Alvaro Herrera 2002-07-05 05:34:38 Re: CLUSTER not lose indexes
Previous Message Tom Lane 2002-07-05 03:44:08 Re: CLUSTER not lose indexes