Re: Support for REINDEX CONCURRENTLY

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Support for REINDEX CONCURRENTLY
Date: 2013-06-19 00:50:07
Message-ID: CAB7nPqRUHxArvgR4ZMZBtO3LtA=Nn0Nh35pXJ8P4-syEGHu96Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 19, 2013 at 12:36 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Tue, Jun 18, 2013 at 10:53 AM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> An updated patch for the toast part is attached.
>>
>> On Tue, Jun 18, 2013 at 3:26 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> Here are the review comments of the removal_of_reltoastidxid patch.
>>> I've not completed the review yet, but I'd like to post the current comments
>>> before going to bed ;)
>>>
>>> *** a/src/backend/catalog/system_views.sql
>>> - pg_stat_get_blocks_fetched(X.oid) -
>>> - pg_stat_get_blocks_hit(X.oid) AS tidx_blks_read,
>>> - pg_stat_get_blocks_hit(X.oid) AS tidx_blks_hit
>>> + pg_stat_get_blocks_fetched(X.indrelid) -
>>> + pg_stat_get_blocks_hit(X.indrelid) AS tidx_blks_read,
>>> + pg_stat_get_blocks_hit(X.indrelid) AS tidx_blks_hit
>>>
>>> ISTM that X.indrelid indicates the TOAST table not the TOAST index.
>>> Shouldn't we use X.indexrelid instead of X.indrelid?
>> Indeed good catch! We need in this case the statistics on the index
>> and here I used the table OID. Btw, I also noticed that as multiple
>> indexes may be involved for a given toast relation, it makes sense to
>> actually calculate tidx_blks_read and tidx_blks_hit as the sum of all
>> stats of the indexes.
>
> Yep. You seem to need to change X.indexrelid to X.indrelid in GROUP clause.
> Otherwise, you may get two rows of the same table from pg_statio_all_tables.
I changed it a little bit in a different way in my latest patch by
adding a sum on all the indexes when getting tidx_blks stats.

>>> doc/src/sgml/diskusage.sgml
>>>> There will be one index on the
>>>> <acronym>TOAST</> table, if present.
>
> + table (see <xref linkend="storage-toast">). There will be one valid index
> + on the <acronym>TOAST</> table, if present. There also might be indexes
>
> When I used gdb and tracked the code path of concurrent reindex patch,
> I found it's possible that more than one *valid* toast indexes appear. Those
> multiple valid toast indexes are viewable, for example, from pg_indexes.
> I'm not sure whether this is the bug of concurrent reindex patch. But
> if it's not,
> you seem to need to change the above description again.
Not sure about that. The latest code is made such as only one valid
index is present on the toast relation at the same time.

>
>>> *** a/src/bin/pg_dump/pg_dump.c
>>> + "SELECT c.reltoastrelid, t.indexrelid "
>>> "FROM pg_catalog.pg_class c LEFT JOIN "
>>> - "pg_catalog.pg_class t ON (c.reltoastrelid = t.oid) "
>>> - "WHERE c.oid = '%u'::pg_catalog.oid;",
>>> + "pg_catalog.pg_index t ON (c.reltoastrelid = t.indrelid) "
>>> + "WHERE c.oid = '%u'::pg_catalog.oid AND t.indisvalid "
>>> + "LIMIT 1",
>>>
>>> Is there the case where TOAST table has more than one *valid* indexes?
>> I just rechecked the patch and is answer is no. The concurrent index
>> is set as valid inside the same transaction as swap. So only the
>> backend performing the swap will be able to see two valid toast
>> indexes at the same time.
>
> According to my quick gdb testing, this seems not to be true....
Well, I have to disagree. I am not able to reproduce it. Which version
did you use? Here is what I get with the latest version of REINDEX
CONCURRENTLY patch... I checked with the following process:
1) Create this table:
CREATE TABLE aa (a int, b text);
ALTER TABLE aa ALTER COLUMN b SET STORAGE EXTERNAL;
2) Create session 1 and take a breakpoint on
ReindexRelationConcurrently:indexcmds.c
3) Launch REINDEX TABLE CONCURRENTLY aa
4) With a 2nd session, Go through all the phases of the process and
scanned the validity of toast indexes with the following
ioltas=# select pg_class.relname, indisvalid, indisready from
pg_class, pg_index where pg_class.reltoastrelid = pg_index.indrelid
and pg_class.relname = 'aa';
relname | indisvalid | indisready
---------+------------+------------
aa | t | t
aa | f | t
(2 rows)

When scanning all the phases with the 2nd psql session (the concurrent
index creation, build, validation, swap, and drop of the concurrent
index), I saw at no moment that indisvalid was set at true for the two
indexes at the same time. indisready was of course changed to prepare
the concurrent index to be ready for inserts, but that was all and
this is part of the process.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2013-06-19 00:55:24 Re: Support for REINDEX CONCURRENTLY
Previous Message Jeff Janes 2013-06-18 23:32:01 Re: Memory leak in PL/pgSQL function which CREATE/SELECT/DROP a temporary table