Re: Support for REINDEX CONCURRENTLY

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Support for REINDEX CONCURRENTLY
Date: 2013-06-18 01:53:25
Message-ID: CAB7nPqS0RoKj1TTMbivfVJmxTXcDRzgvHW8C3vZnBSqLiMSDRQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

An updated patch for the toast part is attached.

On Tue, Jun 18, 2013 at 3:26 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> Here are the review comments of the removal_of_reltoastidxid patch.
> I've not completed the review yet, but I'd like to post the current comments
> before going to bed ;)
>
> *** a/src/backend/catalog/system_views.sql
> - pg_stat_get_blocks_fetched(X.oid) -
> - pg_stat_get_blocks_hit(X.oid) AS tidx_blks_read,
> - pg_stat_get_blocks_hit(X.oid) AS tidx_blks_hit
> + pg_stat_get_blocks_fetched(X.indrelid) -
> + pg_stat_get_blocks_hit(X.indrelid) AS tidx_blks_read,
> + pg_stat_get_blocks_hit(X.indrelid) AS tidx_blks_hit
>
> ISTM that X.indrelid indicates the TOAST table not the TOAST index.
> Shouldn't we use X.indexrelid instead of X.indrelid?
Indeed good catch! We need in this case the statistics on the index
and here I used the table OID. Btw, I also noticed that as multiple
indexes may be involved for a given toast relation, it makes sense to
actually calculate tidx_blks_read and tidx_blks_hit as the sum of all
stats of the indexes.

> You changed some SQLs because of removal of reltoastidxid.
> Could you check that the original SQL and changed one return
> the same value, again?
Sure, here are some results I am getting for pg_statio_all_tables with
a simple example to get stats on a table that has a toast relation.

With patch (after correcting to indexrelid and defining stats as a sum):
ioltas=# select relname, toast_blks_hit, tidx_blks_read from
pg_statio_all_tables where relname ='aa';
relname | toast_blks_hit | tidx_blks_read
---------+----------------+----------------
aa | 433313 | 829
(1 row)

With master:
relname | toast_blks_hit | tidx_blks_read
---------+----------------+----------------
aa | 433313 | 829
(1 row)

So the results are the same.

>
> doc/src/sgml/diskusage.sgml
>> There will be one index on the
>> <acronym>TOAST</> table, if present.
>
> I'm not sure if multiple indexes on TOAST table are viewable by a user.
> If it's viewable, we need to correct the above description.
AFAIK, toast indexes are not directly visible to the user.
ioltas=# \d aa
Table "public.aa"
Column | Type | Modifiers
--------+---------+-----------
a | integer |
b | text |
ioltas=# select l.relname from pg_class c join pg_class l on
(c.reltoastrelid = l.oid) where c.relname = 'aa';
relname
----------------
pg_toast_16386
(1 row)

However you can still query the schema pg_toast to get details about a
toast relation.
ioltas=# \d pg_toast.pg_toast_16386_index
Index "pg_toast.pg_toast_16386_index"
Column | Type | Definition
-----------+---------+------------
chunk_id | oid | chunk_id
chunk_seq | integer | chunk_seq
primary key, btree, for table "pg_toast.pg_toast_16386"

>
> doc/src/sgml/monitoring.sgml
>> <entry><structfield>tidx_blks_read</></entry>
>> <entry><type>bigint</></entry>
>> <entry>Number of disk blocks read from this table's TOAST table index (if any)</entry>
>> </row>
>> <row>
>> <entry><structfield>tidx_blks_hit</></entry>
>> <entry><type>bigint</></entry>
>> <entry>Number of buffer hits in this table's TOAST table index (if any)</entry>
>
> For the same reason as the above, we need to change "index" to "indexes"
> in these descriptions?
Yes it makes sense. Changed it this way. After some more search with
grep, I haven't noticed any other places where it would be necessary
to correct the docs.

>
> *** a/src/bin/pg_dump/pg_dump.c
> + "SELECT c.reltoastrelid, t.indexrelid "
> "FROM pg_catalog.pg_class c LEFT JOIN "
> - "pg_catalog.pg_class t ON (c.reltoastrelid = t.oid) "
> - "WHERE c.oid = '%u'::pg_catalog.oid;",
> + "pg_catalog.pg_index t ON (c.reltoastrelid = t.indrelid) "
> + "WHERE c.oid = '%u'::pg_catalog.oid AND t.indisvalid "
> + "LIMIT 1",
>
> Is there the case where TOAST table has more than one *valid* indexes?
I just rechecked the patch and is answer is no. The concurrent index
is set as valid inside the same transaction as swap. So only the
backend performing the swap will be able to see two valid toast
indexes at the same time.

> If yes, is it really okay to choose just one index by using LIMIT 1?
> If no, i.e., TOAST table should have only one valid index, we should get rid
> of LIMIT 1 and check that only one row is returned from this query.
> Fortunately, ISTM this check has been already done by the subsequent
> call of ExecuteSqlQueryForSingleRow(). Thought?
Hum, this is debatable, but for simplicity of pg_dump code, let's
remove it this LIMIT clause and rely on the assumption that a toast
relation can only have one valid index at a given moment.
--
Michael

Attachment Content-Type Size
toast_long.sql application/octet-stream 598 bytes
20130617_1_remove_reltoastidxid_v10.patch application/octet-stream 45.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2013-06-18 02:33:06 Re: Patch to add support of "IF NOT EXISTS" to others "CREATE" statements
Previous Message Robins Tharakan 2013-06-18 00:33:35 Re: Add regression tests for SET xxx