Re: Ignore invalid indexes in pg_dump

Lists: pgsql-hackers
From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Ignore invalid indexes in pg_dump
Date: 2013-03-20 02:51:05
Message-ID: CAB7nPqTTTAcQpagm7-7Re1=2auOnBRgBbJnn85vW0QETU0Mr=w@mail.gmail.com

Hi,

If CREATE INDEX CONCURRENTLY fails, the system is left with invalid
indexes. I don't think users would want the invalid indexes of an
existing system to be recreated as valid indexes after a restore.
So why not exclude invalid indexes from a dump, with something like
the attached patch?
Perhaps this should be applied to pg_dump for server versions back to
8.2, where CREATE INDEX CONCURRENTLY was introduced?
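
The idea can be sketched as a filter on the indisvalid flag of pg_index. This is only a minimal Python illustration of the proposal, not the attached patch; the IndexInfo record, the dumpable_indexes helper and the index names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexInfo:
    """Hypothetical stand-in for one pg_index row plus the index name."""
    name: str
    indisvalid: bool


def dumpable_indexes(indexes):
    """Keep only the indexes whose indisvalid flag is set."""
    return [ix for ix in indexes if ix.indisvalid]


catalog = [
    IndexInfo("orders_pkey", True),
    # e.g. left behind by a failed CREATE INDEX CONCURRENTLY
    IndexInfo("orders_customer_idx", False),
]

print([ix.name for ix in dumpable_indexes(catalog)])
```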

I noticed some recent discussions about that:
http://www.postgresql.org/message-id/20121207141236.GB4699@alvh.no-ip.org
In this case the problem has been fixed in pg_upgrade directly.

--
Michael

Attachment Content-Type Size
20130317_dump_only_valid_index.patch application/octet-stream 484 bytes

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Ignore invalid indexes in pg_dump
Date: 2013-03-20 09:00:05
Message-ID: CA+U5nMLe+qMQDx-=Ef4WS68QW7RGK4Sokk1jFNzLV6rPaU0XJg@mail.gmail.com

On 20 March 2013 02:51, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:

> If CREATE INDEX CONCURRENTLY fails, the system is left with invalid
> indexes. I don't think users would want the invalid indexes of an
> existing system to be recreated as valid indexes after a restore.
> So why not exclude invalid indexes from a dump, with something like
> the attached patch?
> Perhaps this should be applied to pg_dump for server versions back to
> 8.2, where CREATE INDEX CONCURRENTLY was introduced?

Invalid also means currently-in-progress, so it would be better to keep them in.

> I noticed some recent discussions about that:
> http://www.postgresql.org/message-id/20121207141236.GB4699@alvh.no-ip.org
> In this case the problem has been fixed in pg_upgrade directly.

That is valid because the index build is clearly not in progress.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Josh Kupershmidt <schmiddy(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Ignore invalid indexes in pg_dump
Date: 2013-03-20 15:38:58
Message-ID: CAK3UJRHZCWJiBKiGc8THpUfQ0t78D2j1Q160bQLkZxZynakoug@mail.gmail.com

On Wed, Mar 20, 2013 at 2:00 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 20 March 2013 02:51, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>
>> If CREATE INDEX CONCURRENTLY fails, the system is left with invalid
>> indexes. I don't think users would want the invalid indexes of an
>> existing system to be recreated as valid indexes after a restore.
>> So why not exclude invalid indexes from a dump, with something like
>> the attached patch?
>> Perhaps this should be applied to pg_dump for server versions back to
>> 8.2, where CREATE INDEX CONCURRENTLY was introduced?
>
> Invalid also means currently-in-progress, so it would be better to keep them in.

For invalid indexes which are left hanging around in the database, if
the index definition is included by pg_dump, it will likely cause pain
during the restore. If the index build failed the first time and
hasn't been manually dropped and recreated since then, it's a good bet
it will fail the next time. Errors during restore can be more than
just a nuisance; consider restores with --single-transaction.
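
To make the --single-transaction point concrete, here is a small Python sketch. It uses an in-memory SQLite database purely as a stand-in for PostgreSQL, and the restore helper and the statement list are hypothetical. It replays a script whose last statement (a unique index over duplicate values) fails, once statement by statement and once inside a single transaction.

```python
import sqlite3

# Hypothetical "dump": the unique index cannot be built because of the
# duplicate rows, much like an index build that failed on the source system.
STATEMENTS = [
    "CREATE TABLE t (a INTEGER)",
    "INSERT INTO t VALUES (1)",
    "INSERT INTO t VALUES (1)",
    "CREATE UNIQUE INDEX t_a_idx ON t (a)",
]


def restore(statements, single_transaction):
    """Replay statements and return the names of the tables that survive."""
    # isolation_level=None gives autocommit; BEGIN/COMMIT are issued by hand.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    failed = False
    if single_transaction:
        conn.execute("BEGIN")
    for sql in statements:
        try:
            conn.execute(sql)
        except sqlite3.Error:
            failed = True
            if single_transaction:
                break  # the whole restore is abandoned
    if single_transaction:
        if failed:
            if conn.in_transaction:
                conn.execute("ROLLBACK")
        else:
            conn.execute("COMMIT")
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    conn.close()
    return tables


print(restore(STATEMENTS, single_transaction=False))
print(restore(STATEMENTS, single_transaction=True))
```

In the statement-by-statement run the table survives and only the index is missing; in the single-transaction run the one failing index build discards everything that was restored before it.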

And if the index is simply currently-in-progress, it seems like the
expected behavior would be for pg_dump to ignore it anyway. We don't
include other DDL objects which are not yet committed while pg_dump is
running.

Josh


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Kupershmidt <schmiddy(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Ignore invalid indexes in pg_dump
Date: 2013-03-20 15:58:26
Message-ID: 4468.1363795106@sss.pgh.pa.us

Josh Kupershmidt <schmiddy(at)gmail(dot)com> writes:
> On Wed, Mar 20, 2013 at 2:00 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> Invalid also means currently-in-progress, so it would be better to keep them in.

> For invalid indexes which are left hanging around in the database, if
> the index definition is included by pg_dump, it will likely cause pain
> during the restore. If the index build failed the first time and
> hasn't been manually dropped and recreated since then, it's a good bet
> it will fail the next time. Errors during restore can be more than
> just a nuisance; consider restores with --single-transaction.

> And if the index is simply currently-in-progress, it seems like the
> expected behavior would be for pg_dump to ignore it anyway. We don't
> include other DDL objects which are not yet committed while pg_dump is
> running.

I had been on the fence about what to do here, but I find Josh's
arguments persuasive, particularly the second one. Why shouldn't we
consider an in-progress index to be an uncommitted DDL change?

(Now admittedly, there won't *be* any uncommitted ordinary DDL on tables
while pg_dump is running, because it takes AccessShareLock on all
tables. But there could easily be uncommitted DDL against other types
of database objects, which pg_dump won't even see.)
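
A rough sketch of the locking argument, in Python. It encodes only the one row of PostgreSQL's table-lock conflict matrix that matters here: AccessShareLock conflicts with AccessExclusiveLock and nothing else. The blocked_by_pg_dump helper is a hypothetical name, and the command-to-lock notes in the comments are simplifications.

```python
# Lock modes that conflict with the AccessShareLock held by pg_dump.
CONFLICTS_WITH_ACCESS_SHARE = {"AccessExclusiveLock"}


def blocked_by_pg_dump(requested_lock):
    """True if a request for this lock mode must wait for pg_dump's lock."""
    return requested_lock in CONFLICTS_WITH_ACCESS_SHARE


# DROP TABLE and many forms of ALTER TABLE request AccessExclusiveLock.
print(blocked_by_pg_dump("AccessExclusiveLock"))
# CREATE INDEX CONCURRENTLY requests ShareUpdateExclusiveLock.
print(blocked_by_pg_dump("ShareUpdateExclusiveLock"))
```

Under these assumptions a concurrent index build is not held back by a running pg_dump, which is how a dump can come across an index that is still being built.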

regards, tom lane


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Kupershmidt <schmiddy(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Ignore invalid indexes in pg_dump
Date: 2013-03-21 00:51:39
Message-ID: CAB7nPqTzuN5UgBWfVcYaA6X14JyuaGg=i8-c4r61Vg9q22HsDw@mail.gmail.com

On Thu, Mar 21, 2013 at 12:58 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I had been on the fence about what to do here, but I find Josh's
> arguments persuasive, particularly the second one. Why shouldn't we
> consider an in-progress index to be an uncommitted DDL change?
>
> (Now admittedly, there won't *be* any uncommitted ordinary DDL on tables
> while pg_dump is running, because it takes AccessShareLock on all
> tables. But there could easily be uncommitted DDL against other types
> of database objects, which pg_dump won't even see.)
>
+1. Playing it safe is certainly the better choice, especially if a
restore would fail. I didn't think about that at first...

On top of checking indisvalid, I think that some additional checks on
indislive and indisready are also necessary. As indisready was
introduced in 8.3 and indislive was added in 9.3, I think the attached
patch is good.
I also added a note in the documentation about invalid indexes not
being dumped.
Perhaps this patch should be back-patched to previous versions so that
the behavior is consistent.
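
The version dependence described above can be sketched as follows. This is a hypothetical Python helper, not the code in the attached patch; the function name and the "i" alias for pg_index are assumptions, and the version cutoffs are the ones stated in this message and earlier in the thread.

```python
def index_validity_conditions(server_version_num):
    """Return the pg_index flag checks available on a given server version.

    server_version_num uses PostgreSQL's numeric form, e.g. 90300 for 9.3.
    """
    conditions = []
    if server_version_num >= 80200:   # indisvalid exists since 8.2
        conditions.append("i.indisvalid")
    if server_version_num >= 80300:   # indisready was introduced in 8.3
        conditions.append("i.indisready")
    if server_version_num >= 90300:   # indislive was added in 9.3
        conditions.append("i.indislive")
    return " AND ".join(conditions)


for version in (80100, 80200, 80300, 90300):
    print(version, repr(index_validity_conditions(version)))
```

Each extra flag adds one more version boundary that the catalog query has to distinguish.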

Regards,
--
Michael

Attachment Content-Type Size
20130321_no_dump_indisvalid.patch application/octet-stream 4.2 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Josh Kupershmidt <schmiddy(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Ignore invalid indexes in pg_dump
Date: 2013-03-26 21:47:30
Message-ID: 21002.1364334450@sss.pgh.pa.us

Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> On top of checking indisvalid, I think that some additional checks on
> indislive and indisready are also necessary.

Those are not necessary, as an index that is marked indisvalid should
certainly also have those flags set. If it didn't require making two
new version distinctions in getIndexes(), I'd be okay with the extra
checks; but as-is I think the maintenance pain this would add greatly
outweighs any likely value.

I've committed this in the simpler form that just adds indisvalid
checks to the appropriate version cases.
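
The redundancy argument can be checked with a short Python sketch. It assumes the invariant stated above (an index marked indisvalid also has indisready and indislive set); the three helper functions are hypothetical names, not pg_dump code.

```python
from itertools import product


def flags_consistent(indisvalid, indisready, indislive):
    """False only for an index marked valid without being ready and live."""
    return (not indisvalid) or (indisready and indislive)


def keep_simple(indisvalid, indisready, indislive):
    """Simpler form: dump the index if indisvalid is set."""
    return indisvalid


def keep_extended(indisvalid, indisready, indislive):
    """Extended form: require all three flags."""
    return indisvalid and indisready and indislive


# Over every flag combination that respects the invariant, both forms agree.
same_result = all(
    keep_simple(*flags) == keep_extended(*flags)
    for flags in product([False, True], repeat=3)
    if flags_consistent(*flags)
)
print(same_result)
```

So as long as the invariant holds, the extra checks select exactly the same indexes as the indisvalid check alone.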

regards, tom lane


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Kupershmidt <schmiddy(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Ignore invalid indexes in pg_dump
Date: 2013-03-26 23:19:22
Message-ID: CAB7nPqTn660ArfDjpRpzhdoKdJke=+sZKJmUw8bQ6VKMdDheBg@mail.gmail.com

On Wed, Mar 27, 2013 at 6:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> > On top of checking indisvalid, I think that some additional checks on
> > indislive and indisready are also necessary.
>
> Those are not necessary, as an index that is marked indisvalid should
> certainly also have those flags set. If it didn't require making two
> new version distinctions in getIndexes(), I'd be okay with the extra
> checks; but as-is I think the maintenance pain this would add greatly
> outweighs any likely value.
>
> I've committed this in the simpler form that just adds indisvalid
> checks to the appropriate version cases.
>
Thanks.
--
Michael