Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

Lists: pgsql-hackers
From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-10-31 06:11:37
Message-ID: 006801cdb72e$96b62330$c4226990$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

There seems to be a problem in behavior of Create Index Concurrently and Hot
Update in HEAD code .
Please see the below testcase

Step-1
-----------
Client-1
Create table t1(c1 int, c2 int, c3 int);
insert into t1 values(1,2,3);

Step-2
-----------
Client - 2
update t1 set c2=4; where c1 = 1; -- This will be Hot update
Select * from t1;
c1 | c2 | c3
----+----+----
1 | 4 | 3
(1 row)

No problem till here.

Step-3
-----------
Client -1
create index concurrently idx_conc_t1 on t1(c2); -- Run this command in
debug mode (by having breakpoint in DefineIndex)

Stop before the CommitTransactionCommand() of phase-2 where index_build is
done and indisready flag is set to TRUE.
As we have stopped before commit, still indexisready will not be visible to
other session/transaction.

Step-4
-----------
Client -2
update t1 set c2=5 where c1=1; -- Update is success, but this is a HOT
update
According to me, here is the problem, it shouldn't have done HOT update.

Step-5
-----------
Client-1
Resume debugging, and complete the command. I have observed in
validate_index(), it doesn't create index entry for c2=5.

Step-6
-----------
Client-2
select * from t1 where c2=5;
c1 | c2 | c3
----+----+----
1 | 5 | 3
(1 row)

postgres=# set enable_seqscan=off; -- This is to ensure index scan
should happen
SET
postgres=# select * from t1 where c2=5; -- Problem, it should have
shown the Row.
c1 | c2 | c3
----+----+----
(0 rows)

postgres=# select * from t1 where c2=4; -- Problem, query is done for
C2=4 and the result shows C2=5.
c1 | c2 | c3
----+----+----
1 | 5 | 3
(1 row)

According to me, the problem happens at Step-4. As at Step-4, it does the
HOT update due to which validate_index() is not able to put an entry for
C2=5

Let me know if I have misunderstood something?

With Regards,
Amit Kapila.


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-24 15:50:05
Message-ID: 20121124155005.GA10299@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2012-10-31 11:41:37 +0530, Amit Kapila wrote:
> There seems to be a problem in behavior of Create Index Concurrently and Hot
> Update in HEAD code .

At pgcon.it I had a chance to discuss with Simon how to fix this
bug. Please check the attached patches - and their commit messages - for
the fix and some regression tests.
The fix contains at least one FIXME where I am not sure if we need to do
something additional and the locking behaviour deserves some close
review.

Opinions?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Don-t-ignore-indisvalid-indisready-indexes-in-Relati.patch text/x-patch 0 bytes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-26 21:39:39
Message-ID: 19546.1353965979@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2012-10-31 11:41:37 +0530, Amit Kapila wrote:
>> There seems to be a problem in behavior of Create Index Concurrently and Hot
>> Update in HEAD code .

> At pgcon.it I had a chance to discuss with Simon how to fix this
> bug. Please check the attached patches - and their commit messages - for
> the fix and some regression tests.

I looked at this a bit. I am very unhappy with the proposed kluge to
open indexes NoLock in some places. Even if that's safe today, which
I don't believe in the least, any future change in this area could break
it.

I'm a bit inclined to think that DROP INDEX CONCURRENTLY should have an
additional step that somehow marks the pg_index row in a new way that
makes RelationGetIndexList ignore it, and then wait out existing
transactions before taking the final step of dropping the index. The
Right Way to do this would likely have been to add another bool column,
but we don't have that option anymore in 9.2. Maybe we could abuse
indcheckxmin somehow, ie use a state that can't otherwise occur (in
combination with the values of indisvalid/indisready) to denote an index
being dropped.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 03:31:50
Message-ID: 25941.1353987110@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> I'm a bit inclined to think that DROP INDEX CONCURRENTLY should have an
> additional step that somehow marks the pg_index row in a new way that
> makes RelationGetIndexList ignore it, and then wait out existing
> transactions before taking the final step of dropping the index. The
> Right Way to do this would likely have been to add another bool column,
> but we don't have that option anymore in 9.2. Maybe we could abuse
> indcheckxmin somehow, ie use a state that can't otherwise occur (in
> combination with the values of indisvalid/indisready) to denote an index
> being dropped.

I looked into this some more. There are currently three possible states
of the indisvalid/indisready flags:

indisvalid = false, indisready = false

initial state during CREATE INDEX CONCURRENTLY; this should
result in sessions honoring the index for HOT-safety decisions,
but not using it for searches or insertions

indisvalid = false, indisready = true

sessions should now insert into the index, but not use it
for searches

indisvalid = true, indisready = true

normal, fully valid index

Either state of indcheckxmin is valid with all three of these
combinations, so the specific kluge I was contemplating above doesn't
work. But there is no valid reason for an index to be in this state:

indisvalid = true, indisready = false

I suggest that to fix this for 9.2, we could abuse these flags by
defining that combination as meaning "ignore this index completely",
and having DROP INDEX CONCURRENTLY set this state during its final
wait before removing the index.

Obviously, an additional flag column would be a much cleaner fix,
and I think we should add one in HEAD. But it's too late to do
that in 9.2.

regards, tom lane


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 04:37:05
Message-ID: CAB7nPqRavH5AEp6dOwfg5JZtq7W0VyN1Gu_FmXf=mEjNiSg2VQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 12:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I wrote:
> > I'm a bit inclined to think that DROP INDEX CONCURRENTLY should have an
> > additional step that somehow marks the pg_index row in a new way that
> > makes RelationGetIndexList ignore it, and then wait out existing
> > transactions before taking the final step of dropping the index. The
> > Right Way to do this would likely have been to add another bool column,
> > but we don't have that option anymore in 9.2. Maybe we could abuse
> > indcheckxmin somehow, ie use a state that can't otherwise occur (in
> > combination with the values of indisvalid/indisready) to denote an index
> > being dropped.
>
> I looked into this some more. There are currently three possible states
> of the indisvalid/indisready flags:
>
> indisvalid = false, indisready = false
>
> initial state during CREATE INDEX CONCURRENTLY; this should
> result in sessions honoring the index for HOT-safety decisions,
> but not using it for searches or insertions
>
> indisvalid = false, indisready = true
>
> sessions should now insert into the index, but not use it
> for searches
>
> indisvalid = true, indisready = true
>
> normal, fully valid index
>
> Either state of indcheckxmin is valid with all three of these
> combinations, so the specific kluge I was contemplating above doesn't
> work. But there is no valid reason for an index to be in this state:
>
> indisvalid = true, indisready = false

> I suggest that to fix this for 9.2, we could abuse these flags by
> defining that combination as meaning "ignore this index completely",
> and having DROP INDEX CONCURRENTLY set this state during its final
> wait before removing the index.
>
> Obviously, an additional flag column would be a much cleaner fix,
> and I think we should add one in HEAD. But it's too late to do
> that in 9.2.
>
For 9.2 as you say the best choice is to ignore in RelationGetIndexList the
indexes that are valid and not ready when fetching the index list of a
relation.

For the fix in master, just thinking but...
Having 3 boolean flags to manage the state of an index might easily lead to
the developer to confusion.
I was thinking about the possibility of using instead of that a list of
states that are defined with a character.
We already know 3 possible states which are:
- indisvalid = false, indisready = false, let's say INDEX_IS_INITIAL 'i'
- indisvalid = false, indisready = true, with INDEX_IS_READY 'r'
- indisvalid = true, indisready = true, with INDEX_IS_VALID 'v'

In case an index needs to be ignored as you mention with the combination
indisvalid = true and indisready = false, it could be possible to add a new
state like INDEX_IS_IGNORE 'g'.

This would avoid the addition of a new column in pg_index and control the
status of an index easily.
This is not that much backward-compatible though...
--
Michael Paquier
http://michael.otacoo.com


From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 04:48:37
Message-ID: CABOikdO93xnUogK_=re9m1CKkL=chzfo=UX2OqFQWqtyRex2-Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I wrote:
>
> Either state of indcheckxmin is valid with all three of these
> combinations, so the specific kluge I was contemplating above doesn't
> work. But there is no valid reason for an index to be in this state:
>
> indisvalid = true, indisready = false
>
> I suggest that to fix this for 9.2, we could abuse these flags by
> defining that combination as meaning "ignore this index completely",
> and having DROP INDEX CONCURRENTLY set this state during its final
> wait before removing the index.
>
>
Yeah, this looks much better, given our inability to add a new catalog
column in 9.2. Can we cheat a little though and use a value other than 0
and 1 for indisvalid or indisready to tell the server to interpret it
differently ? I would assume not, but can't see a reason unless these
values are converted to other types and back to boolean.

Andres complained about the fact that many callers of RelationGetIndexList
are probably not ready to process invalid or not-yet-ready indexes and
suggested that those changes should be backpatched to even older releases.
But IMO we should do that with a test case that demonstrates that there is
indeed a bug. Also, we should teach RelationGetIndexList to take a flags
argument and filter out indexes that the caller is not interested instead
of every caller doing the checks separately.

Thanks,
Pavan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 06:08:55
Message-ID: 28861.1353996535@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> writes:
> On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Either state of indcheckxmin is valid with all three of these
>> combinations, so the specific kluge I was contemplating above doesn't
>> work. But there is no valid reason for an index to be in this state:
>> indisvalid = true, indisready = false

> Yeah, this looks much better, given our inability to add a new catalog
> column in 9.2. Can we cheat a little though and use a value other than 0
> and 1 for indisvalid or indisready to tell the server to interpret it
> differently ?

No, not unless you'd like "select * from pg_index" to misbehave.
Personally, I like being able to look at catalogs.

> Andres complained about the fact that many callers of RelationGetIndexList
> are probably not ready to process invalid or not-yet-ready indexes and
> suggested that those changes should be backpatched to even older releases.
> But IMO we should do that with a test case that demonstrates that there is
> indeed a bug. Also, we should teach RelationGetIndexList to take a flags
> argument and filter out indexes that the caller is not interested instead
> of every caller doing the checks separately.

We can't really change the API of RelationGetIndexList in the back
branches, as it's just about certain there is third-party code calling
it. I'm dubious about the advantages of such prefiltering anyway, as:

(1) That would mean that every change to indisvalid/indisready would
require forcing a relcache flush on the parent table. (Either that or
RelationGetIndexList does pg_index lookups internally, which would
mostly defeat the point of caching the index list at all.)

(2) The big picture here is that it's silly to optimize for any case
other than fully valid indexes, because anything else can be expected
to be a low-probability transient state. So generally we might as well
have RelationGetIndexList return all the indexes and let callers filter
any that happen to be inappropriate for their usage.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 10:18:43
Message-ID: 20121127101843.GB3989@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-11-26 16:39:39 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2012-10-31 11:41:37 +0530, Amit Kapila wrote:
> >> There seems to be a problem in behavior of Create Index Concurrently and Hot
> >> Update in HEAD code .
>
> > At pgcon.it I had a chance to discuss with Simon how to fix this
> > bug. Please check the attached patches - and their commit messages - for
> > the fix and some regression tests.
>
> I looked at this a bit. I am very unhappy with the proposed kluge to
> open indexes NoLock in some places. Even if that's safe today, which
> I don't believe in the least, any future change in this area could break
> it.

I am not happy with it either. But I failed to see a better way. Note
that in most of the cases a lock actually is acquired shortly
afterwards, just a ->indisvalid or an ->indisready check away. The only
exception is RelationGetIndexAttrBitmap which actually needs to look at
!indisvalid && !indisready indexes because of HOT. All the former cases
could just do a syscache lookup beforehand and skip if it doesn't match
their criterion. I wasn't sure about the (performance, notational)
overhead of doing a second syscache lookup in those paths.

Note that any ->indisvalid/indisready change is protected by waiting for
all concurrent backends to finish, so I don't think the separate
syscache lookup/index_open would be a problem.

For RelationGetIndexAttrBitmap, I don't really see a point in the locks
acquired - after all were not protecting the RelationGetIndexList
either.

Greetings,

Andres

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 10:27:30
Message-ID: 20121127102730.GC3989@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-11-26 22:31:50 -0500, Tom Lane wrote:
> I wrote:
> > I'm a bit inclined to think that DROP INDEX CONCURRENTLY should have an
> > additional step that somehow marks the pg_index row in a new way that
> > makes RelationGetIndexList ignore it, and then wait out existing
> > transactions before taking the final step of dropping the index. The
> > Right Way to do this would likely have been to add another bool column,
> > but we don't have that option anymore in 9.2. Maybe we could abuse
> > indcheckxmin somehow, ie use a state that can't otherwise occur (in
> > combination with the values of indisvalid/indisready) to denote an index
> > being dropped.
>
> I looked into this some more. There are currently three possible states
> of the indisvalid/indisready flags:
>
> indisvalid = false, indisready = false
>
> initial state during CREATE INDEX CONCURRENTLY; this should
> result in sessions honoring the index for HOT-safety decisions,
> but not using it for searches or insertions
>
> indisvalid = false, indisready = true
>
> sessions should now insert into the index, but not use it
> for searches
>
> indisvalid = true, indisready = true
>
> normal, fully valid index
>
> Either state of indcheckxmin is valid with all three of these
> combinations, so the specific kluge I was contemplating above doesn't
> work. But there is no valid reason for an index to be in this state:
>
> indisvalid = true, indisready = false
>
> I suggest that to fix this for 9.2, we could abuse these flags by
> defining that combination as meaning "ignore this index completely",
> and having DROP INDEX CONCURRENTLY set this state during its final
> wait before removing the index.
>
> Obviously, an additional flag column would be a much cleaner fix,
> and I think we should add one in HEAD. But it's too late to do
> that in 9.2.

While I aggree that more states would make some stuff cleaner, as long
as were still locking entries in RelationGetIndexAttrBitmap that have
indisvalid = false, indisready = false we still have broken DROP INDEX
CONCURRENTLY due to the exlusive lock acquired during actually dropping
the index. Which can take quite a while on a big index.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 10:48:08
Message-ID: 20121127104808.GD3989@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-11-27 10:18:37 +0530, Pavan Deolasee wrote:
> On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> > I wrote:
> >
> > Either state of indcheckxmin is valid with all three of these
> > combinations, so the specific kluge I was contemplating above doesn't
> > work. But there is no valid reason for an index to be in this state:
> >
> > indisvalid = true, indisready = false
> >
> > I suggest that to fix this for 9.2, we could abuse these flags by
> > defining that combination as meaning "ignore this index completely",
> > and having DROP INDEX CONCURRENTLY set this state during its final
> > wait before removing the index.
> >
> >
> Yeah, this looks much better, given our inability to add a new catalog
> column in 9.2. Can we cheat a little though and use a value other than 0
> and 1 for indisvalid or indisready to tell the server to interpret it
> differently ? I would assume not, but can't see a reason unless these
> values are converted to other types and back to boolean.
>
> Andres complained about the fact that many callers of RelationGetIndexList
> are probably not ready to process invalid or not-yet-ready indexes and
> suggested that those changes should be backpatched to even older releases.
> But IMO we should do that with a test case that demonstrates that there is
> indeed a bug.

I haven't yet looked deeply enough to judge whether there are actually
bugs. But I can say that the e.g. the missing indisvalid checks in
transformFkeyCheckAttrs makes me pretty uneasy. Vacuum not checking
whether indexes are ready isn't nice either.

> Also, we should teach RelationGetIndexList to take a flags
> argument and filter out indexes that the caller is not interested instead
> of every caller doing the checks separately.

Given that RelationGetIndexList currently is just returning a cached
list I don't see how thats going to work without significant overhead.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 10:52:11
Message-ID: 20121127105211.GE3989@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-11-27 11:48:08 +0100, Andres Freund wrote:
> On 2012-11-27 10:18:37 +0530, Pavan Deolasee wrote:
> > On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > > I wrote:
> > >
> > > Either state of indcheckxmin is valid with all three of these
> > > combinations, so the specific kluge I was contemplating above doesn't
> > > work. But there is no valid reason for an index to be in this state:
> > >
> > > indisvalid = true, indisready = false
> > >
> > > I suggest that to fix this for 9.2, we could abuse these flags by
> > > defining that combination as meaning "ignore this index completely",
> > > and having DROP INDEX CONCURRENTLY set this state during its final
> > > wait before removing the index.
> > >
> > >
> > Yeah, this looks much better, given our inability to add a new catalog
> > column in 9.2. Can we cheat a little though and use a value other than 0
> > and 1 for indisvalid or indisready to tell the server to interpret it
> > differently ? I would assume not, but can't see a reason unless these
> > values are converted to other types and back to boolean.
> >
> > Andres complained about the fact that many callers of RelationGetIndexList
> > are probably not ready to process invalid or not-yet-ready indexes and
> > suggested that those changes should be backpatched to even older releases.
> > But IMO we should do that with a test case that demonstrates that there is
> > indeed a bug.
>
> I haven't yet looked deeply enough to judge whether there are actually
> bugs. But I can say that the e.g. the missing indisvalid checks in
> transformFkeyCheckAttrs makes me pretty uneasy. Vacuum not checking
> whether indexes are ready isn't nice either.

At least the former was easy enough to verify after thinking about it
for a minute (<=9.1):

CREATE TABLE clusterbug(id serial primary key, data int);
INSERT INTO clusterbug(data) VALUES(2),(2);
CREATE UNIQUE INDEX CONCURRENTLY clusterbug_data ON clusterbug(data);
CREATE TABLE clusterbug_ref(id serial primary key, clusterbug_id int references clusterbug(data));

Now a !indisready index is getting queried (strangely enough that
doesn't cause an error).

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 10:55:59
Message-ID: 20121127105559.GF3989@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-11-27 11:52:11 +0100, Andres Freund wrote:
> On 2012-11-27 11:48:08 +0100, Andres Freund wrote:
> > On 2012-11-27 10:18:37 +0530, Pavan Deolasee wrote:
> > > On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > >
> > > > I wrote:
> > > >
> > > > Either state of indcheckxmin is valid with all three of these
> > > > combinations, so the specific kluge I was contemplating above doesn't
> > > > work. But there is no valid reason for an index to be in this state:
> > > >
> > > > indisvalid = true, indisready = false
> > > >
> > > > I suggest that to fix this for 9.2, we could abuse these flags by
> > > > defining that combination as meaning "ignore this index completely",
> > > > and having DROP INDEX CONCURRENTLY set this state during its final
> > > > wait before removing the index.
> > > >
> > > >
> > > Yeah, this looks much better, given our inability to add a new catalog
> > > column in 9.2. Can we cheat a little though and use a value other than 0
> > > and 1 for indisvalid or indisready to tell the server to interpret it
> > > differently ? I would assume not, but can't see a reason unless these
> > > values are converted to other types and back to boolean.
> > >
> > > Andres complained about the fact that many callers of RelationGetIndexList
> > > are probably not ready to process invalid or not-yet-ready indexes and
> > > suggested that those changes should be backpatched to even older releases.
> > > But IMO we should do that with a test case that demonstrates that there is
> > > indeed a bug.
> >
> > I haven't yet looked deeply enough to judge whether there are actually
> > bugs. But I can say that the e.g. the missing indisvalid checks in
> > transformFkeyCheckAttrs makes me pretty uneasy. Vacuum not checking
> > whether indexes are ready isn't nice either.
>
> At least the former was easy enough to verify after thinking about it
> for a minute (<=9.1):
>
> CREATE TABLE clusterbug(id serial primary key, data int);
> INSERT INTO clusterbug(data) VALUES(2),(2);
> CREATE UNIQUE INDEX CONCURRENTLY clusterbug_data ON clusterbug(data);
> CREATE TABLE clusterbug_ref(id serial primary key, clusterbug_id int references clusterbug(data));
>
> Now a !indisready index is getting queried (strangely enough that
> doesn't cause an error).

That should work in 9.2 as well, although its slightly more
complicated. You just need a indisready && !indisvalid index and the
foreign key will happily be created. Easy enough to achieve with two
sessions.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 11:09:58
Message-ID: CABOikdMzH50WoOH_E1U+u2X+Rsiz18Q5stODRDEdCxycGJCcSQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 4:22 PM, Andres Freund <andres(at)2ndquadrant(dot)com>wrote:

> On 2012-11-27 11:48:08 +0100, Andres Freund wrote:
>
> >
> > I haven't yet looked deeply enough to judge whether there are actually
> > bugs. But I can say that the e.g. the missing indisvalid checks in
> > transformFkeyCheckAttrs makes me pretty uneasy. Vacuum not checking
> > whether indexes are ready isn't nice either.
>
> At least the former was easy enough to verify after thinking about it
> for a minute (<=9.1):
>
> CREATE TABLE clusterbug(id serial primary key, data int);
> INSERT INTO clusterbug(data) VALUES(2),(2);
> CREATE UNIQUE INDEX CONCURRENTLY clusterbug_data ON clusterbug(data);
> CREATE TABLE clusterbug_ref(id serial primary key, clusterbug_id int
> references clusterbug(data));
>
> Now a !indisready index is getting queried (strangely enough that
> doesn't cause an error).
>
>
There might be a bug there as you are suggesting, but for me the CREATE
INDEX itself fails (and rightly so) because of duplicate keys. Do I need to
run these statements in different sessions ?

Thanks,
Pavan


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 11:19:19
Message-ID: 20121127111919.GH3989@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-11-27 16:39:58 +0530, Pavan Deolasee wrote:
> On Tue, Nov 27, 2012 at 4:22 PM, Andres Freund <andres(at)2ndquadrant(dot)com>wrote:
>
> > On 2012-11-27 11:48:08 +0100, Andres Freund wrote:
> >
> > >
> > > I haven't yet looked deeply enough to judge whether there are actually
> > > bugs. But I can say that the e.g. the missing indisvalid checks in
> > > transformFkeyCheckAttrs makes me pretty uneasy. Vacuum not checking
> > > whether indexes are ready isn't nice either.
> >
> > At least the former was easy enough to verify after thinking about it
> > for a minute (<=9.1):
> >
> > CREATE TABLE clusterbug(id serial primary key, data int);
> > INSERT INTO clusterbug(data) VALUES(2),(2);
> > CREATE UNIQUE INDEX CONCURRENTLY clusterbug_data ON clusterbug(data);
> > CREATE TABLE clusterbug_ref(id serial primary key, clusterbug_id int
> > references clusterbug(data));
> >
> > Now a !indisready index is getting queried (strangely enough that
> > doesn't cause an error).
> >
> >
> There might be a bug there as you are suggesting, but for me the CREATE
> INDEX itself fails (and rightly so) because of duplicate keys. Do I need to
> run these statements in different sessions ?

Sure, the CREATE INDEX fails. The point is that we successfully can
create the foreign key afterwards anyway.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 11:19:50
Message-ID: CABOikdObu2+X1-Cgmg3Ny-hwxeGGvyLjLtEkYMiXUL2O2KALKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 11:38 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> writes:
> > On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Either state of indcheckxmin is valid with all three of these
> >> combinations, so the specific kluge I was contemplating above doesn't
> >> work. But there is no valid reason for an index to be in this state:
> >> indisvalid = true, indisready = false
>
> > Yeah, this looks much better, given our inability to add a new catalog
> > column in 9.2. Can we cheat a little though and use a value other than 0
> > and 1 for indisvalid or indisready to tell the server to interpret it
> > differently ?
>
> No, not unless you'd like "select * from pg_index" to misbehave.
> Personally, I like being able to look at catalogs.
>
>
Hmm.. I thought so. Thanks.

If we add a new column to the catalog for HEAD, I think it will be a good
idea to add an "indflags" kind of column which can store a bitmap of flags.
We probably don't get into this kind of situation often, but if we do, then
we can more flexibility without impacting rebuilding the catalogs. My two
cents.

Thanks,
Pavan


From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 11:23:21
Message-ID: CABOikdPOFzTTYr_wnZge1BBSa_LrD4kbCtxShfg-FQs0wTRPpA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 4:49 PM, Andres Freund <andres(at)2ndquadrant(dot)com>wrote:

> On 2012-11-27 16:39:58 +0530, Pavan Deolasee wrote:
>
> > >
> > There might be a bug there as you are suggesting, but for me the CREATE
> > INDEX itself fails (and rightly so) because of duplicate keys. Do I need
> to
> > run these statements in different sessions ?
>
> Sure, the CREATE INDEX fails. The point is that we successfully can
> create the foreign key afterwards anyway.
>
>
Ah OK. Got it.

Thanks,
Pavan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 17:02:09
Message-ID: 24046.1354035729@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2012-11-26 16:39:39 -0500, Tom Lane wrote:
>> I looked at this a bit. I am very unhappy with the proposed kluge to
>> open indexes NoLock in some places. Even if that's safe today, which
>> I don't believe in the least, any future change in this area could break
>> it.

> I am not happy with it either. But I failed to see a better way. Note
> that in most of the cases a lock actually is acquired shortly
> afterwards, just a ->indisvalid or an ->indisready check away.

I think you're missing the point. The proposed patch will result in
RelationGetIndexAttrBitmap trying to do index_open() on an index that
might be getting dropped *right now* by a concurrent DROP INDEX
CONCURRENTLY. If the DROP commits while index_open is running, its
SnapshotNow catalog searches will stop finding relevant rows. From
the point of view of index_open, the relation data structure is then
corrupt (for example, it might not find pg_attribute entries for all
the columns) and so it will throw an error.

We need to fix things so that the final pre-deletion state of the
pg_index row tells observer backends not to even try to open the index.
(Thus, it will not matter whether they come across the pg_index row just
before or just after the deleting transaction commits --- either way,
they don't try to touch the index.) So that has to be a different state
from the initial state used during CREATE INDEX CONCURRENTLY.

The point of not wanting to open the index NoLock (and for that matter
of having DROP INDEX CONCURRENTLY take AccessExclusiveLock once it
thinks nobody is touching the index) is to make sure that if somehow
somebody is touching the index anyway, they don't see the index's
catalog entries as corrupt. They'll either all be there or all not.
It's only a belt-and-suspenders safety measure, but I don't want to
give it up.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 17:11:44
Message-ID: CA+TgmoaP=HLqGqRE8n_TeLGbMsPfhwhvTHkcKkupO-Lg9wm0xA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 12:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> The point of not wanting to open the index NoLock (and for that matter
> of having DROP INDEX CONCURRENTLY take AccessExclusiveLock once it
> thinks nobody is touching the index) is to make sure that if somehow
> somebody is touching the index anyway, they don't see the index's
> catalog entries as corrupt. They'll either all be there or all not.
> It's only a belt-and-suspenders safety measure, but I don't want to
> give it up.

+1. There's a whole crapload of commits that I did for 9.2 with
commit messages like "improve concurrent DDL in case XYZ". A lot of
that had to do with fixing cases where we were examining system
catalogs in unsafe ways before locks had been taken. I didn't manage
to fix them all, unfortunately, but it's significantly better than it
used to be, and I'd really like it if we could try not to go
backwards.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 17:43:57
Message-ID: 20121127174357.GA22677@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-11-27 12:02:09 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2012-11-26 16:39:39 -0500, Tom Lane wrote:
> >> I looked at this a bit. I am very unhappy with the proposed kluge to
> >> open indexes NoLock in some places. Even if that's safe today, which
> >> I don't believe in the least, any future change in this area could break
> >> it.
>
> > I am not happy with it either. But I failed to see a better way. Note
> > that in most of the cases a lock actually is acquired shortly
> > afterwards, just a ->indisvalid or an ->indisready check away.
>
> I think you're missing the point.

True.

> The proposed patch will result in
> RelationGetIndexAttrBitmap trying to do index_open() on an index that
> might be getting dropped *right now* by a concurrent DROP INDEX
> CONCURRENTLY. If the DROP commits while index_open is running, its
> SnapshotNow catalog searches will stop finding relevant rows. From
> the point of view of index_open, the relation data structure is then
> corrupt (for example, it might not find pg_attribute entries for all
> the columns) and so it will throw an error.

> We need to fix things so that the final pre-deletion state of the
> pg_index row tells observer backends not to even try to open the index.
> (Thus, it will not matter whether they come across the pg_index row just
> before or just after the deleting transaction commits --- either way,
> they don't try to touch the index.) So that has to be a different state
> from the initial state used during CREATE INDEX CONCURRENTLY.
>
> The point of not wanting to open the index NoLock (and for that matter
> of having DROP INDEX CONCURRENTLY take AccessExclusiveLock once it
> thinks nobody is touching the index) is to make sure that if somehow
> somebody is touching the index anyway, they don't see the index's
> catalog entries as corrupt. They'll either all be there or all not.
> It's only a belt-and-suspenders safety measure, but I don't want to
> give it up.

So the idea would be to skip about-to-be-dropped indexes in
RelationGetIndexList directly - we don't ever need to watch those, not
even for HOT - because we only have the necessary knowledge there. The
normal valid/ready checks will be done at the callsites of
RelationGetIndexList(). But those checks can be done in a locked manner
now.

Are you already working on a fix? Or shall I work on an updated patch?

I vote for introducing wrapper functions/macro to do the
about-to-be-dropped check, its hard enough to understand as-is.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 17:50:36
Message-ID: 25050.1354038636@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> So the idea would be to skip about-to-be-dropped indexes in
> RelationGetIndexList directly - we don't ever need to watch those, not
> even for HOT - because we only have the necessary knowledge there. The
> normal valid/ready checks will be done at the callsites of
> RelationGetIndexList(). But those checks can be done in a locked manner
> now.

Right.

> Are you already working on a fix? Or shall I work on an updated patch?

I'll work on it.

> I vote for introducing wrapper functions/macro to do the
> about-to-be-dropped check, its hard enough to understand as-is.

Meh. If it's only going to be done in RelationGetIndexList, I'm
not sure that a wrapper macro is worth the trouble. If we needed
the test in multiple places I'd agree, but what other spots do you
see?

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 17:55:09
Message-ID: 20121127175509.GB22677@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-11-27 12:50:36 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > I vote for introducing wrapper functions/macro to do the
> > about-to-be-dropped check, its hard enough to understand as-is.
>
> Meh. If it's only going to be done in RelationGetIndexList, I'm
> not sure that a wrapper macro is worth the trouble. If we needed
> the test in multiple places I'd agree, but what other spots do you
> see?

I don't really see any other querying locations - but such a macro would
make the code easier backpatchable when we introduce a third column for
the about-to-be-dropped case.

Anyway, don't feel all too strongly about it.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 19:08:13
Message-ID: 26565.1354043293@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

BTW, I was thinking that the DROP INDEX CONCURRENTLY logic needed to be:

1. Unset indisvalid, commit, wait out all reading transactions.

2. Unset indisready, commit, wait out all writing transactions.

3. Unset indislive, commit (with parent table relcache flush),
wait out all reading-or-writing transactions.

4. Drop the index.

However, I wonder whether we couldn't combine steps 2 and 3, ie once
there are no readers of the index go directly to the "dead" state.
I don't see a need for a period where the index isn't being inserted
into but is still used for HOT-safety decisions.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Date: 2012-11-27 19:10:38
Message-ID: 20121127191037.GD22677@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-11-27 14:08:13 -0500, Tom Lane wrote:
> BTW, I was thinking that the DROP INDEX CONCURRENTLY logic needed to be:
>
> 1. Unset indisvalid, commit, wait out all reading transactions.
>
> 2. Unset indisready, commit, wait out all writing transactions.
>
> 3. Unset indislive, commit (with parent table relcache flush),
> wait out all reading-or-writing transactions.
>
> 4. Drop the index.
>
> However, I wonder whether we couldn't combine steps 2 and 3, ie once
> there are no readers of the index go directly to the "dead" state.
> I don't see a need for a period where the index isn't being inserted
> into but is still used for HOT-safety decisions.

I think you're right, that state isn't interesting for anyone.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services