Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Date: 2019-05-04 20:24:42
Message-ID: 20190504202442.GD852556@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 01, 2019 at 07:01:23PM -0400, Tom Lane wrote:
> > On 2019-05-01 14:44:12 -0400, Tom Lane wrote:
> >> I'm busy looking into the REINDEX-on-pg_class mess, and one thing I found
> >> while testing HEAD is that with CLOBBER_CACHE_ALWAYS on, once you've
> >> gotten a failure, your session is hosed:
> >>
> >> regression=# select 1;
> >> ?column?
> >> ----------
> >> 1
> >> (1 row)
> >>
> >> regression=# reindex index pg_class_oid_index;
> >> psql: ERROR: could not read block 0 in file "base/16384/35344": read only 0 of 8192 bytes
> >> regression=# select 1;
> >> psql: ERROR: could not open file "base/16384/35344": No such file or directory

> It seems that the reason we fail to recover is that we detect the error
> during CommandEndInvalidationMessages, after we've updated the relcache
> entry for pg_class_oid_index; of course at that point, any additional
> pg_class access is going to fail because it'll try to use the
> not-yet-existent index. However, when we then abort the transaction,
> we fail to revert pg_class_oid_index's relcache entry because *there is
> not yet an invalidation queue entry telling us to do so*.
>
> This is, therefore, an order-of-operations bug in
> CommandEndInvalidationMessages. It should attach the "current command"
> queue entries to PriorCmdInvalidMsgs *before* it processes them, not
> after. In this way they'll be available to be reprocessed by
> AtEOXact_Inval or AtEOSubXact_Inval if we fail during CCI.

That makes sense.

> The attached quick-hack patch changes that around, and seems to survive
> testing (though I've not completed a CCA run with it). The basic change
> I had to make to make this work is to make AppendInvalidationMessageList
> actually append the current-commands list to the prior-cmds list, not
> prepend as it's really always done. (In that way, we can still scan the
> current-commands list just by starting at its head.) The comments for
> struct InvalidationChunk explicitly disclaim any significance to the order
> of the entries, so this should be okay, and I'm not seeing any problem
> in testing. It's been a long time since I wrote that code, but I think
> the reason I did it like that was the thought that in a long transaction,
> the prior-cmds list might be much longer than the current-commands list,
> so iterating down the prior-cmds list could add an O(N^2) cost. The
> easy answer to that, of course, is to make the lists doubly-linked,
> and I'd do that before committing; this version is just for testing.
> (It's short on comments, too.)

Looks reasonable so far.

> The thing I was worried about in RelationCacheInvalidate does seem
> to be a red herring, at least fixing it is not necessary to make
> the broken-session-state problem go away.

Your earlier proposal would have made RelationCacheInvalidate() work more like
RelationFlushRelation() when rd_newRelfilenodeSubid is set. That's a good
direction, all else being equal, though I'm not aware of a specific bug
reachable today. I think RelationCacheInvalidate() would then need the
reference count bits that RelationFlushRelation() has.

> *************** xactGetCommittedInvalidationMessages(Sha
> *** 858,867 ****
> */
> oldcontext = MemoryContextSwitchTo(CurTransactionContext);
>
> - ProcessInvalidationMessagesMulti(&transInvalInfo->CurrentCmdInvalidMsgs,
> - MakeSharedInvalidMessagesArray);
> ProcessInvalidationMessagesMulti(&transInvalInfo->PriorCmdInvalidMsgs,
> MakeSharedInvalidMessagesArray);
> MemoryContextSwitchTo(oldcontext);
>
> Assert(!(numSharedInvalidMessagesArray > 0 &&
> --- 865,874 ----
> */
> oldcontext = MemoryContextSwitchTo(CurTransactionContext);
>
> ProcessInvalidationMessagesMulti(&transInvalInfo->PriorCmdInvalidMsgs,
> MakeSharedInvalidMessagesArray);
> + ProcessInvalidationMessagesMulti(&transInvalInfo->CurrentCmdInvalidMsgs,
> + MakeSharedInvalidMessagesArray);

Why this order change?

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-05-04 21:14:36 Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Previous Message Tom Lane 2019-05-04 19:55:05 Re: First-draft release notes for back branches are up