Re: ALTER TABLE lock strength reduction patch is unsafe

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2013-07-15 14:06:31
Message-ID: CA+TgmoY4GLsXZk0tAO29-LJtcuj0SL1xWCwQ51xb-HFYsgi5RQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jul 7, 2013 at 9:24 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 3 January 2012 18:42, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I wrote:
>>>> Another point that requires some thought is that switching SnapshotNow
>>>> to be MVCC-based will presumably result in a noticeable increase in each
>>>> backend's rate of wanting to acquire snapshots.
>>
>> BTW, I wonder if this couldn't be ameliorated by establishing some
>> ground rules about how up-to-date a snapshot really needs to be.
>> Arguably, it should be okay for successive SnapshotNow scans to use the
>> same snapshot as long as we have not acquired a new lock in between.
>> If not, reusing an old snap doesn't introduce any race condition that
>> wasn't there already.
>
> Now that has been implemented using the above design, we can resubmit
> the lock level reduction patch, with thanks to Robert.
>
> Submitted patch passes original complaint/benchmark.
>
> Changes
> * various forms of ALTER TABLE, notably ADD constraint and VALIDATE
> * CREATE TRIGGER
>
> One minor coirrections to earlier thinking with respect to toast
> tables. That might be later relaxed.
>
> Full tests including proof of lock level reductions, plus docs.

I'm quite glad to see this patch back again for another round. I
haven't reviewed it in detail yet, so the purpose of this email is
just to lay out some general areas of concern and food for thought
rather than to critique anything in the patch too specifically.

Generally, the question on the table is: to what extent do MVCC
catalog scans make the world safe for concurrent DDL, or put
negatively, what hazards remain?

Noah discovered an interesting one recently: apparently, the relcache
machinery has some logic in it that depends on the use of
AccessExclusiveLock in subtle ways. I'm going to attempt to explain
it, and maybe he can jump in and explain it better. Essentially, the
problem is that when a relcache reload occurs, certain data structures
(like the tuple descriptor, but there are others) are compared against
the old version of the same data structure. If there are no changes,
we do nothing; else, we free the old one and install the new one. The
reason why we don't free the old one and install the new one
unconditionally is because other parts of the backend might have
pointers to the old data structure, so just replacing it all the time
would result in crashes. Since DDL requires AccessExclusiveLock +
CheckTableNotInUse(), any actual change to the data structure will
happen when we haven't got any pointers anyway.

A second thing I'm concerned about is that, although MVCC catalog
scans guarantee that we won't miss a concurrently-updated row
entirely, or see a duplicate, they don't guarantee that the rows we
see during a scan of catalog A will be consistent with the rows we see
during a scan of catalog B moments later. For example, hilarity will
ensue if relnatts doesn't match what we see in pg_attribute. Now I
don't think this particular example matters very much because I think
there are probably lots of other things that would also break if we
try to add a column without a full table lock, so we're probably
doomed there anyway. But there might be other instances of this
problem that are more subtle.

I'll try to find some time to look at this in more detail.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Jaskiewicz 2013-07-15 14:10:51 Re: Listen/notify across clusters
Previous Message Amit Kapila 2013-07-15 13:43:19 Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])