Re: ALTER TABLE lock strength reduction patch is unsafe

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2013-07-15 15:09:39
Message-ID: CA+TgmoZrV68f=sQmgUPzK5q1kCvmT2LnFgR_oPGNMF+jHqoxbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 15, 2013 at 10:32 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-07-15 10:06:31 -0400, Robert Haas wrote:
>> Noah discovered an interesting one recently: apparently, the relcache
>> machinery has some logic in it that depends on the use of
>> AccessExclusiveLock in subtle ways. I'm going to attempt to explain
>> it, and maybe he can jump in and explain it better. Essentially, the
>> problem is that when a relcache reload occurs, certain data structures
>> (like the tuple descriptor, but there are others) are compared against
>> the old version of the same data structure. If there are no changes,
>> we do nothing; else, we free the old one and install the new one. The
>> reason why we don't free the old one and install the new one
>> unconditionally is because other parts of the backend might have
>> pointers to the old data structure, so just replacing it all the time
>> would result in crashes. Since DDL requires AccessExclusiveLock +
>> CheckTableNotInUse(), any actual change to the data structure will
>> happen when we haven't got any pointers anyway.
>
> Aren't we swapping in the new data on a data level for that reason? See
> RelationClearRelation().

Look at the keep_tupdesc and keep_rules variables.

>> A second thing I'm concerned about is that, although MVCC catalog
>> scans guarantee that we won't miss a concurrently-updated row
>> entirely, or see a duplicate, they don't guarantee that the rows we
>> see during a scan of catalog A will be consistent with the rows we see
>> during a scan of catalog B moments later. For example, hilarity will
>> ensue if relnatts doesn't match what we see in pg_attribute. Now I
>> don't think this particular example matters very much because I think
>> there are probably lots of other things that would also break if we
>> try to add a column without a full table lock, so we're probably
>> doomed there anyway. But there might be other instances of this
>> problem that are more subtle.
>
> Hm. Other transactions basically should be protected against this
> because they can't se uncommitted data anyway, right? ISTM that our own
> session basically already has be safe against hilarity of this kind,
> right?

Other transactions are protected only within a single scan. If they
do two or more separate scans one after the another, some other
transaction can commit in the middle of things. Any commits that
happen after starting the first scan and before starting the second
scan will be reflected in the second scan, but not in the first.
That's what I'm concerned about. Our own session can't have a problem
of this kind, because we'll never commit a subtransaction (or, heaven
forbid, a top-level transaction) while in the middle of a system
catalog scan.

> I am concerned about stuff like violating constraints because we haven't
> yet seen the new constraint definition and the like... Or generating
> wrong plans because we haven't seen that somebody has dropped a
> constraint and inserted data violating the old one.

Yes, we need to carefully audit the commands for dependencies of that type.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2013-07-15 15:17:57 Re: Add regression tests for ROLE (USER)
Previous Message Robert Haas 2013-07-15 15:00:59 Re: Add more regression tests for dbcommands