Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:

Lists: pgsql-hackers
From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2013-07-07 13:24:48
Message-ID: CA+U5nM+za1mX5kTy-y+iUf1Acir4w70kHbBeedJ2iijpE5mJvg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3 January 2012 18:42, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
>>> Another point that requires some thought is that switching SnapshotNow
>>> to be MVCC-based will presumably result in a noticeable increase in each
>>> backend's rate of wanting to acquire snapshots.
>
> BTW, I wonder if this couldn't be ameliorated by establishing some
> ground rules about how up-to-date a snapshot really needs to be.
> Arguably, it should be okay for successive SnapshotNow scans to use the
> same snapshot as long as we have not acquired a new lock in between.
> If not, reusing an old snap doesn't introduce any race condition that
> wasn't there already.

Now that has been implemented using the above design, we can resubmit
the lock level reduction patch, with thanks to Robert.

Submitted patch passes original complaint/benchmark.

Changes
* various forms of ALTER TABLE, notably ADD constraint and VALIDATE
* CREATE TRIGGER

One minor coirrections to earlier thinking with respect to toast
tables. That might be later relaxed.

Full tests including proof of lock level reductions, plus docs.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
reduce_lock_levels.v13.patch application/octet-stream 21.4 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2013-07-15 14:06:31
Message-ID: CA+TgmoY4GLsXZk0tAO29-LJtcuj0SL1xWCwQ51xb-HFYsgi5RQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jul 7, 2013 at 9:24 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 3 January 2012 18:42, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I wrote:
>>>> Another point that requires some thought is that switching SnapshotNow
>>>> to be MVCC-based will presumably result in a noticeable increase in each
>>>> backend's rate of wanting to acquire snapshots.
>>
>> BTW, I wonder if this couldn't be ameliorated by establishing some
>> ground rules about how up-to-date a snapshot really needs to be.
>> Arguably, it should be okay for successive SnapshotNow scans to use the
>> same snapshot as long as we have not acquired a new lock in between.
>> If not, reusing an old snap doesn't introduce any race condition that
>> wasn't there already.
>
> Now that has been implemented using the above design, we can resubmit
> the lock level reduction patch, with thanks to Robert.
>
> Submitted patch passes original complaint/benchmark.
>
> Changes
> * various forms of ALTER TABLE, notably ADD constraint and VALIDATE
> * CREATE TRIGGER
>
> One minor coirrections to earlier thinking with respect to toast
> tables. That might be later relaxed.
>
> Full tests including proof of lock level reductions, plus docs.

I'm quite glad to see this patch back again for another round. I
haven't reviewed it in detail yet, so the purpose of this email is
just to lay out some general areas of concern and food for thought
rather than to critique anything in the patch too specifically.

Generally, the question on the table is: to what extent do MVCC
catalog scans make the world safe for concurrent DDL, or put
negatively, what hazards remain?

Noah discovered an interesting one recently: apparently, the relcache
machinery has some logic in it that depends on the use of
AccessExclusiveLock in subtle ways. I'm going to attempt to explain
it, and maybe he can jump in and explain it better. Essentially, the
problem is that when a relcache reload occurs, certain data structures
(like the tuple descriptor, but there are others) are compared against
the old version of the same data structure. If there are no changes,
we do nothing; else, we free the old one and install the new one. The
reason why we don't free the old one and install the new one
unconditionally is because other parts of the backend might have
pointers to the old data structure, so just replacing it all the time
would result in crashes. Since DDL requires AccessExclusiveLock +
CheckTableNotInUse(), any actual change to the data structure will
happen when we haven't got any pointers anyway.

A second thing I'm concerned about is that, although MVCC catalog
scans guarantee that we won't miss a concurrently-updated row
entirely, or see a duplicate, they don't guarantee that the rows we
see during a scan of catalog A will be consistent with the rows we see
during a scan of catalog B moments later. For example, hilarity will
ensue if relnatts doesn't match what we see in pg_attribute. Now I
don't think this particular example matters very much because I think
there are probably lots of other things that would also break if we
try to add a column without a full table lock, so we're probably
doomed there anyway. But there might be other instances of this
problem that are more subtle.

I'll try to find some time to look at this in more detail.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2013-07-15 14:32:10
Message-ID: 20130715143210.GA31361@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-07-15 10:06:31 -0400, Robert Haas wrote:
> Noah discovered an interesting one recently: apparently, the relcache
> machinery has some logic in it that depends on the use of
> AccessExclusiveLock in subtle ways. I'm going to attempt to explain
> it, and maybe he can jump in and explain it better. Essentially, the
> problem is that when a relcache reload occurs, certain data structures
> (like the tuple descriptor, but there are others) are compared against
> the old version of the same data structure. If there are no changes,
> we do nothing; else, we free the old one and install the new one. The
> reason why we don't free the old one and install the new one
> unconditionally is because other parts of the backend might have
> pointers to the old data structure, so just replacing it all the time
> would result in crashes. Since DDL requires AccessExclusiveLock +
> CheckTableNotInUse(), any actual change to the data structure will
> happen when we haven't got any pointers anyway.

Aren't we swapping in the new data on a data level for that reason? See
RelationClearRelation().

> A second thing I'm concerned about is that, although MVCC catalog
> scans guarantee that we won't miss a concurrently-updated row
> entirely, or see a duplicate, they don't guarantee that the rows we
> see during a scan of catalog A will be consistent with the rows we see
> during a scan of catalog B moments later. For example, hilarity will
> ensue if relnatts doesn't match what we see in pg_attribute. Now I
> don't think this particular example matters very much because I think
> there are probably lots of other things that would also break if we
> try to add a column without a full table lock, so we're probably
> doomed there anyway. But there might be other instances of this
> problem that are more subtle.

Hm. Other transactions basically should be protected against this
because they can't se uncommitted data anyway, right? ISTM that our own
session basically already has be safe against hilarity of this kind,
right?

I am concerned about stuff like violating constraints because we haven't
yet seen the new constraint definition and the like... Or generating
wrong plans because we haven't seen that somebody has dropped a
constraint and inserted data violating the old one.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2013-07-15 15:09:39
Message-ID: CA+TgmoZrV68f=sQmgUPzK5q1kCvmT2LnFgR_oPGNMF+jHqoxbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jul 15, 2013 at 10:32 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-07-15 10:06:31 -0400, Robert Haas wrote:
>> Noah discovered an interesting one recently: apparently, the relcache
>> machinery has some logic in it that depends on the use of
>> AccessExclusiveLock in subtle ways. I'm going to attempt to explain
>> it, and maybe he can jump in and explain it better. Essentially, the
>> problem is that when a relcache reload occurs, certain data structures
>> (like the tuple descriptor, but there are others) are compared against
>> the old version of the same data structure. If there are no changes,
>> we do nothing; else, we free the old one and install the new one. The
>> reason why we don't free the old one and install the new one
>> unconditionally is because other parts of the backend might have
>> pointers to the old data structure, so just replacing it all the time
>> would result in crashes. Since DDL requires AccessExclusiveLock +
>> CheckTableNotInUse(), any actual change to the data structure will
>> happen when we haven't got any pointers anyway.
>
> Aren't we swapping in the new data on a data level for that reason? See
> RelationClearRelation().

Look at the keep_tupdesc and keep_rules variables.

>> A second thing I'm concerned about is that, although MVCC catalog
>> scans guarantee that we won't miss a concurrently-updated row
>> entirely, or see a duplicate, they don't guarantee that the rows we
>> see during a scan of catalog A will be consistent with the rows we see
>> during a scan of catalog B moments later. For example, hilarity will
>> ensue if relnatts doesn't match what we see in pg_attribute. Now I
>> don't think this particular example matters very much because I think
>> there are probably lots of other things that would also break if we
>> try to add a column without a full table lock, so we're probably
>> doomed there anyway. But there might be other instances of this
>> problem that are more subtle.
>
> Hm. Other transactions basically should be protected against this
> because they can't se uncommitted data anyway, right? ISTM that our own
> session basically already has be safe against hilarity of this kind,
> right?

Other transactions are protected only within a single scan. If they
do two or more separate scans one after the another, some other
transaction can commit in the middle of things. Any commits that
happen after starting the first scan and before starting the second
scan will be reflected in the second scan, but not in the first.
That's what I'm concerned about. Our own session can't have a problem
of this kind, because we'll never commit a subtransaction (or, heaven
forbid, a top-level transaction) while in the middle of a system
catalog scan.

> I am concerned about stuff like violating constraints because we haven't
> yet seen the new constraint definition and the like... Or generating
> wrong plans because we haven't seen that somebody has dropped a
> constraint and inserted data violating the old one.

Yes, we need to carefully audit the commands for dependencies of that type.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2013-07-15 16:50:40
Message-ID: CA+U5nMLL6GatoeZhE48dmMoun7LieavH3-Hnzbin0+bnuy0Wrw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15 July 2013 15:06, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Generally, the question on the table is: to what extent do MVCC
> catalog scans make the world safe for concurrent DDL, or put
> negatively, what hazards remain?

On Tom's test I've been unable to find a single problem.

> Noah discovered an interesting one recently: apparently, the relcache
> machinery has some logic in it that depends on the use of
> AccessExclusiveLock in subtle ways. I'm going to attempt to explain
> it, and maybe he can jump in and explain it better. Essentially, the
> problem is that when a relcache reload occurs, certain data structures
> (like the tuple descriptor, but there are others) are compared against
> the old version of the same data structure. If there are no changes,
> we do nothing; else, we free the old one and install the new one. The
> reason why we don't free the old one and install the new one
> unconditionally is because other parts of the backend might have
> pointers to the old data structure, so just replacing it all the time
> would result in crashes. Since DDL requires AccessExclusiveLock +
> CheckTableNotInUse(), any actual change to the data structure will
> happen when we haven't got any pointers anyway.
>
> A second thing I'm concerned about is that, although MVCC catalog
> scans guarantee that we won't miss a concurrently-updated row
> entirely, or see a duplicate, they don't guarantee that the rows we
> see during a scan of catalog A will be consistent with the rows we see
> during a scan of catalog B moments later. For example, hilarity will
> ensue if relnatts doesn't match what we see in pg_attribute. Now I
> don't think this particular example matters very much because I think
> there are probably lots of other things that would also break if we
> try to add a column without a full table lock, so we're probably
> doomed there anyway. But there might be other instances of this
> problem that are more subtle.

If you look at this as a generalised problem you probably can find
some issues, but that doesn't mean they apply in the specific cases
we're addressing.

The lock reductions we are discussing in all cases have nothing at all
to do with structure and only relate to various options. Except in the
case of constraints, though even there I see no issues as yet.

I've no problem waiting awhile to apply while others try to find loopholes.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Noah Misch <noah(at)leadboat(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2013-08-01 00:53:51
Message-ID: 20130801005351.GA325106@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jul 15, 2013 at 05:50:40PM +0100, Simon Riggs wrote:
> On 15 July 2013 15:06, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > Generally, the question on the table is: to what extent do MVCC
> > catalog scans make the world safe for concurrent DDL, or put
> > negatively, what hazards remain?
>
> On Tom's test I've been unable to find a single problem.
>
> > Noah discovered an interesting one recently: apparently, the relcache
> > machinery has some logic in it that depends on the use of
> > AccessExclusiveLock in subtle ways. I'm going to attempt to explain
> > it, and maybe he can jump in and explain it better. Essentially, the
> > problem is that when a relcache reload occurs, certain data structures
> > (like the tuple descriptor, but there are others) are compared against
> > the old version of the same data structure. If there are no changes,
> > we do nothing; else, we free the old one and install the new one. The
> > reason why we don't free the old one and install the new one
> > unconditionally is because other parts of the backend might have
> > pointers to the old data structure, so just replacing it all the time
> > would result in crashes. Since DDL requires AccessExclusiveLock +
> > CheckTableNotInUse(), any actual change to the data structure will
> > happen when we haven't got any pointers anyway.

> If you look at this as a generalised problem you probably can find
> some issues, but that doesn't mean they apply in the specific cases
> we're addressing.
>
> The lock reductions we are discussing in all cases have nothing at all
> to do with structure and only relate to various options. Except in the
> case of constraints, though even there I see no issues as yet.

I was able to distill the above hypothesis into an actual crash with
reduce_lock_levels.v13.patch. Test recipe:

1. Build with --enable-cassert and with -DCATCACHE_FORCE_RELEASE=1. An
AcceptInvalidationMessages() will then happen at nearly every syscache lookup,
making it far easier to hit a problem of this sort.

2. Run these commands as setup:
create table root (c int);
create table part (check (c > 0), check (c > 0)) inherits (root);

3. Attach a debugger to your session and set a breakpoint at plancat.c:660 (as
of commit 16f38f72ab2b8a3b2d45ba727d213bb31111cea4).

4. Run this in your session; the breakpoint will trip:
select * from root where c = -1;

5. Start another session and run:
alter table part add check (c > 0);

6. Exit the debugger to release the first session. It will crash.

plancache.c:657 stashes a pointer to memory belonging to the rd_att of a
relcache entry. It then proceeds to call eval_const_expressions(), which
performs a syscache lookup in its simplify_function() subroutine. Under
CATCACHE_FORCE_RELEASE, the syscache lookup reliably prompts an
AcceptInvalidationMessages(). The ensuing RelationClearRelation() against
"part" notices the new constraint, decides keep_tupdesc = false, and frees the
existing tupdesc. plancache.c is now left holding a pointer into freed
memory. The next loop iteration will crash when it dereferences a pointer
stored within that freed memory at plancat.c:671.

A remediation strategy that seemed attractive when I last contemplated this
problem is to repoint rd_att immediately but arrange to free the obsolete
TupleDesc in AtEOXact_RelationCache().

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2013-08-01 06:04:14
Message-ID: CA+U5nMJZ71r6-3ndq-xyeyy2ekP70ZWLP+-VMTOGtuXTjtndDg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1 August 2013 01:53, Noah Misch <noah(at)leadboat(dot)com> wrote:

> I was able to distill the above hypothesis into an actual crash with
> reduce_lock_levels.v13.patch. Test recipe:

Thank you for the report; thank you again for reporting in sufficient
time to allow me to investigate and fix by the next CF.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2013-11-13 11:48:15
Message-ID: CA+U5nMJxGa3kJbCwsNBr9oPmUkkVR7zU5_TNjigbNCwyjMdxQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1 August 2013 01:53, Noah Misch <noah(at)leadboat(dot)com> wrote:

> A remediation strategy that seemed attractive when I last contemplated this
> problem is to repoint rd_att immediately but arrange to free the obsolete
> TupleDesc in AtEOXact_RelationCache().

I agree that the best way to resolve this is to retain a copy of the
TupleDesc, so that copied pointers to it remain valid.

EOXact is actually longer than strictly necessary in some cases, but
trying to work out a more minimal approach seems hard and possibly
inefficient.

Comments in relcache.c indicate that the Relation swapping concept
might be replaced by refcounting approach. I can't see how that
differs from your suggested route.

Which means I can't see any other way of doing this other than the way
you suggest. Will implement.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-14 17:36:56
Message-ID: CA+U5nMJEt0Gs4L=ZmtQvP38fUgMCi9QQP0n0LAQKcoXa4BH0XA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 7 July 2013 14:24, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 3 January 2012 18:42, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I wrote:
>>>> Another point that requires some thought is that switching SnapshotNow
>>>> to be MVCC-based will presumably result in a noticeable increase in each
>>>> backend's rate of wanting to acquire snapshots.
>>
>> BTW, I wonder if this couldn't be ameliorated by establishing some
>> ground rules about how up-to-date a snapshot really needs to be.
>> Arguably, it should be okay for successive SnapshotNow scans to use the
>> same snapshot as long as we have not acquired a new lock in between.
>> If not, reusing an old snap doesn't introduce any race condition that
>> wasn't there already.
>
> Now that has been implemented using the above design, we can resubmit
> the lock level reduction patch, with thanks to Robert.
>
> Submitted patch passes original complaint/benchmark.
>
> Changes
> * various forms of ALTER TABLE, notably ADD constraint and VALIDATE
> * CREATE TRIGGER
>
> One minor coirrections to earlier thinking with respect to toast
> tables. That might be later relaxed.
>
> Full tests including proof of lock level reductions, plus docs.

Rebased to v14

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
reduce_lock_levels.v14.patch application/octet-stream 21.4 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-15 14:57:53
Message-ID: CA+U5nMJ_noDR-YZMVuXb5NaXf=GeYyMUkhDj+dGhOA1E7XvPjQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1 August 2013 01:53, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Mon, Jul 15, 2013 at 05:50:40PM +0100, Simon Riggs wrote:
>> On 15 July 2013 15:06, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> > Generally, the question on the table is: to what extent do MVCC
>> > catalog scans make the world safe for concurrent DDL, or put
>> > negatively, what hazards remain?
>>
>> On Tom's test I've been unable to find a single problem.
>>
>> > Noah discovered an interesting one recently: apparently, the relcache
>> > machinery has some logic in it that depends on the use of
>> > AccessExclusiveLock in subtle ways. I'm going to attempt to explain
>> > it, and maybe he can jump in and explain it better. Essentially, the
>> > problem is that when a relcache reload occurs, certain data structures
>> > (like the tuple descriptor, but there are others) are compared against
>> > the old version of the same data structure. If there are no changes,
>> > we do nothing; else, we free the old one and install the new one. The
>> > reason why we don't free the old one and install the new one
>> > unconditionally is because other parts of the backend might have
>> > pointers to the old data structure, so just replacing it all the time
>> > would result in crashes. Since DDL requires AccessExclusiveLock +
>> > CheckTableNotInUse(), any actual change to the data structure will
>> > happen when we haven't got any pointers anyway.
>
>> If you look at this as a generalised problem you probably can find
>> some issues, but that doesn't mean they apply in the specific cases
>> we're addressing.
>>
>> The lock reductions we are discussing in all cases have nothing at all
>> to do with structure and only relate to various options. Except in the
>> case of constraints, though even there I see no issues as yet.
>
> I was able to distill the above hypothesis into an actual crash with
> reduce_lock_levels.v13.patch. Test recipe:
>
> 1. Build with --enable-cassert and with -DCATCACHE_FORCE_RELEASE=1. An
> AcceptInvalidationMessages() will then happen at nearly every syscache lookup,
> making it far easier to hit a problem of this sort.
>
> 2. Run these commands as setup:
> create table root (c int);
> create table part (check (c > 0), check (c > 0)) inherits (root);
>
> 3. Attach a debugger to your session and set a breakpoint at plancat.c:660 (as
> of commit 16f38f72ab2b8a3b2d45ba727d213bb31111cea4).
>
> 4. Run this in your session; the breakpoint will trip:
> select * from root where c = -1;
>
> 5. Start another session and run:
> alter table part add check (c > 0);
>
> 6. Exit the debugger to release the first session. It will crash.
>
> plancache.c:657 stashes a pointer to memory belonging to the rd_att of a
> relcache entry. It then proceeds to call eval_const_expressions(), which
> performs a syscache lookup in its simplify_function() subroutine. Under
> CATCACHE_FORCE_RELEASE, the syscache lookup reliably prompts an
> AcceptInvalidationMessages(). The ensuing RelationClearRelation() against
> "part" notices the new constraint, decides keep_tupdesc = false, and frees the
> existing tupdesc. plancache.c is now left holding a pointer into freed
> memory. The next loop iteration will crash when it dereferences a pointer
> stored within that freed memory at plancat.c:671.
>
>
> A remediation strategy that seemed attractive when I last contemplated this
> problem is to repoint rd_att immediately but arrange to free the obsolete
> TupleDesc in AtEOXact_RelationCache().

v15 to fix the above problem.

Looking at other potential problems around plancache but nothing found as yet.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
reduce_lock_levels.v15.patch application/octet-stream 25.9 KB

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-24 06:08:33
Message-ID: CAM3SWZR5Gq-6gLA68RHUqiueneNBb2M0artJKwERH0ZWg+qLkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 15, 2014 at 6:57 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> v15 to fix the above problem.

Minor quibble: I get a compiler warning with the patch applied.
"relcache.c: In function ‘RememberToFreeTupleDescAtEOX’:
relcache.c:2317:3: warning: ISO C90 forbids mixed declarations and
code [-Werror=declaration-after-statement]".

--
Peter Geoghegan


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-24 08:33:32
Message-ID: CA+U5nMKU1WhHUBSwNDK1suC+jnXXSEqN_REanjPhpVvBKDViUg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24 January 2014 07:08, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Wed, Jan 15, 2014 at 6:57 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> v15 to fix the above problem.
>
> Minor quibble: I get a compiler warning with the patch applied.
> "relcache.c: In function 'RememberToFreeTupleDescAtEOX':
> relcache.c:2317:3: warning: ISO C90 forbids mixed declarations and
> code [-Werror=declaration-after-statement]".

Thanks, that was a wart, now fixed.

v16 attached

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
reduce_lock_levels.v16.patch application/octet-stream 25.9 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-27 17:58:02
Message-ID: CA+U5nMJhrdSiop+kHX3WmJdoy6qaKfd=iMB3GUPe54QnJS3-JQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24 January 2014 08:33, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 24 January 2014 07:08, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>> On Wed, Jan 15, 2014 at 6:57 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> v15 to fix the above problem.
>>
> v16 attached

v17 attached

This version adds a GUC called ddl_exclusive_locks which allows us to
keep the 9.3 behaviour if we wish it. Some people may be surprised
that their programs don't wait in the same places they used to. We
hope that is a positive and useful behaviour, but it may not always be
so.

I'll commit this on Thurs 30 Jan unless I hear objections.

Thanks

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-27 17:58:42
Message-ID: CA+U5nMKmWtAyqQS7gxzL9MjwLNXXWjLmFe0jEbgB0FX9F4p9GA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27 January 2014 17:58, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 24 January 2014 08:33, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On 24 January 2014 07:08, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>>> On Wed, Jan 15, 2014 at 6:57 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>>> v15 to fix the above problem.
>>>
>> v16 attached
>
> v17 attached

Frostbite...

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
reduce_lock_levels.v17.patch application/octet-stream 28.8 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-27 20:25:22
Message-ID: CA+TgmoY_3JXfh7_x7u+DkhE4Hb459XdsUbrMsFdS7YAbZDuY5A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 27, 2014 at 12:58 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 24 January 2014 08:33, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On 24 January 2014 07:08, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>>> On Wed, Jan 15, 2014 at 6:57 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>>> v15 to fix the above problem.
>>>
>> v16 attached
>
> v17 attached
>
> This version adds a GUC called ddl_exclusive_locks which allows us to
> keep the 9.3 behaviour if we wish it. Some people may be surprised
> that their programs don't wait in the same places they used to. We
> hope that is a positive and useful behaviour, but it may not always be
> so.
>
> I'll commit this on Thurs 30 Jan unless I hear objections.

I haven't reviewed the patch, but -1 for adding a GUC.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-27 20:35:58
Message-ID: CAM3SWZS7ziMoNnSZxLQk=RR955WYUiO7GtG3tKaX5AqYwHk3rQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 27, 2014 at 12:25 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I haven't reviewed the patch, but -1 for adding a GUC.

I'm pretty surprised that it's been suggested that some people might
prefer AccessExclusiveLocks. Why would anyone prefer that?

--
Peter Geoghegan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-27 20:47:00
Message-ID: 21924.1390855620@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Geoghegan <pg(at)heroku(dot)com> writes:
> On Mon, Jan 27, 2014 at 12:25 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I haven't reviewed the patch, but -1 for adding a GUC.

> I'm pretty surprised that it's been suggested that some people might
> prefer AccessExclusiveLocks. Why would anyone prefer that?

For one thing, so they can back this out if it proves to be broken,
as the last committed version was. Given that this patch was marked
(by its author) as Ready for Committer without any review in the current
CF, I can't say that I have any faith in it.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-27 20:57:26
Message-ID: CA+U5nMK0wac4kNDWNvB0et4OR38r+yhbO1ceCiEKyDUpa8iUWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27 January 2014 20:35, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Mon, Jan 27, 2014 at 12:25 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I haven't reviewed the patch, but -1 for adding a GUC.
>
> I'm pretty surprised that it's been suggested that some people might
> prefer AccessExclusiveLocks. Why would anyone prefer that?

Nobody has said "prefer". I said...

> Some people may be surprised
> that their programs don't wait in the same places they used to. We
> hope that is a positive and useful behaviour, but it may not always be
> so.

We get the new behaviour by default and I expect we'll be very happy with it.

A second thought is that if we have problems of some kind in the field
as a result of the new lock modes then we will be able to turn them
off. I'm happy to fix any problems that occur, but that doesn't mean
there won't be any. If everybody is confident that we've foreseen
every bug, then no problem, lets remove it. I recall being asked to
add hot_standby = on | off for similar reasons.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-27 21:08:49
Message-ID: CA+U5nMK21BWW=a-ooZCYD6oVhuaMBoSGeuahd6HvSaDVxsZDRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27 January 2014 20:47, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Peter Geoghegan <pg(at)heroku(dot)com> writes:
>> On Mon, Jan 27, 2014 at 12:25 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> I haven't reviewed the patch, but -1 for adding a GUC.
>
>> I'm pretty surprised that it's been suggested that some people might
>> prefer AccessExclusiveLocks. Why would anyone prefer that?
>
> For one thing, so they can back this out if it proves to be broken,
> as the last committed version was.

Agreed

> Given that this patch was marked
> (by its author) as Ready for Committer without any review in the current
> CF

True. The main review happened in a previous commitfest and there was
a small addition for this CF.

It was my understanding that you wanted us to indicate that to allow
you to review. I am happy to set status differently, as you wish,
presumably back to needs review.

>I can't say that I have any faith in it.

That's a shame.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-28 14:55:15
Message-ID: 20140128145515.GC20898@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 27, 2014 at 08:57:26PM +0000, Simon Riggs wrote:
> We get the new behaviour by default and I expect we'll be very happy with it.
>
> A second thought is that if we have problems of some kind in the field
> as a result of the new lock modes then we will be able to turn them
> off. I'm happy to fix any problems that occur, but that doesn't mean
> there won't be any. If everybody is confident that we've foreseen
> every bug, then no problem, lets remove it. I recall being asked to
> add hot_standby = on | off for similar reasons.

Addressing a larger issue, I have no problem with systematically adding
GUCs to turn off features we add in each major release if we can also
systematically remove those GUCs in the next major release.

This would require putting all these settings in the compatibility
section of postgresql.conf.

However, I don't think it makes sense to do this in a one-off manner.
It is also possible that there are enough cases where we _can't_ turn
the feature off with a GUC that this would be unworkable.

So, if we can't do it systematically, that means we will have enough
breakage cases that we just need to rush out new versions to fix major
breakage and one-off GUCs just don't buy us much, and add confusion.

Does that make sense?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-28 16:36:39
Message-ID: CA+U5nMKp3D5D025qfrEv=BAE7RXuuSwiq_PojYVgXJHaz-1X=w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28 January 2014 14:55, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> On Mon, Jan 27, 2014 at 08:57:26PM +0000, Simon Riggs wrote:
>> We get the new behaviour by default and I expect we'll be very happy with it.
>>
>> A second thought is that if we have problems of some kind in the field
>> as a result of the new lock modes then we will be able to turn them
>> off. I'm happy to fix any problems that occur, but that doesn't mean
>> there won't be any. If everybody is confident that we've foreseen
>> every bug, then no problem, lets remove it. I recall being asked to
>> add hot_standby = on | off for similar reasons.
>
> Addressing a larger issue, I have no problem with systematically adding
> GUCs to turn off features we add in each major release if we can also
> systematically remove those GUCs in the next major release.

Agreed. I propose 2 releases since the time between release of 9.4 and
the feature freeze of 9.5 is only 4 months, not usually enough for
subtle bugs to be discovered.

> This would require putting all these settings in the compatibility
> section of postgresql.conf.

Agreed, that is where I have added the parameter.

> However, I don't think it makes sense to do this in a one-off manner.
> It is also possible that there are enough cases where we _can't_ turn
> the feature off with a GUC that this would be unworkable.
>
> So, if we can't do it systematically, that means we will have enough
> breakage cases that we just need to rush out new versions to fix major
> breakage and one-off GUCs just don't buy us much, and add confusion.
>
> Does that make sense?

For me, reducing the strength of DDL locking is a major change in
RDBMS behaviour that could both delight and surprise our users. Maybe
a few actually depend upon the locking behaviour, maybe. After some
years of various people looking at this, I think we've got it right.
Experience tells me that while I think this is the outcome, we are
well advised to protect against the possibility that it is not correct
and that if we have corner case issues, it would be good to easily
disable this in the field. In the current case, a simple parameter
works very well to disable the feature; in other cases, not.

Summary: This is an atypical case. I do not normally propose such
things - this is the third time in 10 years, IIRC.

I have no problem removing the parameter if required to. In that case,
I would like to leave the parameter in until mid beta, to allow
greater certainty. In any case, I would wish to retain as a minimum an
extern bool variable allowing it to be turned off by C function if
desired.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-28 17:15:28
Message-ID: 20140128171528.GJ20898@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 28, 2014 at 04:36:39PM +0000, Simon Riggs wrote:
> For me, reducing the strength of DDL locking is a major change in
> RDBMS behaviour that could both delight and surprise our users. Maybe
> a few actually depend upon the locking behaviour, maybe. After some
> years of various people looking at this, I think we've got it right.
> Experience tells me that while I think this is the outcome, we are
> well advised to protect against the possibility that it is not correct
> and that if we have corner case issues, it would be good to easily
> disable this in the field. In the current case, a simple parameter
> works very well to disable the feature; in other cases, not.
>
> Summary: This is an atypical case. I do not normally propose such
> things - this is the third time in 10 years, IIRC.

Uh, in my memory, you are the person who is most likely to suggest a
GUC of all our developers.

> I have no problem removing the parameter if required to. In that case,
> I would like to leave the parameter in until mid beta, to allow
> greater certainty. In any case, I would wish to retain as a minimum an
> extern bool variable allowing it to be turned off by C function if
> desired.

Anything changed to postgresql.conf during beta is going to require an
initdb.

Also, lots of backward-compatibility infrastructure, as you are
suggesting above, add complexity to the system.

I am basically against a GUC on this. We have far larger problems with
9.3 than backward compatibility, and limited resources.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-28 17:21:50
Message-ID: 52E7E72E.3050601@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/28/2014 07:15 PM, Bruce Momjian wrote:
> On Tue, Jan 28, 2014 at 04:36:39PM +0000, Simon Riggs wrote:
>> For me, reducing the strength of DDL locking is a major change in
>> RDBMS behaviour that could both delight and surprise our users. Maybe
>> a few actually depend upon the locking behaviour, maybe. After some
>> years of various people looking at this, I think we've got it right.
>> Experience tells me that while I think this is the outcome, we are
>> well advised to protect against the possibility that it is not correct
>> and that if we have corner case issues, it would be good to easily
>> disable this in the field. In the current case, a simple parameter
>> works very well to disable the feature; in other cases, not.

I don't understand why anyone would want to turn this feature off, ie.
require stronger locks than necessary for a DDL change.

If we're not confident that the patch is correct, then it should not be
applied. If we are confident enough to commit it, and a bug crops up
later, we'll fix the bug as usual.

A user savvy enough to fiddle with such a GUC can also do their DDL with:

BEGIN;
LOCK TABLE <table>
<DDL>
COMMIT;

>> I have no problem removing the parameter if required to. In that case,
>> I would like to leave the parameter in until mid beta, to allow
>> greater certainty. In any case, I would wish to retain as a minimum an
>> extern bool variable allowing it to be turned off by C function if
>> desired.
>
> Anything changed to postgresql.conf during beta is going to require an
> initdb.

Huh? Surely not.

- Heikki


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-28 17:26:56
Message-ID: 20140128172656.GK20898@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 28, 2014 at 07:21:50PM +0200, Heikki Linnakangas wrote:
> >>I have no problem removing the parameter if required to. In that case,
> >>I would like to leave the parameter in until mid beta, to allow
> >>greater certainty. In any case, I would wish to retain as a minimum an
> >>extern bool variable allowing it to be turned off by C function if
> >>desired.
> >
> >Anything changed to postgresql.conf during beta is going to require an
> >initdb.
>
> Huh? Surely not.

Uh, if we ship beta1 with a GUC in postgresql.conf, and then we remove
support for the GUC in beta2, anyone starting a server initdb-ed with
beta1 is going to get an error and the server is not going to start:

LOG: unrecognized configuration parameter "xxx" in file "/u/pgsql/data/postgresql.conf" line 1
FATAL: configuration file "/u/pgsql/data/postgresql.conf" contains errors

so, yeah, it isn't going to require an initdb, but it is going to
require everyone to edit their postgresql.conf. My guess is a lot of
people are going to assume the old postgresql.conf is not compatible and
are going to initdb and reload.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-28 17:30:47
Message-ID: 52E7E947.7070102@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/28/2014 07:26 PM, Bruce Momjian wrote:
> On Tue, Jan 28, 2014 at 07:21:50PM +0200, Heikki Linnakangas wrote:
>>>> I have no problem removing the parameter if required to. In that case,
>>>> I would like to leave the parameter in until mid beta, to allow
>>>> greater certainty. In any case, I would wish to retain as a minimum an
>>>> extern bool variable allowing it to be turned off by C function if
>>>> desired.
>>>
>>> Anything changed to postgresql.conf during beta is going to require an
>>> initdb.
>>
>> Huh? Surely not.
>
> Uh, if we ship beta1 with a GUC in postgresql.conf, and then we remove
> support for the GUC in beta2, anyone starting a server initdb-ed with
> beta1 is going to get an error and the server is not going to start:
>
> LOG: unrecognized configuration parameter "xxx" in file "/u/pgsql/data/postgresql.conf" line 1
> FATAL: configuration file "/u/pgsql/data/postgresql.conf" contains errors
>
> so, yeah, it isn't going to require an initdb, but it is going to
> require everyone to edit their postgresql.conf.

Only if you uncommented the value in the first place.

- Heikki


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-28 17:36:36
Message-ID: 20140128173636.GR10723@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian escribió:

> > I have no problem removing the parameter if required to. In that case,
> > I would like to leave the parameter in until mid beta, to allow
> > greater certainty.

Uhm. If we remove a GUC during beta we don't need to force an initdb.
At worst we will need to keep a no-op GUC variable that is removed in
the next devel cycle. That said, if we're going to have a GUC that's
going to disappear later, I think it's better to wait for 2 releases as
proposed, not remove mid-beta.

> > In any case, I would wish to retain as a minimum an extern bool
> > variable allowing it to be turned off by C function if desired.

I think this amounts to having an undocumented GUC that is hard to
change. I prefer the GUC, myself.

> Anything changed to postgresql.conf during beta is going to require an
> initdb.
> Also, lots of backward-compatibility infrastructure, as you are
> suggesting above, add complexity to the system.
>
> I am basically against a GUC on this. We have far larger problems with
> 9.3 than backward compatibility, and limited resources.

If we have a clear plan on removing the parameter, it's easy enough to
follow through. I don't think lack of resources is a good argument,
because at that point there will be little to discuss and it's an easy
change to make. And I think the plan is clear: if no bug is found the
parameter is removed. If a bug is found, it is fixed and the parameter
is removed anyway.

Honestly, I would prefer that we push a patch that has been thoroughly
reviewed and in which we have more confidence, so that we can push
without a GUC. This means mark in CF as needs-review, then some other
developer reviews it and marks it as ready-for-committer.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-28 17:40:12
Message-ID: 20140128174012.GM20898@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 28, 2014 at 07:30:47PM +0200, Heikki Linnakangas wrote:
> On 01/28/2014 07:26 PM, Bruce Momjian wrote:
> >On Tue, Jan 28, 2014 at 07:21:50PM +0200, Heikki Linnakangas wrote:
> >>>>I have no problem removing the parameter if required to. In that case,
> >>>>I would like to leave the parameter in until mid beta, to allow
> >>>>greater certainty. In any case, I would wish to retain as a minimum an
> >>>>extern bool variable allowing it to be turned off by C function if
> >>>>desired.
> >>>
> >>>Anything changed to postgresql.conf during beta is going to require an
> >>>initdb.
> >>
> >>Huh? Surely not.
> >
> >Uh, if we ship beta1 with a GUC in postgresql.conf, and then we remove
> >support for the GUC in beta2, anyone starting a server initdb-ed with
> >beta1 is going to get an error and the server is not going to start:
> >
> > LOG: unrecognized configuration parameter "xxx" in file "/u/pgsql/data/postgresql.conf" line 1
> > FATAL: configuration file "/u/pgsql/data/postgresql.conf" contains errors
> >
> >so, yeah, it isn't going to require an initdb, but it is going to
> >require everyone to edit their postgresql.conf.
>
> Only if you uncommented the value in the first place.

Oh, I had forgotten that. Right. It would still be odd to have a
commented-out line in postgresql.conf that was not support.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-28 17:42:10
Message-ID: 25773.1390930930@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Honestly, I would prefer that we push a patch that has been thoroughly
> reviewed and in which we have more confidence, so that we can push
> without a GUC. This means mark in CF as needs-review, then some other
> developer reviews it and marks it as ready-for-committer.

FWIW, that was the point I was trying to make as well ;-)

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-28 17:46:10
Message-ID: 20140128174610.GE18333@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-01-27 15:25:22 -0500, Robert Haas wrote:
> On Mon, Jan 27, 2014 at 12:58 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > This version adds a GUC called ddl_exclusive_locks which allows us to
> > keep the 9.3 behaviour if we wish it. Some people may be surprised
> > that their programs don't wait in the same places they used to. We
> > hope that is a positive and useful behaviour, but it may not always be
> > so.
> >
> > I'll commit this on Thurs 30 Jan unless I hear objections.
>
> I haven't reviewed the patch, but -1 for adding a GUC.

Dito. I don't think the patch in a bad shape otherwise. I'd very quickly
looked at a previous version and it did look rather sane, and several
other people had looked at earlier versions as well. IIRC Noah had a
fairly extensive look at some intricate race conditions...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-28 17:47:08
Message-ID: CA+U5nM+Vo9V24q_tM-ssgfSt3WAXr4cD8EdTM8dYkkNQa7sYTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28 January 2014 17:15, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> On Tue, Jan 28, 2014 at 04:36:39PM +0000, Simon Riggs wrote:
>> For me, reducing the strength of DDL locking is a major change in
>> RDBMS behaviour that could both delight and surprise our users. Maybe
>> a few actually depend upon the locking behaviour, maybe. After some
>> years of various people looking at this, I think we've got it right.
>> Experience tells me that while I think this is the outcome, we are
>> well advised to protect against the possibility that it is not correct
>> and that if we have corner case issues, it would be good to easily
>> disable this in the field. In the current case, a simple parameter
>> works very well to disable the feature; in other cases, not.
>>
>> Summary: This is an atypical case. I do not normally propose such
>> things - this is the third time in 10 years, IIRC.
>
> Uh, in my memory, you are the person who is most likely to suggest a
> GUC of all our developers.

(smiles) I have suggested parameters for many features, mostly in the
early days of my developments before I saw the light if autotuning.
But those were control parameters for tuning features. So yes, I have
probably introduced more parameters than most, amongst the many things
I've done. I'm guessing you don't recall how much trouble I went to in
order to allow sync rep to have only 1 parameter, c'est la vie.

What I'm discussing here is a compatibility parameter to allow new
features to be disabled. This would be the third time in 10 years I
suggested a parameter for that reason, i.e. one that the user would
hardly ever wish to set.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-28 17:54:53
Message-ID: CA+U5nM+7JxUKSqnk_aQUv8cOvFJykCqUP_QpCnT8XmjNyr+B6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28 January 2014 17:21, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:

> I don't understand why anyone would want to turn this feature off, ie.
> require stronger locks than necessary for a DDL change.

Nobody would *want* to turn it off. They might need to, as explained.

> If we're not confident that the patch is correct, then it should not be
> applied. If we are confident enough to commit it, and a bug crops up later,
> we'll fix the bug as usual.

I'd like to point out here that my own customers are already well
covered by the support services we offer. They will receive any such
fix very quickly.

My proposal was of assistance only to those without such contracts in
place, as are many of my proposals.

It doesn't bother me at all if you insist it should not be added. Just
choose v16 of the patch for review rather than v17.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-29 05:04:51
Message-ID: CA+TgmoZjc=gcS_1uPrGU8h2TBj3qTKsJNmwz3pxbmSYLw779Yw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 28, 2014 at 12:36 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Bruce Momjian escribió:
>> > I have no problem removing the parameter if required to. In that case,
>> > I would like to leave the parameter in until mid beta, to allow
>> > greater certainty.
>
> Uhm. If we remove a GUC during beta we don't need to force an initdb.
> At worst we will need to keep a no-op GUC variable that is removed in
> the next devel cycle. That said, if we're going to have a GUC that's
> going to disappear later, I think it's better to wait for 2 releases as
> proposed, not remove mid-beta.
>
>> > In any case, I would wish to retain as a minimum an extern bool
>> > variable allowing it to be turned off by C function if desired.
>
> I think this amounts to having an undocumented GUC that is hard to
> change. I prefer the GUC, myself.
>
>> Anything changed to postgresql.conf during beta is going to require an
>> initdb.
>> Also, lots of backward-compatibility infrastructure, as you are
>> suggesting above, add complexity to the system.
>>
>> I am basically against a GUC on this. We have far larger problems with
>> 9.3 than backward compatibility, and limited resources.
>
> If we have a clear plan on removing the parameter, it's easy enough to
> follow through. I don't think lack of resources is a good argument,
> because at that point there will be little to discuss and it's an easy
> change to make. And I think the plan is clear: if no bug is found the
> parameter is removed. If a bug is found, it is fixed and the parameter
> is removed anyway.
>
> Honestly, I would prefer that we push a patch that has been thoroughly
> reviewed and in which we have more confidence, so that we can push
> without a GUC. This means mark in CF as needs-review, then some other
> developer reviews it and marks it as ready-for-committer.

I also believe that would be the correct procedure. Personally, I
think it would be great if Noah can review this, as he's very good at
finding the kind of problems that are likely to crop up here, and has
examined previous versions. I also have some interest in looking at
it myself. But I doubt that either of us (or any other senior hacker)
can do that by Thursday. I think all such people are hip-deep in
other patches at the moment; I certainly am.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-29 05:43:40
Message-ID: 16401.1390974220@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Jan 28, 2014 at 12:36 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
>> Honestly, I would prefer that we push a patch that has been thoroughly
>> reviewed and in which we have more confidence, so that we can push
>> without a GUC. This means mark in CF as needs-review, then some other
>> developer reviews it and marks it as ready-for-committer.

> I also believe that would be the correct procedure. Personally, I
> think it would be great if Noah can review this, as he's very good at
> finding the kind of problems that are likely to crop up here, and has
> examined previous versions. I also have some interest in looking at
> it myself. But I doubt that either of us (or any other senior hacker)
> can do that by Thursday. I think all such people are hip-deep in
> other patches at the moment; I certainly am.

Yeah. I actually have little doubt that the patch is sane on its own
terms of discussion, that is that Simon has chosen locking levels that
are mutually consistent in an abstract sense. What sank the previous
iteration was that he'd failed to consider an implementation detail,
namely possible inconsistencies in SnapshotNow-based catalog fetches.
I'm afraid that there may be more such problems lurking under the
surface. Noah's pretty good at finding such things, but really two
or three of us need to sit down and think about it for awhile before
I'd have much confidence in such a fundamental change. And I sure don't
have time for this patch right now myself.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-01-29 19:54:26
Message-ID: CA+U5nMKeg=d-W8jb5ZCWYbJBTODuanrKDSGpL53xf6ddpKLdwQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 29 January 2014 05:43, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Tue, Jan 28, 2014 at 12:36 PM, Alvaro Herrera
>> <alvherre(at)2ndquadrant(dot)com> wrote:
>>> Honestly, I would prefer that we push a patch that has been thoroughly
>>> reviewed and in which we have more confidence, so that we can push
>>> without a GUC. This means mark in CF as needs-review, then some other
>>> developer reviews it and marks it as ready-for-committer.
>
>> I also believe that would be the correct procedure. Personally, I
>> think it would be great if Noah can review this, as he's very good at
>> finding the kind of problems that are likely to crop up here, and has
>> examined previous versions. I also have some interest in looking at
>> it myself. But I doubt that either of us (or any other senior hacker)
>> can do that by Thursday. I think all such people are hip-deep in
>> other patches at the moment; I certainly am.
>
> Yeah. I actually have little doubt that the patch is sane on its own
> terms of discussion, that is that Simon has chosen locking levels that
> are mutually consistent in an abstract sense. What sank the previous
> iteration was that he'd failed to consider an implementation detail,
> namely possible inconsistencies in SnapshotNow-based catalog fetches.
> I'm afraid that there may be more such problems lurking under the
> surface.

Agreed

> Noah's pretty good at finding such things, but really two
> or three of us need to sit down and think about it for awhile before
> I'd have much confidence in such a fundamental change. And I sure don't
> have time for this patch right now myself.

I've reviewed Noah's earlier comments, fixed things and also further
reviewed for any similar relcache related issues.

I've also reviewed Hot Standby to see if any locking issues arise,
since the ALTER TABLE now won't generate an AccessExclusiveLock as it
used to do for certain operations. I can't see any problems there.

While doing those reviews, I'd remind everybody that this only affects
roughly half of ALTER TABLE subcommands and definitely nothing that
affects SELECTs. So the risk profile is much less than it sounds at
first glance.

If anybody else has any hints or clues about where to look, please
mention them and I will investigate. Thanks.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-02-24 16:01:26
Message-ID: 20140224160126.GE6718@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I took a quick peek at this, and noticed the following things:
* I am pretty sure this patch doesn't compile anymore after the latest
set of releases.
* This definitely should include isolationtester tests actually
performing concurrent ALTER TABLEs. All that's currently there is
tests that the locklevel isn't too high, but not that it actually works.
* So far no DDL uses ShareRowExclusiveLocks, why do so now? Not sure if
there aren't relevant cases for with foreign key checks (which afair
*do* acquire SRE locks).
* Why is AddConstraint "so complicated" when it's all used SRE locks?
* Are you sure AlterConstraint is generally safe without an AEL? It
should be safe to change whether something is deferred, but not
necessarily whether it's deferrable?
* Why does ChangeOwner need AEL?
* You changed several places to take out lower locks, which in itself is
fine, but doesn't that lead to lock upgrade hazard if a later step
acquires a stronger lock? Or do we take out a strong enough lock from
the beginning.
* There's no explanation why the EOXact TupleDesc stuff is needed?

That's it for now,

Andres

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-02-26 07:32:45
Message-ID: CA+U5nMJoR54C2sqUZ__bfkFX06wVjiVaXo5pQo8OnVhvtH5EGg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24 February 2014 16:01, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Hi,
>
> I took a quick peek at this, and noticed the following things:
> * I am pretty sure this patch doesn't compile anymore after the latest
> set of releases.

Refreshed to v18, will post shortly.

> * This definitely should include isolationtester tests actually
> performing concurrent ALTER TABLEs. All that's currently there is
> tests that the locklevel isn't too high, but not that it actually works.

There is no concurrent behaviour here, hence no code that would be
exercised by concurrent tests.

Lock levels are proven in regression tests, so no further tests needed.

> * So far no DDL uses ShareRowExclusiveLocks, why do so now? Not sure if
> there aren't relevant cases for with foreign key checks (which afair
> *do* acquire SRE locks).

That was discussed long ago. We can relax it more in the future, if
that is considered safe.

> * Why is AddConstraint "so complicated" when it's all used SRE locks?

To ensure each case was considered, rather than just assume all cases
are the same.

> * Are you sure AlterConstraint is generally safe without an AEL? It
> should be safe to change whether something is deferred, but not
> necessarily whether it's deferrable?

We change the lock levels for individual commands. This patch provides
some initial settings and infrastructure.

It is possible you are correct that changing the deferrability is not
safe without an AEL. That was enabled for the first time in this
release in a patch added by me after this patch was written. I will
think on that and change, if required.

> * Why does ChangeOwner need AEL?

Ownership affects privileges, which includes SELECTs, hence AEL.

This is not a critical aspect of the patch.

> * You changed several places to take out lower locks, which in itself is
> fine, but doesn't that lead to lock upgrade hazard if a later step
> acquires a stronger lock? Or do we take out a strong enough lock from
> the beginning.

We asess the lock needed at parse time, then use it consistently
later. There is no lock upgrade hazard since that has been
specifically considered and removed.

> * There's no explanation why the EOXact TupleDesc stuff is needed?

I will update comments.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-02-26 13:38:43
Message-ID: 20140226133843.GY6718@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2014-02-26 07:32:45 +0000, Simon Riggs wrote:
> > * This definitely should include isolationtester tests actually
> > performing concurrent ALTER TABLEs. All that's currently there is
> > tests that the locklevel isn't too high, but not that it actually works.
>
> There is no concurrent behaviour here, hence no code that would be
> exercised by concurrent tests.

Huh? There's most definitely new concurrent behaviour. Previously no
other backends could have a relation open (and locked) while it got
altered (which then sends out relcache invalidations). That's something
that should be tested.

> > * Why does ChangeOwner need AEL?
>
> Ownership affects privileges, which includes SELECTs, hence AEL.

So?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-02-26 15:15:00
Message-ID: CA+U5nMK8dtDqAtHzoQvY=MEB3mqp4jLS0za=bhf35JWUzox0Lg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 February 2014 13:38, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Hi,
>
> On 2014-02-26 07:32:45 +0000, Simon Riggs wrote:
>> > * This definitely should include isolationtester tests actually
>> > performing concurrent ALTER TABLEs. All that's currently there is
>> > tests that the locklevel isn't too high, but not that it actually works.
>>
>> There is no concurrent behaviour here, hence no code that would be
>> exercised by concurrent tests.
>
> Huh? There's most definitely new concurrent behaviour. Previously no
> other backends could have a relation open (and locked) while it got
> altered (which then sends out relcache invalidations). That's something
> that should be tested.

It has been. High volume concurrent testing has been performed, per
Tom's original discussion upthread, but that's not part of the test
suite.
For other tests I have no guide as to how to write a set of automated
regression tests. Anything could cause a failure, so I'd need to write
an infinite set of tests to prove there is no bug *somewhere*. How
many tests are required? 0, 1, 3, 30?

>> > * Why does ChangeOwner need AEL?
>>
>> Ownership affects privileges, which includes SELECTs, hence AEL.
>
> So?

That reply could be added to any post. Please explain your concern.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-02-26 15:25:40
Message-ID: 20140226152540.GZ6718@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-26 15:15:00 +0000, Simon Riggs wrote:
> On 26 February 2014 13:38, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > Hi,
> >
> > On 2014-02-26 07:32:45 +0000, Simon Riggs wrote:
> >> > * This definitely should include isolationtester tests actually
> >> > performing concurrent ALTER TABLEs. All that's currently there is
> >> > tests that the locklevel isn't too high, but not that it actually works.
> >>
> >> There is no concurrent behaviour here, hence no code that would be
> >> exercised by concurrent tests.
> >
> > Huh? There's most definitely new concurrent behaviour. Previously no
> > other backends could have a relation open (and locked) while it got
> > altered (which then sends out relcache invalidations). That's something
> > that should be tested.
>
> It has been. High volume concurrent testing has been performed, per
> Tom's original discussion upthread, but that's not part of the test
> suite.

Yea, that's not what I am looking for.

> For other tests I have no guide as to how to write a set of automated
> regression tests. Anything could cause a failure, so I'd need to write
> an infinite set of tests to prove there is no bug *somewhere*. How
> many tests are required? 0, 1, 3, 30?

I think some isolationtester tests for the most important changes in
lock levels are appropriate. Say, create a PRIMARY KEY, DROP INHERIT,
... while a query is in progress in a nother session.

> >> > * Why does ChangeOwner need AEL?
> >>
> >> Ownership affects privileges, which includes SELECTs, hence AEL.
> >
> > So?
>
> That reply could be added to any post. Please explain your concern.

I don't understand why that means it needs an AEL. After all,
e.g. changes in table inheritance do *not* require an AEL. I think it's
perfectly ok to not go for the minimally required locklevel for all
subcommands, but then it should be commented as such and not with
"change visible to SELECT" where other operations that do so as well
require lower locklevels.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-02-27 08:06:32
Message-ID: CA+U5nM+cZ1HPAOL1C5GVK3dyJVdhBvJDiJ9E+axBDyJv0V8bvg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 February 2014 07:32, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

>> * Are you sure AlterConstraint is generally safe without an AEL? It
>> should be safe to change whether something is deferred, but not
>> necessarily whether it's deferrable?
>
> We change the lock levels for individual commands. This patch provides
> some initial settings and infrastructure.
>
> It is possible you are correct that changing the deferrability is not
> safe without an AEL. That was enabled for the first time in this
> release in a patch added by me after this patch was written. I will
> think on that and change, if required.

Thoughts...

It would be a problem to change a DEFERRABLE constraint into a NOT
DEFERRABLE constraint concurrently with a transaction that was
currently deferring its constraint checks. It would not be a problem
to go in the other direction, relaxing the constraint attributes.

The patch uses ShareRowExclusive for AlterConstraint, so no writes are
possible concurrently with the table being ALTERed, hence the problem
situation cannot arise.

So in my understanding, no issue is possible.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-02-27 08:12:47
Message-ID: CA+U5nMLTUdSQhNDeR_egYo1G4=k7kC87_A0bF_eeCsxoFggazQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 February 2014 15:25, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:

>> >> > * Why does ChangeOwner need AEL?
>> >>
>> >> Ownership affects privileges, which includes SELECTs, hence AEL.
>> >
>> > So?
>>
>> That reply could be added to any post. Please explain your concern.
>
> I don't understand why that means it needs an AEL. After all,
> e.g. changes in table inheritance do *not* require an AEL. I think it's
> perfectly ok to not go for the minimally required locklevel for all
> subcommands, but then it should be commented as such and not with
> "change visible to SELECT" where other operations that do so as well
> require lower locklevels.

Those are two separate cases, with separate lock levels, so that
argument doesn't hold.

My understanding of the argument as to why Inheritance doesn't need
AEL is that adding/removing children is akin to inserting or deleting
rows from the parent.

Removing SELECT privilege while running a SELECT would be a different
matter. This is all a matter of definition; we can make up any rules
we like. Doing so is IMHO a separate patch and not something to hold
up the main patch.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-02-27 08:48:46
Message-ID: CA+U5nMJY_iV81kkHRqoq-TLCk8=hybZfzkusVRpYSWEv9QaV=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 February 2014 15:25, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-02-26 15:15:00 +0000, Simon Riggs wrote:
>> On 26 February 2014 13:38, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> > Hi,
>> >
>> > On 2014-02-26 07:32:45 +0000, Simon Riggs wrote:
>> >> > * This definitely should include isolationtester tests actually
>> >> > performing concurrent ALTER TABLEs. All that's currently there is
>> >> > tests that the locklevel isn't too high, but not that it actually works.
>> >>
>> >> There is no concurrent behaviour here, hence no code that would be
>> >> exercised by concurrent tests.
>> >
>> > Huh? There's most definitely new concurrent behaviour. Previously no
>> > other backends could have a relation open (and locked) while it got
>> > altered (which then sends out relcache invalidations). That's something
>> > that should be tested.
>>
>> It has been. High volume concurrent testing has been performed, per
>> Tom's original discussion upthread, but that's not part of the test
>> suite.
>
> Yea, that's not what I am looking for.
>
>> For other tests I have no guide as to how to write a set of automated
>> regression tests. Anything could cause a failure, so I'd need to write
>> an infinite set of tests to prove there is no bug *somewhere*. How
>> many tests are required? 0, 1, 3, 30?
>
> I think some isolationtester tests for the most important changes in
> lock levels are appropriate. Say, create a PRIMARY KEY, DROP INHERIT,
> ... while a query is in progress in a nother session.

OK, I'll work on some tests.

v18 attached, with v19 coming soon

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
reduce_lock_levels.v18.patch application/octet-stream 26.1 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-01 11:06:27
Message-ID: CA+U5nM+oprCeV3PU0ZyfxwO7ZJdnsta7Jij15bXRiA0pv=zsKQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27 February 2014 08:48, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 26 February 2014 15:25, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> On 2014-02-26 15:15:00 +0000, Simon Riggs wrote:
>>> On 26 February 2014 13:38, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>> > Hi,
>>> >
>>> > On 2014-02-26 07:32:45 +0000, Simon Riggs wrote:
>>> >> > * This definitely should include isolationtester tests actually
>>> >> > performing concurrent ALTER TABLEs. All that's currently there is
>>> >> > tests that the locklevel isn't too high, but not that it actually works.
>>> >>
>>> >> There is no concurrent behaviour here, hence no code that would be
>>> >> exercised by concurrent tests.
>>> >
>>> > Huh? There's most definitely new concurrent behaviour. Previously no
>>> > other backends could have a relation open (and locked) while it got
>>> > altered (which then sends out relcache invalidations). That's something
>>> > that should be tested.
>>>
>>> It has been. High volume concurrent testing has been performed, per
>>> Tom's original discussion upthread, but that's not part of the test
>>> suite.
>>
>> Yea, that's not what I am looking for.
>>
>>> For other tests I have no guide as to how to write a set of automated
>>> regression tests. Anything could cause a failure, so I'd need to write
>>> an infinite set of tests to prove there is no bug *somewhere*. How
>>> many tests are required? 0, 1, 3, 30?
>>
>> I think some isolationtester tests for the most important changes in
>> lock levels are appropriate. Say, create a PRIMARY KEY, DROP INHERIT,
>> ... while a query is in progress in a nother session.
>
> OK, I'll work on some tests.
>
> v18 attached, with v19 coming soon

v19 complete apart from requested comment additions

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
reduce_lock_levels.v19.patch application/octet-stream 175.0 KB

From: Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-01 21:25:29
Message-ID: 53125049.8080603@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/01/2014 12:06 PM, Simon Riggs wrote:
> On 27 February 2014 08:48, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On 26 February 2014 15:25, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>> On 2014-02-26 15:15:00 +0000, Simon Riggs wrote:
>>>> On 26 February 2014 13:38, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>>>> Hi,
>>>>>
>>>>> On 2014-02-26 07:32:45 +0000, Simon Riggs wrote:
>>>>>>> * This definitely should include isolationtester tests actually
>>>>>>> performing concurrent ALTER TABLEs. All that's currently there is
>>>>>>> tests that the locklevel isn't too high, but not that it actually works.
>>>>>> There is no concurrent behaviour here, hence no code that would be
>>>>>> exercised by concurrent tests.
>>>>> Huh? There's most definitely new concurrent behaviour. Previously no
>>>>> other backends could have a relation open (and locked) while it got
>>>>> altered (which then sends out relcache invalidations). That's something
>>>>> that should be tested.
>>>> It has been. High volume concurrent testing has been performed, per
>>>> Tom's original discussion upthread, but that's not part of the test
>>>> suite.
>>> Yea, that's not what I am looking for.
>>>
>>>> For other tests I have no guide as to how to write a set of automated
>>>> regression tests. Anything could cause a failure, so I'd need to write
>>>> an infinite set of tests to prove there is no bug *somewhere*. How
>>>> many tests are required? 0, 1, 3, 30?
>>> I think some isolationtester tests for the most important changes in
>>> lock levels are appropriate. Say, create a PRIMARY KEY, DROP INHERIT,
>>> ... while a query is in progress in a nother session.
>> OK, I'll work on some tests.
>>
>> v18 attached, with v19 coming soon
> v19 complete apart from requested comment additions

I've started to look at this patch and re-read the thread. The first
thing I noticed is what seems like an automated replace error. The docs
say "This form requires only an SHARE x EXCLUSIVE lock." where the "an"
was not changed to "a".

Attached is a patch-on-patch to fix this. A more complete review will
come later.

--
Vik

Attachment Content-Type Size
reduce_lock_levels.v19b.patch text/x-diff 7.9 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-02 09:50:41
Message-ID: CA+U5nMLATN8-09h7XgcbPY3VvLOGKPHkWWXLMi6n48g5F1h-yw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1 March 2014 21:25, Vik Fearing <vik(dot)fearing(at)dalibo(dot)com> wrote:
> On 03/01/2014 12:06 PM, Simon Riggs wrote:
>> On 27 February 2014 08:48, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> On 26 February 2014 15:25, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>>> On 2014-02-26 15:15:00 +0000, Simon Riggs wrote:
>>>>> On 26 February 2014 13:38, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 2014-02-26 07:32:45 +0000, Simon Riggs wrote:
>>>>>>>> * This definitely should include isolationtester tests actually
>>>>>>>> performing concurrent ALTER TABLEs. All that's currently there is
>>>>>>>> tests that the locklevel isn't too high, but not that it actually works.
>>>>>>> There is no concurrent behaviour here, hence no code that would be
>>>>>>> exercised by concurrent tests.
>>>>>> Huh? There's most definitely new concurrent behaviour. Previously no
>>>>>> other backends could have a relation open (and locked) while it got
>>>>>> altered (which then sends out relcache invalidations). That's something
>>>>>> that should be tested.
>>>>> It has been. High volume concurrent testing has been performed, per
>>>>> Tom's original discussion upthread, but that's not part of the test
>>>>> suite.
>>>> Yea, that's not what I am looking for.
>>>>
>>>>> For other tests I have no guide as to how to write a set of automated
>>>>> regression tests. Anything could cause a failure, so I'd need to write
>>>>> an infinite set of tests to prove there is no bug *somewhere*. How
>>>>> many tests are required? 0, 1, 3, 30?
>>>> I think some isolationtester tests for the most important changes in
>>>> lock levels are appropriate. Say, create a PRIMARY KEY, DROP INHERIT,
>>>> ... while a query is in progress in a nother session.
>>> OK, I'll work on some tests.
>>>
>>> v18 attached, with v19 coming soon
>> v19 complete apart from requested comment additions
>
> I've started to look at this patch and re-read the thread. The first
> thing I noticed is what seems like an automated replace error. The docs
> say "This form requires only an SHARE x EXCLUSIVE lock." where the "an"
> was not changed to "a".
>
> Attached is a patch-on-patch to fix this. A more complete review will
> come later.

v20 includes slightly re-ordered checks in GetLockLevel, plus more
detailed comments on each group of subcommands.

Also corrects grammar as noted by Vik.

Plus adds an example of usage to the docs.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
reduce_lock_levels.v20.patch application/octet-stream 177.1 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-03 15:19:55
Message-ID: CA+Tgmob8cu_886p1h0ukknNVyiqnt=3EQWdt+C1uchfHUZqjig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 3:12 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Removing SELECT privilege while running a SELECT would be a different
> matter. This is all a matter of definition; we can make up any rules
> we like. Doing so is IMHO a separate patch and not something to hold
> up the main patch.

So I think this is an interesting point. There are various things
that could go wrong as a result of using the wrong lock level. Worst
would be that the server crashes or corrupts data. A little less bad
would be that sessions error out with inexplicable error conditions,
as in SnapshotNow days. Alternatively, we could just have arguably
wrong behavior, like basing query results on the old version of the
table's metadata even after it's been changed.

I don't really care about that second category of behavior. If
somebody changes some property of a table and existing sessions
continue to use the old value until eoxact, well, we can argue about
that, but at least until we have concrete reports of really
undesirable behavior, I don't think it's the primary issue. What I'm
really concerned about is whether there are other things like the
SnapshotNow issues that can cause stuff to halt and catch fire. I
don't know whether there are or are not, but that's my concern.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-03 15:43:46
Message-ID: CA+U5nM+f9Cyv4Ufs14ODO7jrT2vkEYhGRcwJANwLgebX49Cw+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3 March 2014 15:19, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Feb 27, 2014 at 3:12 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> What I'm
> really concerned about is whether there are other things like the
> SnapshotNow issues that can cause stuff to halt and catch fire. I
> don't know whether there are or are not, but that's my concern.

Of course its a concern, I feel it also. But that's why we have beta
period to handle the unknowns.

The question is are there any specific areas of concern here? If not,
then we commit because we've done a lot of work on it and at the
moment the balance is high benefit to users against a non-specific
feeling of risk.

@Noah - Last call...

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-03 15:53:20
Message-ID: 8223.1393862000@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On 3 March 2014 15:19, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> What I'm
>> really concerned about is whether there are other things like the
>> SnapshotNow issues that can cause stuff to halt and catch fire. I
>> don't know whether there are or are not, but that's my concern.

> Of course its a concern, I feel it also. But that's why we have beta
> period to handle the unknowns.

I have exactly zero faith that beta testing would catch low-probability
problems in this area. What's needed, and hasn't happened AFAIK, is
detailed study of the patch by assorted senior hackers.

> The question is are there any specific areas of concern here? If not,
> then we commit because we've done a lot of work on it and at the
> moment the balance is high benefit to users against a non-specific
> feeling of risk.

This is backwards. The default decision around here has never been
to commit when in doubt.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-03 16:06:45
Message-ID: CA+TgmoYVoPK76MjcM5fZ416zwOKZFtLQFufXoFfK=9_y4L5UrQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Mar 2, 2014 at 4:50 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> v20 includes slightly re-ordered checks in GetLockLevel, plus more
> detailed comments on each group of subcommands.
>
> Also corrects grammar as noted by Vik.
>
> Plus adds an example of usage to the docs.

This patch contains a one line change to
src/bin/pg_dump/pg_backup_archiver.c which seems not to belong.

This hunk in ATRewriteCatalogs() looks scary:

+ /*
+ * If we think we might need to add/re-add toast tables then
+ * we currently need to hold an AccessExclusiveLock.
+ */
+ if (lockmode < AccessExclusiveLock)
+ return;

It would make sense to me to add an Assert() or elog() check inside
the subsequent loop to verify that the lock level is adequate ... but
just returning silently seems like a bad idea.

I have my doubts about whether it's safe to do AT_AddInherit,
AT_DropInherit, AT_AddOf, or AT_DropOf with a full lock. All of those
can change the tuple descriptor, and we discussed, back when we did
this the first time, the fact that the executor may get *very* unhappy
if the tuple descriptor changes in mid-execution. I strongly suspect
these are unsafe with less than a full AccessExclusiveLock.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-03 16:22:55
Message-ID: CA+U5nMJYqDrL59rLVB08J0byy-PeZOaOX7h=4Av+MVQy-PdP5A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3 March 2014 15:53, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> On 3 March 2014 15:19, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> What I'm
>>> really concerned about is whether there are other things like the
>>> SnapshotNow issues that can cause stuff to halt and catch fire. I
>>> don't know whether there are or are not, but that's my concern.
>
>> Of course its a concern, I feel it also. But that's why we have beta
>> period to handle the unknowns.
>
> I have exactly zero faith that beta testing would catch low-probability
> problems in this area. What's needed, and hasn't happened AFAIK, is
> detailed study of the patch by assorted senior hackers.
>
>> The question is are there any specific areas of concern here? If not,
>> then we commit because we've done a lot of work on it and at the
>> moment the balance is high benefit to users against a non-specific
>> feeling of risk.
>
> This is backwards. The default decision around here has never been
> to commit when in doubt.

I'm not in doubt.

If anybody can give me some more pointers of things to look at, I will.

But I've looked and I can't see anything more.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-03 16:29:16
Message-ID: CA+U5nMLmMV1HrNfAN+OY=pTHg+V6sqfE=dQzU-Fibwji_WL8mQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3 March 2014 16:06, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sun, Mar 2, 2014 at 4:50 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> v20 includes slightly re-ordered checks in GetLockLevel, plus more
>> detailed comments on each group of subcommands.
>>
>> Also corrects grammar as noted by Vik.
>>
>> Plus adds an example of usage to the docs.
>
> This patch contains a one line change to
> src/bin/pg_dump/pg_backup_archiver.c which seems not to belong.
>
> This hunk in ATRewriteCatalogs() looks scary:
>
> + /*
> + * If we think we might need to add/re-add toast tables then
> + * we currently need to hold an AccessExclusiveLock.
> + */
> + if (lockmode < AccessExclusiveLock)
> + return;
>
> It would make sense to me to add an Assert() or elog() check inside
> the subsequent loop to verify that the lock level is adequate ... but
> just returning silently seems like a bad idea.

OK, I will check elog.

> I have my doubts about whether it's safe to do AT_AddInherit,
> AT_DropInherit, AT_AddOf, or AT_DropOf with a full lock. All of those
> can change the tuple descriptor, and we discussed, back when we did
> this the first time, the fact that the executor may get *very* unhappy
> if the tuple descriptor changes in mid-execution. I strongly suspect
> these are unsafe with less than a full AccessExclusiveLock.

I'm happy to change those if you feel there is insufficient evidence.

I don't personally feel that it would matter to usability to keep
locks for those at AccessExclusiveLock, especially since they are
otherwise fast.

Some others might be kept higher also. I'm merely trying to balance
between requests to reduce to minimal theoretical level and fears that
anything less than AccessExclusiveLock is a problem.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-03 16:36:16
Message-ID: CA+TgmoZXcL2UQho4tvp19eGpukSWn6K=gJ1HitQriPkcrDzj+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 3, 2014 at 11:29 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 3 March 2014 16:06, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Sun, Mar 2, 2014 at 4:50 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> v20 includes slightly re-ordered checks in GetLockLevel, plus more
>>> detailed comments on each group of subcommands.
>>>
>>> Also corrects grammar as noted by Vik.
>>>
>>> Plus adds an example of usage to the docs.
>>
>> This patch contains a one line change to
>> src/bin/pg_dump/pg_backup_archiver.c which seems not to belong.
>>
>> This hunk in ATRewriteCatalogs() looks scary:
>>
>> + /*
>> + * If we think we might need to add/re-add toast tables then
>> + * we currently need to hold an AccessExclusiveLock.
>> + */
>> + if (lockmode < AccessExclusiveLock)
>> + return;
>>
>> It would make sense to me to add an Assert() or elog() check inside
>> the subsequent loop to verify that the lock level is adequate ... but
>> just returning silently seems like a bad idea.
>
> OK, I will check elog.
>
>> I have my doubts about whether it's safe to do AT_AddInherit,
>> AT_DropInherit, AT_AddOf, or AT_DropOf with a full lock. All of those
>> can change the tuple descriptor, and we discussed, back when we did
>> this the first time, the fact that the executor may get *very* unhappy
>> if the tuple descriptor changes in mid-execution. I strongly suspect
>> these are unsafe with less than a full AccessExclusiveLock.
>
> I'm happy to change those if you feel there is insufficient evidence.

Not sure what that means, but yes, I think the lock levels on those
need to be increased.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Noah Misch <noah(at)leadboat(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-03 18:57:36
Message-ID: 20140303185736.GA3476935@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 03, 2014 at 10:19:55AM -0500, Robert Haas wrote:
> On Thu, Feb 27, 2014 at 3:12 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > Removing SELECT privilege while running a SELECT would be a different
> > matter. This is all a matter of definition; we can make up any rules
> > we like. Doing so is IMHO a separate patch and not something to hold
> > up the main patch.
>
> So I think this is an interesting point. There are various things
> that could go wrong as a result of using the wrong lock level. Worst
> would be that the server crashes or corrupts data. A little less bad
> would be that sessions error out with inexplicable error conditions,
> as in SnapshotNow days. Alternatively, we could just have arguably
> wrong behavior, like basing query results on the old version of the
> table's metadata even after it's been changed.

I would order the concerns like this:

1. Data corruption
2. Transient, clearly-wrong answers without an error
3. Server crash
4. Catalog logical inconsistency
5. Inexplicable, transient errors
6. Valid behavior capable of surprising more than zero upgraders

> I don't really care about that second category of behavior. If
> somebody changes some property of a table and existing sessions
> continue to use the old value until eoxact, well, we can argue about
> that, but at least until we have concrete reports of really
> undesirable behavior, I don't think it's the primary issue. What I'm
> really concerned about is whether there are other things like the
> SnapshotNow issues that can cause stuff to halt and catch fire. I
> don't know whether there are or are not, but that's my concern.

Since we can't know whether something qualifies as (2) or (6) without
analyzing it, I don't find waiting for user complaints to be a good strategy
here. An ownership change not immediately affecting ACL checks does fall
under (6), for me. (However, changing ownership without AccessExclusiveLock
might also create hazards in category (4) for concurrent DDL that performs
owner checks.)

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Noah Misch <noah(at)leadboat(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-03 18:58:46
Message-ID: 20140303185846.GB3476935@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 03, 2014 at 03:43:46PM +0000, Simon Riggs wrote:
> The question is are there any specific areas of concern here? If not,
> then we commit because we've done a lot of work on it and at the
> moment the balance is high benefit to users against a non-specific
> feeling of risk.
>
> @Noah - Last call...

I am not specifically aware of any outstanding problems. I have planned to
give this a close look, but it will be at least two weeks before I dig out far
enough to do so. If that makes it a post-commit review, so be it.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-03 19:19:45
Message-ID: CA+U5nM+e63Dyuz0dH7WTzFLW6UpDemR2z5K7i3r25cJh3g6YAg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3 March 2014 18:57, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Mon, Mar 03, 2014 at 10:19:55AM -0500, Robert Haas wrote:
>> On Thu, Feb 27, 2014 at 3:12 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> > Removing SELECT privilege while running a SELECT would be a different
>> > matter. This is all a matter of definition; we can make up any rules
>> > we like. Doing so is IMHO a separate patch and not something to hold
>> > up the main patch.
>>
>> So I think this is an interesting point. There are various things
>> that could go wrong as a result of using the wrong lock level. Worst
>> would be that the server crashes or corrupts data. A little less bad
>> would be that sessions error out with inexplicable error conditions,
>> as in SnapshotNow days. Alternatively, we could just have arguably
>> wrong behavior, like basing query results on the old version of the
>> table's metadata even after it's been changed.
>
> I would order the concerns like this:
>
> 1. Data corruption
> 2. Transient, clearly-wrong answers without an error
> 3. Server crash
> 4. Catalog logical inconsistency
> 5. Inexplicable, transient errors
> 6. Valid behavior capable of surprising more than zero upgraders

I like your model for risk assessment. How can we apply it in detail
in a way that helps us decide? Or do we just go on gut feel?

My experience with mentioning such topics is that without structure it
results in an assessment of "unacceptable risk" just simply because
somebody has mentioned some scary words.

>> I don't really care about that second category of behavior. If
>> somebody changes some property of a table and existing sessions
>> continue to use the old value until eoxact, well, we can argue about
>> that, but at least until we have concrete reports of really
>> undesirable behavior, I don't think it's the primary issue. What I'm
>> really concerned about is whether there are other things like the
>> SnapshotNow issues that can cause stuff to halt and catch fire. I
>> don't know whether there are or are not, but that's my concern.
>
> Since we can't know whether something qualifies as (2) or (6) without
> analyzing it, I don't find waiting for user complaints to be a good strategy
> here. An ownership change not immediately affecting ACL checks does fall
> under (6), for me. (However, changing ownership without AccessExclusiveLock
> might also create hazards in category (4) for concurrent DDL that performs
> owner checks.)

err, guys, you do realise that changing ownership is staying at
AccessExclusiveLock in this patch?
(and I haven't ever suggested lowering that).

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-03 21:18:20
Message-ID: CA+U5nM+mpKzanYe-0skJfB829QeBaauecen9oZqk2JBGWTu7fA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3 March 2014 16:36, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

>>> This hunk in ATRewriteCatalogs() looks scary:
>>>
>>> + /*
>>> + * If we think we might need to add/re-add toast tables then
>>> + * we currently need to hold an AccessExclusiveLock.
>>> + */
>>> + if (lockmode < AccessExclusiveLock)
>>> + return;
>>>
>>> It would make sense to me to add an Assert() or elog() check inside
>>> the subsequent loop to verify that the lock level is adequate ... but
>>> just returning silently seems like a bad idea.
>>
>> OK, I will check elog.
>>
>>> I have my doubts about whether it's safe to do AT_AddInherit,
>>> AT_DropInherit, AT_AddOf, or AT_DropOf with a full lock. All of those
>>> can change the tuple descriptor, and we discussed, back when we did
>>> this the first time, the fact that the executor may get *very* unhappy
>>> if the tuple descriptor changes in mid-execution. I strongly suspect
>>> these are unsafe with less than a full AccessExclusiveLock.
>>
>> I'm happy to change those if you feel there is insufficient evidence.
>
> Not sure what that means, but yes, I think the lock levels on those
> need to be increased.

v21 with all requested changes, comments and cleanup

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
reduce_lock_levels.v21.patch application/octet-stream 183.8 KB

From: Noah Misch <noah(at)leadboat(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-03 22:42:35
Message-ID: 20140303224235.GA3477828@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 03, 2014 at 07:19:45PM +0000, Simon Riggs wrote:
> On 3 March 2014 18:57, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > On Mon, Mar 03, 2014 at 10:19:55AM -0500, Robert Haas wrote:
> >> On Thu, Feb 27, 2014 at 3:12 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >> > Removing SELECT privilege while running a SELECT would be a different
> >> > matter. This is all a matter of definition; we can make up any rules
> >> > we like. Doing so is IMHO a separate patch and not something to hold
> >> > up the main patch.
> >>
> >> So I think this is an interesting point. There are various things
> >> that could go wrong as a result of using the wrong lock level. Worst
> >> would be that the server crashes or corrupts data. A little less bad
> >> would be that sessions error out with inexplicable error conditions,
> >> as in SnapshotNow days. Alternatively, we could just have arguably
> >> wrong behavior, like basing query results on the old version of the
> >> table's metadata even after it's been changed.
> >
> > I would order the concerns like this:
> >
> > 1. Data corruption
> > 2. Transient, clearly-wrong answers without an error
> > 3. Server crash
> > 4. Catalog logical inconsistency
> > 5. Inexplicable, transient errors
> > 6. Valid behavior capable of surprising more than zero upgraders
>
> I like your model for risk assessment. How can we apply it in detail
> in a way that helps us decide? Or do we just go on gut feel?
>
> My experience with mentioning such topics is that without structure it
> results in an assessment of "unacceptable risk" just simply because
> somebody has mentioned some scary words.

True; it's tough to make use of such a prioritization. By the time you can
confidently assign something to a category, you can probably just fix it.
(Or, in the case of category (6), document/release-note it.)

Just to be clear, that list is not a commentary on the particular patch at
hand. Those are merely the kinds of regressions to look for in a patch
affecting this area of the code.

> err, guys, you do realise that changing ownership is staying at
> AccessExclusiveLock in this patch?
> (and I haven't ever suggested lowering that).

Yep.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 00:15:27
Message-ID: 20893.1393892127@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> Just to be clear, that list is not a commentary on the particular patch at
> hand. Those are merely the kinds of regressions to look for in a patch
> affecting this area of the code.

A complaint on pgsql-bugs just now reminded me of a specific area that
needs to be looked at hard: how bad are the implications for pg_dump?

Up to now, pg_dump could be reasonably confident that once it had
AccessShareLock on every table it intended to dump, there would be no
schema changes happening on those tables until it got done. This greatly
ameliorates the snapshot-skew problems that arise from its habit of doing
some things for itself and other things via backend-internal functions
(which historically used SnapshotNow and now use a fresh MVCC snapshot,
either way potentially quite newer than the transaction snapshot pg_dump's
own queries will use).

I suspect that lowering the lock requirements for anything that affects
what pg_dump can see is going to make things a whole lot worse in terms of
consistency of pg_dump output in the face of concurrent DDL. Admittedly,
we're not perfect in that area now, but I'm not sure that's an adequate
excuse for making it worse.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 01:07:34
Message-ID: 20140304010734.GD18320@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-03-03 19:15:27 -0500, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > Just to be clear, that list is not a commentary on the particular patch at
> > hand. Those are merely the kinds of regressions to look for in a patch
> > affecting this area of the code.
>
> A complaint on pgsql-bugs just now reminded me of a specific area that
> needs to be looked at hard: how bad are the implications for pg_dump?
>
> Up to now, pg_dump could be reasonably confident that once it had
> AccessShareLock on every table it intended to dump, there would be no
> schema changes happening on those tables until it got done.

The guarantee wasn't actually that strong. It already was quite possible
that indexes got created/dropped during that time, which probably is the
by far most frequent DDL run in production.

> This greatly
> ameliorates the snapshot-skew problems that arise from its habit of doing
> some things for itself and other things via backend-internal functions
> (which historically used SnapshotNow and now use a fresh MVCC snapshot,
> either way potentially quite newer than the transaction snapshot pg_dump's
> own queries will use).

Yea, I wonder if we shouldn't start to make them use a different
snapshot. It's the pg_get_*def() functions, or is there something else?

Afair (I really haven't rechecked) all the actions that have a changed
locklevels affect things that pg_dump recreates clientside, using a
repeatable read snapshot, so there shouldn't be much change there?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 01:32:13
Message-ID: 22518.1393896733@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2014-03-03 19:15:27 -0500, Tom Lane wrote:
>> This greatly
>> ameliorates the snapshot-skew problems that arise from its habit of doing
>> some things for itself and other things via backend-internal functions
>> (which historically used SnapshotNow and now use a fresh MVCC snapshot,
>> either way potentially quite newer than the transaction snapshot pg_dump's
>> own queries will use).

> Yea, I wonder if we shouldn't start to make them use a different
> snapshot. It's the pg_get_*def() functions, or is there something else?

See past discussions. By the time you trace all of ruleutils.c's
dependencies, a dauntingly large fraction of the backend's basic catalog
access support is implicated. For example, you'd need some way of making
the catcaches return data that they know to be outdated. And I'm pretty
sure pg_dump is making free use of stuff that isn't even in ruleutils.

I would like to see pg_dump doing something much more bulletproof, but
I'm afraid that there isn't any nice simple fix available.

One relatively narrow fix that would probably make it a lot better
*in the current state of affairs* is to provide a way whereby, once
pg_dump has locks on everything it wants to dump, it can advance
its transaction snapshot to current time. Then at least both its own
queries and the backend's lookups will see post-locking state. However,
if AccessShareLock isn't enough to block DDL on the tables then we're
still hosed.

> Afair (I really haven't rechecked) all the actions that have a changed
> locklevels affect things that pg_dump recreates clientside, using a
> repeatable read snapshot, so there shouldn't be much change there?

You're missing the point entirely if you think pg_dump recreates
everything client-side. It's never done that 100%, and we've migrated
more and more stuff to the server side over time to avoid duplicate
implementations of things like index expression decompilation. So while
a theoretical answer would be to remove all of pg_dump's dependency on
server-side functions, I am pretty sure that's never going to happen.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 01:45:24
Message-ID: 20140304014524.GE18320@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-03-03 20:32:13 -0500, Tom Lane wrote:
> > Afair (I really haven't rechecked) all the actions that have a changed
> > locklevels affect things that pg_dump recreates clientside, using a
> > repeatable read snapshot, so there shouldn't be much change there?
>
> You're missing the point entirely if you think pg_dump recreates
> everything client-side.

No, I am not obviously not thinking that. What I mean is that the things
that actually change their locking requirement in the proposed patch
primarily influence things that are reconstructed clientside by
pg_dump. E.g ALTER TABLE ... CLUSTER ON, SET(...), ...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 02:16:47
Message-ID: 23377.1393899407@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2014-03-03 20:32:13 -0500, Tom Lane wrote:
>> You're missing the point entirely if you think pg_dump recreates
>> everything client-side.

> No, I am not obviously not thinking that. What I mean is that the things
> that actually change their locking requirement in the proposed patch
> primarily influence things that are reconstructed clientside by
> pg_dump. E.g ALTER TABLE ... CLUSTER ON, SET(...), ...

[ raised eyebrow... ] I'm pretty sure that no such constraint was
part of the design discussion to start with. Even if it accidentally
happens to be the case now, it sounds utterly fragile.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 08:31:33
Message-ID: CA+U5nM+EgBo8YpEEE4MEhU_EiAAohHL8yyfuRbTVj7ft=VF=6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 4 March 2014 01:07, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-03-03 19:15:27 -0500, Tom Lane wrote:
>> Noah Misch <noah(at)leadboat(dot)com> writes:
>> > Just to be clear, that list is not a commentary on the particular patch at
>> > hand. Those are merely the kinds of regressions to look for in a patch
>> > affecting this area of the code.
>>
>> A complaint on pgsql-bugs just now reminded me of a specific area that
>> needs to be looked at hard: how bad are the implications for pg_dump?
>>
>> Up to now, pg_dump could be reasonably confident that once it had
>> AccessShareLock on every table it intended to dump, there would be no
>> schema changes happening on those tables until it got done.
>
> The guarantee wasn't actually that strong. It already was quite possible
> that indexes got created/dropped during that time, which probably is the
> by far most frequent DDL run in production.

Good points.

In most cases, DDL is applied manually after careful thought, so
people seldom dump at the same time they upgrade the database. This is
especially true for pg_dump since it captures the logical definition
of tables. So most people will be happy with the default locking, but
we could make the lock level optional.

Currently we use AccessShareLock. Locking out all DDL, even with this
patch applied would only require ShareUpdateExclusiveLock.

Looking at the code, it will take about an hour to add an option to
pg_dump that specifies the lock level used when dumping. I would be
happy to include that as part of this patch.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 08:39:28
Message-ID: CAOeZVifePi5ZcoRZPNO3jh=BkrfCYepQwDdgsa1RsSKUp9QEyA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>
> Good points.
>
> In most cases, DDL is applied manually after careful thought, so
> people seldom dump at the same time they upgrade the database. This is
> especially true for pg_dump since it captures the logical definition
> of tables. So most people will be happy with the default locking, but
> we could make the lock level optional.
>
> Currently we use AccessShareLock. Locking out all DDL, even with this
> patch applied would only require ShareUpdateExclusiveLock.
>
> Looking at the code, it will take about an hour to add an option to
> pg_dump that specifies the lock level used when dumping. I would be
> happy to include that as part of this patch.
>

I think the use case for specifying multiple locks is pretty slim given
that a ShareUpdateExclusiveLock is good enough mostly for everybody.

If its not the case, the user should be more careful about when he is
scheduling backups to so that they dont conflict with DDL changes.

I am not too comfortable with exposing the locking type to the user. That
may be just me though.

Regards,

Atri
--
Regards,

Atri
*l'apprenant*


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 09:31:14
Message-ID: CA+U5nMLVXmqZ9hNXbUDTnqouJPEdD7p9n-fd6cJC9TqqWL-k7w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 4 March 2014 08:39, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:
>
>>
>> Good points.
>>
>> In most cases, DDL is applied manually after careful thought, so
>> people seldom dump at the same time they upgrade the database. This is
>> especially true for pg_dump since it captures the logical definition
>> of tables. So most people will be happy with the default locking, but
>> we could make the lock level optional.
>>
>> Currently we use AccessShareLock. Locking out all DDL, even with this
>> patch applied would only require ShareUpdateExclusiveLock.
>>
>> Looking at the code, it will take about an hour to add an option to
>> pg_dump that specifies the lock level used when dumping. I would be
>> happy to include that as part of this patch.
>
>
>
> I think the use case for specifying multiple locks is pretty slim given that
> a ShareUpdateExclusiveLock is good enough mostly for everybody.

Increasing the lock strength would be a change in behaviour that might
adversely affect existing users.

> If its not the case, the user should be more careful about when he is
> scheduling backups to so that they dont conflict with DDL changes.

That is most certainly the wise choice.

> I am not too comfortable with exposing the locking type to the user. That
> may be just me though.

Why would that be a problem? Hard reasons, please.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 09:51:40
Message-ID: CAOeZVifj=icp8+0TGovWqr3WK7RNqPN5QsRtTrAfo2_1YuAQKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > If its not the case, the user should be more careful about when he is
> > scheduling backups to so that they dont conflict with DDL changes.
>
> That is most certainly the wise choice.
>
> > I am not too comfortable with exposing the locking type to the user. That
> > may be just me though.
>
> Why would that be a problem? Hard reasons, please.
>

Should we genuinely depend on the user's good judgement to decide the
locking types?
--
Regards,

Atri
*l'apprenant*


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 11:57:10
Message-ID: CA+U5nMLWX7Xa-segLxr9j8yf=nMXjCKN6Ty2joSZ_SxL-Q-DXw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 4 March 2014 09:31, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 4 March 2014 08:39, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:
>>
>>>
>>> Good points.
>>>
>>> In most cases, DDL is applied manually after careful thought, so
>>> people seldom dump at the same time they upgrade the database. This is
>>> especially true for pg_dump since it captures the logical definition
>>> of tables. So most people will be happy with the default locking, but
>>> we could make the lock level optional.
>>>
>>> Currently we use AccessShareLock. Locking out all DDL, even with this
>>> patch applied would only require ShareUpdateExclusiveLock.
>>>
>>> Looking at the code, it will take about an hour to add an option to
>>> pg_dump that specifies the lock level used when dumping. I would be
>>> happy to include that as part of this patch.
>>
>>
>>
>> I think the use case for specifying multiple locks is pretty slim given that
>> a ShareUpdateExclusiveLock is good enough mostly for everybody.
>
> Increasing the lock strength would be a change in behaviour that might
> adversely affect existing users.

The main impact I see is that this would block VACUUM while pg_dump runs.

But then, while pg_dump runs VACUUM is ineffective anyway so perhaps
that is no bad thing.

Autovacuum requests VACOPT_NOWAIT so would skip the relations being
dumped rather than waiting.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 12:18:55
Message-ID: CA+Tgmoa4g93c7haMviwox5L7jx3ZYDNb7BNK4QDcGrbPDw1yZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 4, 2014 at 6:57 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> The main impact I see is that this would block VACUUM while pg_dump runs.
>
> But then, while pg_dump runs VACUUM is ineffective anyway so perhaps
> that is no bad thing.

Well, a vacuum that's already running when pg_dump starts up may be
doing a lot of good, so it would be a shame to see pg_dump kill them
all off.

Also, this would put us in the surprising situation that you can't run
two simultaneous dumps of overlapping sets of tables, which doesn't
strike me as a great thing.

I'd really like to see us find a way to apply some version of this
patch. I was in favor of the concept 3 years ago when we did this the
first time, and I've subsequently done quite a bit of work (viz., MVCC
catalog snapshots) to eliminate the main objection that was raised at
that time. But it's really hard to reason about what might happen
with lowered lock levels, and convince yourself that there's
absolutely nothing that can ever go wrong. I don't know what to do
about that tension, but I think even modest improvements in this area
stand to benefit an awful lot of users.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 13:03:57
Message-ID: CAOeZVie+TaRCmMnGVeAetYQSQavwQdc7QCdgrwr_Q544v=goTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I'd really like to see us find a way to apply some version of this
> patch. I was in favor of the concept 3 years ago when we did this the
> first time, and I've subsequently done quite a bit of work (viz., MVCC
> catalog snapshots) to eliminate the main objection that was raised at
> that time. But it's really hard to reason about what might happen
> with lowered lock levels, and convince yourself that there's
> absolutely nothing that can ever go wrong. I don't know what to do
> about that tension, but I think even modest improvements in this area
> stand to benefit an awful lot of users.
>

Wouldnt MVCC's strictness rules pose harder restrictions on pg_dump instead
of relaxing them?

Regards,

Atri

--
Regards,

Atri
*l'apprenant*


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 13:59:39
Message-ID: CA+U5nM+zYcCMZNBFF5zOKzXQdJ1FhRiHqyxtHYOLAigUNunrEQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 4 March 2014 12:18, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Mar 4, 2014 at 6:57 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> The main impact I see is that this would block VACUUM while pg_dump runs.
>>
>> But then, while pg_dump runs VACUUM is ineffective anyway so perhaps
>> that is no bad thing.
>
> Well, a vacuum that's already running when pg_dump starts up may be
> doing a lot of good, so it would be a shame to see pg_dump kill them
> all off.
>
> Also, this would put us in the surprising situation that you can't run
> two simultaneous dumps of overlapping sets of tables, which doesn't
> strike me as a great thing.

These changes in concurrency are the most serious objections and a
definite change in previous behaviour. So we cannot pick a single lock
level that suits all goals the user may have.

> I'd really like to see us find a way to apply some version of this
> patch. I was in favor of the concept 3 years ago when we did this the
> first time, and I've subsequently done quite a bit of work (viz., MVCC
> catalog snapshots) to eliminate the main objection that was raised at
> that time. But it's really hard to reason about what might happen
> with lowered lock levels, and convince yourself that there's
> absolutely nothing that can ever go wrong. I don't know what to do
> about that tension, but I think even modest improvements in this area
> stand to benefit an awful lot of users.

Agreed. The question is, which subset? The issue just raised would
affect whichever subset we choose, so reducing the scope of the patch
does nothing to the impact of the pg_dump issue.

I will add the option to change lock level for pg_dump. It's simple to
use, clear as to why it would be needed and effective at removing this
as an obstacle.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 14:49:49
Message-ID: 20140304144949.GE12995@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Atri Sharma (atri(dot)jiit(at)gmail(dot)com) wrote:
> If its not the case, the user should be more careful about when he is
> scheduling backups to so that they dont conflict with DDL changes.

I'm not following this as closely as I'd like to, but I wanted to voice
my opinion that this is just not acceptable as a general answer. There
are a good many applications out there which do DDL as part of ongoing
activity (part of ETL, or something else) and still need to be able to
get a pg_dump done. It's not a design I'd recommend, but I don't think
we get to just write it off either.

Thanks,

Stephen


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 15:17:52
Message-ID: 20140304151752.GE4759@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost escribió:
> * Atri Sharma (atri(dot)jiit(at)gmail(dot)com) wrote:
> > If its not the case, the user should be more careful about when he is
> > scheduling backups to so that they dont conflict with DDL changes.
>
> I'm not following this as closely as I'd like to, but I wanted to voice
> my opinion that this is just not acceptable as a general answer. There
> are a good many applications out there which do DDL as part of ongoing
> activity (part of ETL, or something else) and still need to be able to
> get a pg_dump done. It's not a design I'd recommend, but I don't think
> we get to just write it off either.

Agreed -- "user caution" is a recipe for trouble, because these things
cannot always be planned in minute detail (or such planning creates an
excessive cost.)

One concern is schema changes that make a dump unrestorable, for
instance if there's a foreign key relationship between tables A and B,
such that pg_dump dumps the FK for table A but by the time it dumps
table B the unique index has gone and thus restoring the FK fails.
If this is a realistic failure scenario, then we need some mechanism to
avoid it.

One possible idea would be to create a new lock level which conflicts
with DDL changes but not with regular operation including dumps; so it
wouldn't self-conflict but it would conflict with ShareUpdateExclusive.
pg_dump would acquire a lock of that level instead of AccessShare; thus
two pg_dumps would be able to run on the same table simultaneously, but
it would block and be blocked by DDL changes that grab SUE. The big
hole in this is that pg_dump would still block vacuum, which is a
problem. I hesitate two suggest two extra levels, one for dumps (which
wouldn't conflict with SUE) and one for non-exclusive DDL changes (which
would.)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 15:49:46
Message-ID: CAOeZVidb=3kZ1oY7rmwkYti9BGh==j086QsNJMUVzhPx6COqVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 4, 2014 at 8:19 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> * Atri Sharma (atri(dot)jiit(at)gmail(dot)com) wrote:
> > If its not the case, the user should be more careful about when he is
> > scheduling backups to so that they dont conflict with DDL changes.
>
> I'm not following this as closely as I'd like to, but I wanted to voice
> my opinion that this is just not acceptable as a general answer. There
> are a good many applications out there which do DDL as part of ongoing
> activity (part of ETL, or something else) and still need to be able to
> get a pg_dump done. It's not a design I'd recommend, but I don't think
> we get to just write it off either.
>
>
Well, that will require something like MVCC or stricter locking in general.
That is not in line with the aim of this patch, hence I raised this point.

Regards,

Atri

Regards,

Atri
*l'apprenant*


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 16:25:20
Message-ID: CA+Tgmob1B4r8ynuFfCjOaqhUKGP6Fa3NZ36jvzRWjKJdbqj0ZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 4, 2014 at 10:17 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> One possible idea would be to create a new lock level which conflicts
> with DDL changes but not with regular operation including dumps; so it
> wouldn't self-conflict but it would conflict with ShareUpdateExclusive.
> pg_dump would acquire a lock of that level instead of AccessShare; thus
> two pg_dumps would be able to run on the same table simultaneously, but
> it would block and be blocked by DDL changes that grab SUE. The big
> hole in this is that pg_dump would still block vacuum, which is a
> problem. I hesitate two suggest two extra levels, one for dumps (which
> wouldn't conflict with SUE) and one for non-exclusive DDL changes (which
> would.)

AFAIK, the only reason why vacuum takes ShareUpdateExclusive lock is
because it can't run at the same time as another vacuum. I tend to
think (and have thought for some time) that we really ought to have
vacuum take AccessShareLock on the relation plus some other lock that
is specific to vacuum (say, a "relation vacuum" lock, just as we have
"relation extension" locks).

Your idea of a lock strong enough to conflict with DDL but not
self-conflicting is interesting, too, but I can't claim to have
thought through it all that carefully just yet.

I think this is all too late for 9.4, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 16:27:20
Message-ID: 8730.1393950440@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> One concern is schema changes that make a dump unrestorable, for
> instance if there's a foreign key relationship between tables A and B,

Yeah. Ideally, what pg_dump would produce would be a consistent snapshot
of the database state as of its transaction snapshot time. We have always
had that guarantee so far as user data was concerned, but it's been shaky
(and getting worse) so far as the database schema is concerned. What
bothers me about the current patch is that it's going to make it a whole
lot more worse.

Also, I don't have any love at all for proposals that we increase the lock
level that pg_dump holds. pg_dump tends to run for a long time.

I've not been paying all that much attention to the logical-decoding
patches, but wasn't there something in there about being able to see
the catalog state as it was at some point in the past? If so, maybe
we could leverage that to allow a backend to enter a "pg_dump state"
wherein its view of the catalogs was frozen at its transaction start
snapshot. We'd have to restrict it to read-only operation for safety,
but that's surely no problem for pg_dump. If we had that, then this
whole problem of server-side computations producing inconsistent
results would go away.

There might still be a window wherein tables visible at transaction start
could be dropped before AccessShareLock could be acquired, but I think
we could let pg_dump error out in that case.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 16:38:37
Message-ID: 20140304163837.GG12995@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Yeah. Ideally, what pg_dump would produce would be a consistent snapshot
> of the database state as of its transaction snapshot time. We have always
> had that guarantee so far as user data was concerned, but it's been shaky
> (and getting worse) so far as the database schema is concerned. What
> bothers me about the current patch is that it's going to make it a whole
> lot more worse.

> Also, I don't have any love at all for proposals that we increase the lock
> level that pg_dump holds. pg_dump tends to run for a long time.

Agreed.

> I've not been paying all that much attention to the logical-decoding
> patches, but wasn't there something in there about being able to see
> the catalog state as it was at some point in the past? If so, maybe
> we could leverage that to allow a backend to enter a "pg_dump state"
> wherein its view of the catalogs was frozen at its transaction start
> snapshot. We'd have to restrict it to read-only operation for safety,
> but that's surely no problem for pg_dump. If we had that, then this
> whole problem of server-side computations producing inconsistent
> results would go away.

That certainly sounds like a tempting idea.

> There might still be a window wherein tables visible at transaction start
> could be dropped before AccessShareLock could be acquired, but I think
> we could let pg_dump error out in that case.

I don't have too much of an issue with the above, but I would like to
have us figure out a solution to the deadlock problem with parallel
pg_dump. The issue arises when pg_dump gets an AccessShareLock and then
another process attempts to acquire an AccessExclusiveLock, which then
blocks, and then the pg_dump worker process tries to get its
AccessShareLock- we end up not being able to make any progress on
anything at that point.

One suggestion that was discussed at PGConf.EU was having processes
which share the same snapshot (the pg_dump master and worker processes)
able to either share the same locks or at least be able to "jump" the
lock queue (that is, the worker process wouldn't have to wait being the
AEL to get an ASL, since the ASL was already aquired for the snapshot
which was exported and shared with it).

Thanks,

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 16:40:10
Message-ID: 9029.1393951210@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I think this is all too late for 9.4, though.

I agree with the feeling that a meaningful fix for pg_dump isn't going
to get done for 9.4. So that leaves us with the alternatives of
(1) put off the lock-strength-reduction patch for another year;
(2) push it anyway and accept a reduction in pg_dump reliability.

I don't care for (2). I'd like to have lock strength reduction as
much as anybody, but it can't come at the price of reduction of
reliability.

The bigger picture here is that it seems like anytime I've thought
for more than five minutes about the lock strength reduction patch,
I've come up with some fundamental problem. That doesn't leave me
with a warm feeling that we're getting close to having something
committable.

regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 16:45:03
Message-ID: 20140304164503.GG4759@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane escribió:

> I'd like to have lock strength reduction as much as anybody, but it
> can't come at the price of reduction of reliability.

Can we have at least a cut-down version of it? If we can just reduce
the lock level required for ALTER TABLE / VALIDATE, that would be an
enormous improvement already.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 16:46:10
Message-ID: 9165.1393951570@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> I don't have too much of an issue with the above, but I would like to
> have us figure out a solution to the deadlock problem with parallel
> pg_dump. The issue arises when pg_dump gets an AccessShareLock and then
> another process attempts to acquire an AccessExclusiveLock, which then
> blocks, and then the pg_dump worker process tries to get its
> AccessShareLock- we end up not being able to make any progress on
> anything at that point.

The original ASL was acquired by the pg_dump master, right?

> One suggestion that was discussed at PGConf.EU was having processes
> which share the same snapshot (the pg_dump master and worker processes)
> able to either share the same locks or at least be able to "jump" the
> lock queue (that is, the worker process wouldn't have to wait being the
> AEL to get an ASL, since the ASL was already aquired for the snapshot
> which was exported and shared with it).

Yeah, it seems like we need lock export not only snapshot export to make
this work nicely. But that sounds orthogonal to the issues being
discussed in this thread.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 16:50:22
Message-ID: 20140304165022.GH12995@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > I don't have too much of an issue with the above, but I would like to
> > have us figure out a solution to the deadlock problem with parallel
> > pg_dump. The issue arises when pg_dump gets an AccessShareLock and then
> > another process attempts to acquire an AccessExclusiveLock, which then
> > blocks, and then the pg_dump worker process tries to get its
> > AccessShareLock- we end up not being able to make any progress on
> > anything at that point.
>
> The original ASL was acquired by the pg_dump master, right?

Yup. It goes through and gets ASLs on everything first.

> > One suggestion that was discussed at PGConf.EU was having processes
> > which share the same snapshot (the pg_dump master and worker processes)
> > able to either share the same locks or at least be able to "jump" the
> > lock queue (that is, the worker process wouldn't have to wait being the
> > AEL to get an ASL, since the ASL was already aquired for the snapshot
> > which was exported and shared with it).
>
> Yeah, it seems like we need lock export not only snapshot export to make
> this work nicely. But that sounds orthogonal to the issues being
> discussed in this thread.

Indeed, just figured I'd mention it since we're talking about
pg_dump-related locking.

Thanks,

Stephen


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 17:21:14
Message-ID: CAOeZVier5OL7EV0a6iFBNGh_T+cg-UwNm8Jjwzx-qNuPz+=0GQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 4, 2014 at 10:20 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> > Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > > I don't have too much of an issue with the above, but I would like to
> > > have us figure out a solution to the deadlock problem with parallel
> > > pg_dump. The issue arises when pg_dump gets an AccessShareLock and
> then
> > > another process attempts to acquire an AccessExclusiveLock, which then
> > > blocks, and then the pg_dump worker process tries to get its
> > > AccessShareLock- we end up not being able to make any progress on
> > > anything at that point.
> >
> > The original ASL was acquired by the pg_dump master, right?
>
> Yup. It goes through and gets ASLs on everything first.
>
> > > One suggestion that was discussed at PGConf.EU was having processes
> > > which share the same snapshot (the pg_dump master and worker processes)
> > > able to either share the same locks or at least be able to "jump" the
> > > lock queue (that is, the worker process wouldn't have to wait being the
> > > AEL to get an ASL, since the ASL was already aquired for the snapshot
> > > which was exported and shared with it).
> >
> > Yeah, it seems like we need lock export not only snapshot export to make
> > this work nicely. But that sounds orthogonal to the issues being
> > discussed in this thread.
>
> Indeed, just figured I'd mention it since we're talking about
> pg_dump-related locking.
>
>
What happens for foreign key constraints? For pg_dump, do we lock the
tables referenced by the table which is being dumped right now? If that is
the case, wouldnt MVCC based approach be the best way for this?

Please ignore if I said anything silly. I am just trying to understand how
it works here.

Regards,

Atri


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-04 17:50:17
Message-ID: 20140304175017.GB27273@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-03-04 11:40:10 -0500, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes: > I think this is all too
> late for 9.4, though.
>
> I agree with the feeling that a meaningful fix for pg_dump isn't going
> to get done for 9.4. So that leaves us with the alternatives of (1)
> put off the lock-strength-reduction patch for another year; (2) push
> it anyway and accept a reduction in pg_dump reliability.
>
> I don't care for (2). I'd like to have lock strength reduction as
> much as anybody, but it can't come at the price of reduction of
> reliability.

I am sorry, but I think this is vastly overstating the scope of the
pg_dump problem. CREATE INDEX *already* doesn't require a AEL, and the
amount of problems that has caused in the past is surprisingly low. If
such a frequently used command doesn't cause problems, why are you
assuming other commands to be that problematic? And I think it's hard to
argue that the proposed changes are more likely to cause problems.

Let's try to go at this a bit more methodically. The commands that -
afaics - change their locklevel due to latest patch (v21) are:

01) ALTER TABLE .. ADD CONSTRAINT
02) ALTER TABLE .. ADD CONSTRAINT ... USING
03) ALTER TABLE .. ENABLE | DISABLE [ REPLICA | ALWAYS ] TRIGGER [ ALL ]
04) ALTER TABLE .. ALTER CONSTRAINT
05) ALTER TABLE .. REPLICA IDENTITY
06) ALTER TABLE .. ALTER COLUMN .. SET NOT NULL (*not* DROP NULL)
cmd_lockmode = ShareRowExclusiveLock;

07) ALTER TABLE ... ALTER COLUMN ... SET STATISTICS
08) ALTER TABLE ... CLUSTER ON ...
09) ALTER TABLE ... SET WITHOUT CLUSTER
10) ALTER TABLE ... SET (...)
11) ALTER TABLE ... RESET (...)
12) ALTER TABLE ... ALTER COLUMN ... SET (...)
13) ALTER TABLE ... ALTER COLUMN ... RESET (...)
14) ALTER TABLE ... VALIDATE CONSTRAINT constraint_name
cmd_lockmode = ShareUpdateExclusiveLock;

I have my reservations about ADD CONSTRAINT (including SET NOT NULL)
being unproblematic (mostly because I haven't thought through possible
consquences for the planner making different choices with added
constraints).

From the perspective of pg_dump consistency, except ADD CONSTRAINT, none
of those seem to have graver possible consequences than CREATE INDEX
(and DROP INDEX CONCURRENTLY) already being unsafe.

In fact all of those should actually end up being *safer*, even ending
up always being dumped consistently since they are all reconstructed
clientside by pg_dump.
You argue elsewhere that that's a fragile coincidence. But so what, even
if it changes, the consequences still are going to be *far* less
significant than missing various index, trigger, and whatnot commands.

I think the set of problems you mention are going to be really important
when we someday get around to make stuff like ALTER TABLE ... ADD/DROP
COLUMN require lower lock levels, but that's not what's proposed.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7
Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 19:39:55
Message-ID: CA+U5nML2maQ38d1Zx-Z3JZyaZ6wHk2bqOXwOgB4PbVkDg-aJ1Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 4 March 2014 16:27, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> One concern is schema changes that make a dump unrestorable, for
>> instance if there's a foreign key relationship between tables A and B,
>
> Yeah. Ideally, what pg_dump would produce would be a consistent snapshot
> of the database state as of its transaction snapshot time. We have always
> had that guarantee so far as user data was concerned, but it's been shaky
> (and getting worse) so far as the database schema is concerned. What
> bothers me about the current patch is that it's going to make it a whole
> lot more worse.

While thinking this through it occurs to me that there is no problem at all.

Your earlier claim that the dump is inconsistent just isn't accurate.
We now have MVCC catalogs, so any dump is going to see a perfectly
consistent set of data plus DDL. OK the catalogs may change AFTER the
snapshot was taken for the dump, but then so can the data change -
that's just MVCC.

I was going to add an option to increase lock level, but I can't see
why you'd want it even. The dumps are consistent...

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>,Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 19:43:08
Message-ID: db27916d-b55f-4fba-8613-a8574b0bc815@email.android.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On March 4, 2014 8:39:55 PM CET, Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
>On 4 March 2014 16:27, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>>> One concern is schema changes that make a dump unrestorable, for
>>> instance if there's a foreign key relationship between tables A and
>B,
>>
>> Yeah. Ideally, what pg_dump would produce would be a consistent
>snapshot
>> of the database state as of its transaction snapshot time. We have
>always
>> had that guarantee so far as user data was concerned, but it's been
>shaky
>> (and getting worse) so far as the database schema is concerned. What
>> bothers me about the current patch is that it's going to make it a
>whole
>> lot more worse.
>
>While thinking this through it occurs to me that there is no problem at
>all.
>
>Your earlier claim that the dump is inconsistent just isn't accurate.
>We now have MVCC catalogs, so any dump is going to see a perfectly
>consistent set of data plus DDL. OK the catalogs may change AFTER the
>snapshot was taken for the dump, but then so can the data change -
>that's just MVCC.
>
>I was going to add an option to increase lock level, but I can't see
>why you'd want it even. The dumps are consistent...

Mvcc scans only guarantee that individual scans are consistent, not that separate scans are. Each individual scan takes a new snapshot if there's been ddl.

Andres

--
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-04 20:08:37
Message-ID: 13336.1393963717@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2014-03-04 11:40:10 -0500, Tom Lane wrote:
>> I don't care for (2). I'd like to have lock strength reduction as
>> much as anybody, but it can't come at the price of reduction of
>> reliability.

> I am sorry, but I think this is vastly overstating the scope of the
> pg_dump problem. CREATE INDEX *already* doesn't require a AEL, and the
> amount of problems that has caused in the past is surprisingly low.

CREATE INDEX happens to be okay because pg_dump won't try to dump indexes
it doesn't see in its snapshot, ie the list of indexes to dump is created
client-side. CREATE INDEX CONCURRENTLY, otoh, already did break pg_dump,
and we had to hack things to fix it; see commit
683abc73dff549e94555d4020dae8d02f32ed78b.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 20:54:24
Message-ID: CA+TgmoYqQMUijqzyLAn6W-QzGkMTm_XoXDB3BR4wiPpBmtASNQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 4, 2014 at 2:39 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 4 March 2014 16:27, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>>> One concern is schema changes that make a dump unrestorable, for
>>> instance if there's a foreign key relationship between tables A and B,
>>
>> Yeah. Ideally, what pg_dump would produce would be a consistent snapshot
>> of the database state as of its transaction snapshot time. We have always
>> had that guarantee so far as user data was concerned, but it's been shaky
>> (and getting worse) so far as the database schema is concerned. What
>> bothers me about the current patch is that it's going to make it a whole
>> lot more worse.
>
> While thinking this through it occurs to me that there is no problem at all.
>
> Your earlier claim that the dump is inconsistent just isn't accurate.
> We now have MVCC catalogs, so any dump is going to see a perfectly
> consistent set of data plus DDL. OK the catalogs may change AFTER the
> snapshot was taken for the dump, but then so can the data change -
> that's just MVCC.

Unfortunately, this isn't correct. The MVCC snapshots taken for
catalog scans are "instantaneous"; that is, we take a new, current
snapshot for each catalog scan. If all of the ruleutils.c stuff were
using the transaction snapshot rather than instantaneous snapshots,
this would be right. But as has been previously discussed, that's not
the case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 21:37:48
Message-ID: 15067.1393969068@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Mar 4, 2014 at 2:39 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> Your earlier claim that the dump is inconsistent just isn't accurate.
>> We now have MVCC catalogs, so any dump is going to see a perfectly
>> consistent set of data plus DDL. OK the catalogs may change AFTER the
>> snapshot was taken for the dump, but then so can the data change -
>> that's just MVCC.

> Unfortunately, this isn't correct. The MVCC snapshots taken for
> catalog scans are "instantaneous"; that is, we take a new, current
> snapshot for each catalog scan. If all of the ruleutils.c stuff were
> using the transaction snapshot rather than instantaneous snapshots,
> this would be right. But as has been previously discussed, that's not
> the case.

Yeah. And that's *necessary* for catalog lookups in a normally
functioning backend, because we have to see latest data (eg, it wouldn't
do for a backend to fail to enforce a just-added CHECK constraint because
it was committed after the backend's transaction started).

However, it seems possible that we could have a mode in which a read-only
session did all its catalog fetches according to the transaction snapshot.
That would get us to a situation where the backend-internal lookups that
ruleutils relies on would give the same answers as queries done by
pg_dump. Robert's work on getting rid of SnapshotNow has probably moved
that much closer than it was before, but it's still not exactly a trivial
patch.

Meanwhile, Andres claimed upthread that none of the currently-proposed
reduced-lock ALTER commands affect data that pg_dump is using ruleutils
to fetch. If that's the case, then maybe this is a problem that we can
punt till later. I've not gone through the list to verify it though.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 22:29:31
Message-ID: 531653CB.7050406@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/04/2014 11:43 AM, Andres Freund wrote:
> On March 4, 2014 8:39:55 PM CET, Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
>> I was going to add an option to increase lock level, but I can't see
>> why you'd want it even. The dumps are consistent...
>
> Mvcc scans only guarantee that individual scans are consistent, not that separate scans are. Each individual scan takes a new snapshot if there's been ddl.
>
> Andres
>

I thought that we were sharing the same snapshot, for parallel dump?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 22:39:01
Message-ID: 20140304223901.GD27273@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-03-04 14:29:31 -0800, Josh Berkus wrote:
> On 03/04/2014 11:43 AM, Andres Freund wrote:
> > On March 4, 2014 8:39:55 PM CET, Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
> >> I was going to add an option to increase lock level, but I can't see
> >> why you'd want it even. The dumps are consistent...
> >
> > Mvcc scans only guarantee that individual scans are consistent, not that separate scans are. Each individual scan takes a new snapshot if there's been ddl.

> I thought that we were sharing the same snapshot, for parallel dump?

That snapshot is about data, not the catalog. And no, we can't easily
reuse one for the other, see elsewhere in this thread for some of the
reasons.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Greg Stark <stark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-04 22:41:55
Message-ID: CAM-w4HM-JVTSs1czXQSZUpE8PD+egv10q0E7qGVrc9aC9mvNgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 4, 2014 at 8:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> CREATE INDEX CONCURRENTLY, otoh, already did break pg_dump,
> and we had to hack things to fix it; see commit
> 683abc73dff549e94555d4020dae8d02f32ed78b.

Well pg_dump was only broken in that there was a new catalog state to
deal with. But the commit you linked to was fixing pg_upgrade which
was broken because the on-disk schema was then out of sync with what
pg_dump would generate. But that should only matter for creating or
deleting whole relations.
--
greg


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 22:43:55
Message-ID: 20140304224355.GE27273@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-03-04 16:37:48 -0500, Tom Lane wrote:
> However, it seems possible that we could have a mode in which a read-only
> session did all its catalog fetches according to the transaction snapshot.
> That would get us to a situation where the backend-internal lookups that
> ruleutils relies on would give the same answers as queries done by
> pg_dump. Robert's work on getting rid of SnapshotNow has probably moved
> that much closer than it was before, but it's still not exactly a trivial
> patch.

The most interesting bit seems to be the cache invalidation handling. It
would need to be something like PushCatalogSnapshot() which disables
applying invals, including catchup interrupts and all. If the sinval
queue hasn't overflown while that snapshot was up, everything is fine we
just need to apply all pending invalidations, if it has, we need to do a
InvalidateSystemCaches().

> Meanwhile, Andres claimed upthread that none of the currently-proposed
> reduced-lock ALTER commands affect data that pg_dump is using ruleutils
> to fetch. If that's the case, then maybe this is a problem that we can
> punt till later. I've not gone through the list to verify it though.

I think it's true for all but T_AddConstraint, but I am wary about that
one anyway. But somebody else should definitely check the list.

I wonder if AddConstraintUsing would possibly be safe...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-04 22:46:12
Message-ID: 16437.1393973172@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)mit(dot)edu> writes:
> On Tue, Mar 4, 2014 at 8:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> CREATE INDEX CONCURRENTLY, otoh, already did break pg_dump,
>> and we had to hack things to fix it; see commit
>> 683abc73dff549e94555d4020dae8d02f32ed78b.

> Well pg_dump was only broken in that there was a new catalog state to
> deal with. But the commit you linked to was fixing pg_upgrade which
> was broken because the on-disk schema was then out of sync with what
> pg_dump would generate.

No, it was fixing cases that would cause problems with or without
pg_upgrade. Arguably that patch made it worse for pg_upgrade, which
needed a followon patch (203d8ae2d).

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-04 22:53:59
Message-ID: 16596.1393973639@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2014-03-04 16:37:48 -0500, Tom Lane wrote:
>> However, it seems possible that we could have a mode in which a read-only
>> session did all its catalog fetches according to the transaction snapshot.
>> That would get us to a situation where the backend-internal lookups that
>> ruleutils relies on would give the same answers as queries done by
>> pg_dump. Robert's work on getting rid of SnapshotNow has probably moved
>> that much closer than it was before, but it's still not exactly a trivial
>> patch.

> The most interesting bit seems to be the cache invalidation handling. It
> would need to be something like PushCatalogSnapshot() which disables
> applying invals, including catchup interrupts and all. If the sinval
> queue hasn't overflown while that snapshot was up, everything is fine we
> just need to apply all pending invalidations, if it has, we need to do a
> InvalidateSystemCaches().

Yeah, at least within the transaction we would simply ignore invals.
To avoid causing sinval queue overrun (which would hurt everyone
system-wide), my inclination would be to just drop invals on the floor all
the time when in this mode, and forcibly do InvalidateSystemCaches at
transaction end. For pg_dump, at least, there is no value in working any
harder than that, since it's going to quit at transaction end anyhow.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-05 09:28:44
Message-ID: CA+U5nM+anhaNZUUGh+idoMVfyZGKcpyN9PaGTP4yRuHtnOX3fw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 4 March 2014 21:37, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Tue, Mar 4, 2014 at 2:39 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> Your earlier claim that the dump is inconsistent just isn't accurate.
>>> We now have MVCC catalogs, so any dump is going to see a perfectly
>>> consistent set of data plus DDL. OK the catalogs may change AFTER the
>>> snapshot was taken for the dump, but then so can the data change -
>>> that's just MVCC.
>
>> Unfortunately, this isn't correct. The MVCC snapshots taken for
>> catalog scans are "instantaneous"; that is, we take a new, current
>> snapshot for each catalog scan. If all of the ruleutils.c stuff were
>> using the transaction snapshot rather than instantaneous snapshots,
>> this would be right. But as has been previously discussed, that's not
>> the case.
>
> Yeah. And that's *necessary* for catalog lookups in a normally
> functioning backend, because we have to see latest data (eg, it wouldn't
> do for a backend to fail to enforce a just-added CHECK constraint because
> it was committed after the backend's transaction started).

OK, thanks for explaining. A valuable point to note for us all.

> However, it seems possible that we could have a mode in which a read-only
> session did all its catalog fetches according to the transaction snapshot.
> That would get us to a situation where the backend-internal lookups that
> ruleutils relies on would give the same answers as queries done by
> pg_dump. Robert's work on getting rid of SnapshotNow has probably moved
> that much closer than it was before, but it's still not exactly a trivial
> patch.
>
> Meanwhile, Andres claimed upthread that none of the currently-proposed
> reduced-lock ALTER commands affect data that pg_dump is using ruleutils
> to fetch. If that's the case, then maybe this is a problem that we can
> punt till later. I've not gone through the list to verify it though.

So that returns us to solving the catalog consistency problem in
pg_dump and similar applications.

We could

(1) change the lock taken by pg_dump to be ShareUpdateExclusive. As
discussed, this would be optional. (Trivial implementation)

The catalog accesses are all in a rather isolated piece of code in
pg_dump and run for a short period. That allows us to consider locking
*always* at ShareUpdateExclusive but only for the period of catalog
access and then release the higher level lock before transaction end.
Since pg_dump is a client program any action we take to resolve this
would need to be done in a user accessible way. That is acceptable
since there may be other user programs that wish/need to read a
consistent view of the definition of a table. This can be implemented
in a few ways:

(2) Implement a server function that allows you to lock a table for a
short duration. e.g. pg_lock_catalog(Oid) and pg_unlock_catalog(Oid).
We can already do this in server-side code, so this is simply a matter
of exposing that same functionality for users.

(3) A new variant of the LOCK command: LOCK CATALOG FOR tablename IN
lock mode MODE NOWAIT, which then would have a matching UNLOCK CATALOG
FOR tablename command. This is just a sugar coated version of (2).

(4) Implement functions to suspend invalidation message handling for a
short period. That's basically the same as (2) in profile. My feeling
is that sounds rather dangerous and not something I'd want to go near
now in in the future.

We tried to avoid locking the catalog some years back, which is how we
went off down this MVCC catalog access, which now seems to have been
something of a red-shifted herring. ISTM that the user would need to
specifically request a "consistent catalog".

Using (2) in pg_dump is pretty easy - patch attached. So we can solve
this problem completely in about another 1 hour of work, so I suggest
we implement (2) and be done.

Happy to document this in a new subsection of docs to describe how to
dump a consistent view of a database object from a user application.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
lock_catalog_for_pgdump.v1.patch application/octet-stream 3.1 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-06 09:47:04
Message-ID: CA+U5nMJNg3aHfXiestbSA0fst_3bEVo4To1J+nwTkEqG9rjrRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 5 March 2014 09:28, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> So that returns us to solving the catalog consistency problem in
> pg_dump and similar applications.

No answer, guys, and time is ticking away here.

I'd like to get a communal solution to this rather than just punting
the whole patch.

If we have to strip it down to the bar essentials, so be it. For me,
the biggest need here is to make VALIDATE CONSTRAINT take only a
ShareUpdateExclusiveLock while it runs. Almost everything else needs a
full AccessExclusiveLock anyway, or doesn't run for very long so isn't
a critical problem. (Perhaps we can then wrap ADD CONSTRAINT ... NOT
VALID and VALIDATE into a single command using the CONCURRENTLY
keyword so it runs two transactions to complete the task).

Validating FKs on big tables can take hours and it really isn't
acceptable for us to lock out access while we do that. FKs are
*supposed* to be a major reason people use RDBMS, so keeping them in a
state where they are effectively unusable is a major debilitating
point against adoption of PostgreSQL.

If there are issues with pg_dump we can just document them.

Guide me with your thoughts.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-06 22:43:40
Message-ID: 20140306224340.GA3551655@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 04, 2014 at 06:50:17PM +0100, Andres Freund wrote:
> On 2014-03-04 11:40:10 -0500, Tom Lane wrote:
> > Robert Haas <robertmhaas(at)gmail(dot)com> writes: > I think this is all too
> > late for 9.4, though.
> >
> > I agree with the feeling that a meaningful fix for pg_dump isn't going
> > to get done for 9.4. So that leaves us with the alternatives of (1)
> > put off the lock-strength-reduction patch for another year; (2) push
> > it anyway and accept a reduction in pg_dump reliability.
> >
> > I don't care for (2). I'd like to have lock strength reduction as
> > much as anybody, but it can't come at the price of reduction of
> > reliability.
>
> I am sorry, but I think this is vastly overstating the scope of the
> pg_dump problem. CREATE INDEX *already* doesn't require a AEL, and the
> amount of problems that has caused in the past is surprisingly low. If
> such a frequently used command doesn't cause problems, why are you
> assuming other commands to be that problematic? And I think it's hard to
> argue that the proposed changes are more likely to cause problems.
>
> Let's try to go at this a bit more methodically. The commands that -
> afaics - change their locklevel due to latest patch (v21) are:
[snip]

Good analysis. The hazards arise when pg_dump uses one of the ruleutils.c
deparse worker functions. As a cross-check to your study, I looked briefly at
the use of those functions in pg_dump and how this patch might affect them:

-- pg_get_constraintdef()

pg_dump reads the constraint OID with its transaction snapshot, so we will
never see a too-new constraint. Dropping a constraint still requires
AccessExclusiveLock.

Concerning VALIDATE CONSTRAINT, pg_dump reads convalidated with its
transaction snapshot and uses that to decide whether to dump the CHECK
constraint as part of the CREATE TABLE or as a separate ALTER TABLE ADD
CONSTRAINT following the data load. However, pg_get_constraintdef() reads the
latest convalidated to decide whether to emit NOT VALID. Consequently, one
can get a dump in which the dumped table data did not yet conform to the
constraint, and the ALTER TABLE ADD CONSTRAINT (w/o NOT VALID) fails.
(Suppose you deleted the last invalid rows just before executing the VALIDATE
CONSTRAINT. I tested this by committing the DELETE + VALIDATE CONSTRAINT with
pg_dump stopped at getTableAttrs().)

One hacky, but maintainable and effective, solution to the VALIDATE CONSTRAINT
problem is to have pg_dump tack on a NOT VALID if pg_get_constraintdef() did
not do so. It's, conveniently, the last part of the definition. I would tend
to choose this. We could also just decide this isn't bad enough to worry
about. The consequence is that an ALTER TABLE ADD CONSTRAINT fails. Assuming
no --single-transaction for the original restoral, you just add NOT VALID to
the command and rerun. Like most of the potential new pg_dump problems, this
can already happen today if the relevant database changes happen between
taking the pg_dump transaction snapshot and locking the tables.

-- pg_get_expr() for default expressions

pg_dump reads pg_attrdef.adbin using its transaction snapshot, so it will
never see a too-new default. This does allow us to read a dropped default.
That's not a problem directly. However, suppose the default references a
function dropped at the same time as the default. pg_dump could fail in
pg_get_expr().

-- pg_get_indexdef()

As you explained elsewhere, new indexes are no problem. DROP INDEX still
requires AccessExclusiveLock. Overall, no problems here.

-- pg_get_ruledef()

The patch changes lock requirements for enabling and disabling of rules, but
that is all separate from the rule expression handling. No problems.

-- pg_get_triggerdef()

The patch reduces CREATE TRIGGER and DROP TRIGGER to ShareUpdateExclusiveLock.
The implications for pg_dump are similar to those for pg_get_expr().

-- pg_get_viewdef()

Untamed: pg_dump does not lock views at all.

One thing not to forget is that you can always get the old mutual exclusion
back by issuing LOCK TABLE just before a DDL operation. If some unlucky user
regularly gets pg_dump failures due to concurrent DROP TRIGGER, he has a
workaround. There's no comparable way for someone who would not experience
that problem to weaken the now-hardcoded AccessExclusiveLock. Many
consequences of insufficient locking are too severe for that workaround to
bring comfort, but the pg_dump failure scenarios around pg_get_expr() and
pg_get_triggerdef() seem mild enough. Restore-time failures are more serious,
hence my recommendation to put a hack in pg_dump around VALIDATE CONSTRAINT.

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-06 23:49:16
Message-ID: CA+U5nMKrnGPBn-0kiXZvMkSUPSRjBZjFjuMPr-t1_CTCFLEQ1w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 6 March 2014 22:43, Noah Misch <noah(at)leadboat(dot)com> wrote:

> Good analysis. The hazards arise when pg_dump uses one of the ruleutils.c
> deparse worker functions.

Ah, good. We're thinking along the same lines. I was already working
on this analysis also. I'll complete mine and then compare notes.

> One thing not to forget is that you can always get the old mutual exclusion
> back by issuing LOCK TABLE just before a DDL operation. If some unlucky user
> regularly gets pg_dump failures due to concurrent DROP TRIGGER, he has a
> workaround. There's no comparable way for someone who would not experience
> that problem to weaken the now-hardcoded AccessExclusiveLock.

Good point.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-07 09:04:04
Message-ID: CA+U5nM+5vr2Swtqb7AKT0EYhY9CUOvA7K=E8ab5+o0KgpFGbmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 6 March 2014 22:43, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Tue, Mar 04, 2014 at 06:50:17PM +0100, Andres Freund wrote:
>> On 2014-03-04 11:40:10 -0500, Tom Lane wrote:
>> > Robert Haas <robertmhaas(at)gmail(dot)com> writes: > I think this is all too
>> > late for 9.4, though.
>> >
>> > I agree with the feeling that a meaningful fix for pg_dump isn't going
>> > to get done for 9.4. So that leaves us with the alternatives of (1)
>> > put off the lock-strength-reduction patch for another year; (2) push
>> > it anyway and accept a reduction in pg_dump reliability.
>> >
>> > I don't care for (2). I'd like to have lock strength reduction as
>> > much as anybody, but it can't come at the price of reduction of
>> > reliability.
>>
>> I am sorry, but I think this is vastly overstating the scope of the
>> pg_dump problem. CREATE INDEX *already* doesn't require a AEL, and the
>> amount of problems that has caused in the past is surprisingly low. If
>> such a frequently used command doesn't cause problems, why are you
>> assuming other commands to be that problematic? And I think it's hard to
>> argue that the proposed changes are more likely to cause problems.
>>
>> Let's try to go at this a bit more methodically. The commands that -
>> afaics - change their locklevel due to latest patch (v21) are:
> [snip]
>
> Good analysis. The hazards arise when pg_dump uses one of the ruleutils.c
> deparse worker functions. As a cross-check to your study, I looked briefly at
> the use of those functions in pg_dump and how this patch might affect them:
>
> -- pg_get_constraintdef()
>
> pg_dump reads the constraint OID with its transaction snapshot, so we will
> never see a too-new constraint. Dropping a constraint still requires
> AccessExclusiveLock.

Agreed

> Concerning VALIDATE CONSTRAINT, pg_dump reads convalidated with its
> transaction snapshot and uses that to decide whether to dump the CHECK
> constraint as part of the CREATE TABLE or as a separate ALTER TABLE ADD
> CONSTRAINT following the data load. However, pg_get_constraintdef() reads the
> latest convalidated to decide whether to emit NOT VALID. Consequently, one
> can get a dump in which the dumped table data did not yet conform to the
> constraint, and the ALTER TABLE ADD CONSTRAINT (w/o NOT VALID) fails.
> (Suppose you deleted the last invalid rows just before executing the VALIDATE
> CONSTRAINT. I tested this by committing the DELETE + VALIDATE CONSTRAINT with
> pg_dump stopped at getTableAttrs().)
>
> One hacky, but maintainable and effective, solution to the VALIDATE CONSTRAINT
> problem is to have pg_dump tack on a NOT VALID if pg_get_constraintdef() did
> not do so. It's, conveniently, the last part of the definition. I would tend
> to choose this. We could also just decide this isn't bad enough to worry
> about. The consequence is that an ALTER TABLE ADD CONSTRAINT fails. Assuming
> no --single-transaction for the original restoral, you just add NOT VALID to
> the command and rerun. Like most of the potential new pg_dump problems, this
> can already happen today if the relevant database changes happen between
> taking the pg_dump transaction snapshot and locking the tables.

Too hacky for me, but some good thinking. My proposed solution is below.

> -- pg_get_expr() for default expressions
>
> pg_dump reads pg_attrdef.adbin using its transaction snapshot, so it will
> never see a too-new default. This does allow us to read a dropped default.
> That's not a problem directly. However, suppose the default references a
> function dropped at the same time as the default. pg_dump could fail in
> pg_get_expr().
>
> -- pg_get_indexdef()
>
> As you explained elsewhere, new indexes are no problem. DROP INDEX still
> requires AccessExclusiveLock. Overall, no problems here.
>
> -- pg_get_ruledef()
>
> The patch changes lock requirements for enabling and disabling of rules, but
> that is all separate from the rule expression handling. No problems.
>
> -- pg_get_triggerdef()
>
> The patch reduces CREATE TRIGGER and DROP TRIGGER to ShareUpdateExclusiveLock.
> The implications for pg_dump are similar to those for pg_get_expr().

These are certainly concerning. What surprises me the most is that
pg_dump has been so happy to randomly mix SQL using the transaction
snapshot with sys cache access code using a different snapshot. If
that was intention there is no documentation in code or in the docs to
explain that.

> -- pg_get_viewdef()
>
> Untamed: pg_dump does not lock views at all.

OMG, its really a wonder pg_dump works at all.

> One thing not to forget is that you can always get the old mutual exclusion
> back by issuing LOCK TABLE just before a DDL operation. If some unlucky user
> regularly gets pg_dump failures due to concurrent DROP TRIGGER, he has a
> workaround. There's no comparable way for someone who would not experience
> that problem to weaken the now-hardcoded AccessExclusiveLock. Many
> consequences of insufficient locking are too severe for that workaround to
> bring comfort, but the pg_dump failure scenarios around pg_get_expr() and
> pg_get_triggerdef() seem mild enough. Restore-time failures are more serious,
> hence my recommendation to put a hack in pg_dump around VALIDATE CONSTRAINT.

The right thing to do here is to not push to the extremes. If we mess
too much with the ruleutil stuff it will just be buggy. A more
considered analysis in a later release is required for a full and
complete approach. As I indicated earlier, an 80/20 solution is better
for this release.

Slimming down the patch, I've removed changes to lock levels for
almost all variants. The only lock levels now reduced are those for
VALIDATE, plus setting of relation and attribute level options.

VALIDATE is implemented by calling pg_get_constraintdef_mvcc(), a
slightly modified variant of pg_get_constraintdef that uses the
transaction snapshot. I propose this rather than Noah's solution
solely because this will allow any user to request the MVCC data,
rather than implement a hack that only works for pg_dump. I will post
the patch later today.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-08 11:14:30
Message-ID: CA+U5nMLuAUyzsf7XLaE78cDAeDGmPKtmBXs7pAZmy0QW_mGs7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 7 March 2014 09:04, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> The right thing to do here is to not push to the extremes. If we mess
> too much with the ruleutil stuff it will just be buggy. A more
> considered analysis in a later release is required for a full and
> complete approach. As I indicated earlier, an 80/20 solution is better
> for this release.
>
> Slimming down the patch, I've removed changes to lock levels for
> almost all variants. The only lock levels now reduced are those for
> VALIDATE, plus setting of relation and attribute level options.
>
> VALIDATE is implemented by calling pg_get_constraintdef_mvcc(), a
> slightly modified variant of pg_get_constraintdef that uses the
> transaction snapshot. I propose this rather than Noah's solution
> solely because this will allow any user to request the MVCC data,
> rather than implement a hack that only works for pg_dump. I will post
> the patch later today.

Implemented in attached patch, v22

The following commands (only) are allowed with
ShareUpdateExclusiveLock, patch includes doc changes.

ALTER TABLE ... VALIDATE CONSTRAINT constraint_name
covered by isolation test, plus verified manually with pg_dump

ALTER TABLE ... ALTER COLUMN ... SET STATISTICS
ALTER TABLE ... ALTER COLUMN ... SET (...)
ALTER TABLE ... ALTER COLUMN ... RESET (...)

ALTER TABLE ... CLUSTER ON ...
ALTER TABLE ... SET WITHOUT CLUSTER
ALTER TABLE ... SET (...)
covered by isolation test

ALTER TABLE ... RESET (...)

ALTER INDEX ... SET (...)
ALTER INDEX ... RESET (...)

All other ALTER commands take AccessExclusiveLock

I commend this patch to you for final review; I would like to commit
this in a few days.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
reduce_lock_levels.v22.patch application/octet-stream 178.0 KB

From: Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-10 00:52:26
Message-ID: 531D0CCA.6000802@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/06/2014 10:47 AM, Simon Riggs wrote:
> On 5 March 2014 09:28, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
>> So that returns us to solving the catalog consistency problem in
>> pg_dump and similar applications.
> No answer, guys, and time is ticking away here.

Sorry, I've accumulated several days of backlog (and it'll only get
worse in the coming week) so I haven't had time to look at all this in
the detail I wanted to.

> I'd like to get a communal solution to this rather than just punting
> the whole patch.
>
> If we have to strip it down to the bar essentials, so be it. For me,
> the biggest need here is to make VALIDATE CONSTRAINT take only a
> ShareUpdateExclusiveLock while it runs. Almost everything else needs a
> full AccessExclusiveLock anyway, or doesn't run for very long so isn't
> a critical problem. (Perhaps we can then wrap ADD CONSTRAINT ... NOT
> VALID and VALIDATE into a single command using the CONCURRENTLY
> keyword so it runs two transactions to complete the task).
>
> Validating FKs on big tables can take hours and it really isn't
> acceptable for us to lock out access while we do that. FKs are
> *supposed* to be a major reason people use RDBMS, so keeping them in a
> state where they are effectively unusable is a major debilitating
> point against adoption of PostgreSQL.
>
> If there are issues with pg_dump we can just document them.
>
> Guide me with your thoughts.

I think committing VALIDATE CONSTRAINT is essential for 9.4; the rest
can be delayed until 9.5. None of the discussion in this thread has
been about that subcommand, and I don't personally see a problem with it.

I don't care much about ADD CONSTRAINT CONCURRENTLY. If it's there,
fine. If it's not, that's fine, too.

My personal use case for this, and I even started writing a patch before
I realized I was re-writing this one, is adding a CHECK constraint NOT
VALID so that future commits respect it, then UPDATEing the existing
rows to "fix" them, and then VALIDATE CONSTRAINTing it. There is zero
need for an AccessExclusiveLock on that last step.

My original idea was to concurrently create a partial index on the "bad"
rows, and then validate the constraint using that index. The AEL would
only be held long enough to check if the index is empty or not.
Obviously, reducing the lock level is a cleaner solution, so I'd like to
see that happen.

--
Vik


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-18 10:39:03
Message-ID: CA+U5nM+Zbejr9Cj8OkCE8k2wjCQ99ofhzUzUPVgsqsk91vu59w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 8 March 2014 11:14, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 7 March 2014 09:04, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
>> The right thing to do here is to not push to the extremes. If we mess
>> too much with the ruleutil stuff it will just be buggy. A more
>> considered analysis in a later release is required for a full and
>> complete approach. As I indicated earlier, an 80/20 solution is better
>> for this release.
>>
>> Slimming down the patch, I've removed changes to lock levels for
>> almost all variants. The only lock levels now reduced are those for
>> VALIDATE, plus setting of relation and attribute level options.
>>
>> VALIDATE is implemented by calling pg_get_constraintdef_mvcc(), a
>> slightly modified variant of pg_get_constraintdef that uses the
>> transaction snapshot. I propose this rather than Noah's solution
>> solely because this will allow any user to request the MVCC data,
>> rather than implement a hack that only works for pg_dump. I will post
>> the patch later today.
>
> Implemented in attached patch, v22
>
> The following commands (only) are allowed with
> ShareUpdateExclusiveLock, patch includes doc changes.
>
> ALTER TABLE ... VALIDATE CONSTRAINT constraint_name
> covered by isolation test, plus verified manually with pg_dump
>
> ALTER TABLE ... ALTER COLUMN ... SET STATISTICS
> ALTER TABLE ... ALTER COLUMN ... SET (...)
> ALTER TABLE ... ALTER COLUMN ... RESET (...)
>
> ALTER TABLE ... CLUSTER ON ...
> ALTER TABLE ... SET WITHOUT CLUSTER
> ALTER TABLE ... SET (...)
> covered by isolation test
>
> ALTER TABLE ... RESET (...)
>
> ALTER INDEX ... SET (...)
> ALTER INDEX ... RESET (...)
>
> All other ALTER commands take AccessExclusiveLock
>
> I commend this patch to you for final review; I would like to commit
> this in a few days.

I'm planning to commit this today at 1500UTC barring objections or
negative reviews.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Noah Misch <noah(at)leadboat(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-18 21:21:22
Message-ID: 20140318212122.GA3868905@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 18, 2014 at 10:39:03AM +0000, Simon Riggs wrote:
> On 8 March 2014 11:14, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > Implemented in attached patch, v22

> > I commend this patch to you for final review; I would like to commit
> > this in a few days.
>
> I'm planning to commit this today at 1500UTC barring objections or
> negative reviews.

Not an objection, but FYI, I'm currently in the midst of reviewing this.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-19 09:52:26
Message-ID: CA+U5nML03+vA_c5OucqFLCGjJeWK_5dqrobfRQZ-GgDAXxH9ZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 18 March 2014 21:21, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Tue, Mar 18, 2014 at 10:39:03AM +0000, Simon Riggs wrote:
>> On 8 March 2014 11:14, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> > Implemented in attached patch, v22
>
>> > I commend this patch to you for final review; I would like to commit
>> > this in a few days.
>>
>> I'm planning to commit this today at 1500UTC barring objections or
>> negative reviews.
>
> Not an objection, but FYI, I'm currently in the midst of reviewing this.

Thanks, I'll holdoff until I hear from you.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-20 08:00:50
Message-ID: 532AA032.8070207@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/18/2014 11:39 AM, Simon Riggs wrote:
> On 8 March 2014 11:14, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On 7 March 2014 09:04, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>
>>> The right thing to do here is to not push to the extremes. If we mess
>>> too much with the ruleutil stuff it will just be buggy. A more
>>> considered analysis in a later release is required for a full and
>>> complete approach. As I indicated earlier, an 80/20 solution is better
>>> for this release.
>>>
>>> Slimming down the patch, I've removed changes to lock levels for
>>> almost all variants. The only lock levels now reduced are those for
>>> VALIDATE, plus setting of relation and attribute level options.
>>>
>>> VALIDATE is implemented by calling pg_get_constraintdef_mvcc(), a
>>> slightly modified variant of pg_get_constraintdef that uses the
>>> transaction snapshot. I propose this rather than Noah's solution
>>> solely because this will allow any user to request the MVCC data,
>>> rather than implement a hack that only works for pg_dump. I will post
>>> the patch later today.
>> Implemented in attached patch, v22
>>
>> The following commands (only) are allowed with
>> ShareUpdateExclusiveLock, patch includes doc changes.
>>
>> ALTER TABLE ... VALIDATE CONSTRAINT constraint_name
>> covered by isolation test, plus verified manually with pg_dump
>>
>> ALTER TABLE ... ALTER COLUMN ... SET STATISTICS
>> ALTER TABLE ... ALTER COLUMN ... SET (...)
>> ALTER TABLE ... ALTER COLUMN ... RESET (...)
>>
>> ALTER TABLE ... CLUSTER ON ...
>> ALTER TABLE ... SET WITHOUT CLUSTER
>> ALTER TABLE ... SET (...)
>> covered by isolation test
>>
>> ALTER TABLE ... RESET (...)
>>
>> ALTER INDEX ... SET (...)
>> ALTER INDEX ... RESET (...)
>>
>> All other ALTER commands take AccessExclusiveLock
>>
>> I commend this patch to you for final review; I would like to commit
>> this in a few days.
> I'm planning to commit this today at 1500UTC barring objections or
> negative reviews.

At my current level of competence, this patch looks good to me. I'm
looking forward to reading Noah's review to see what I may have missed.

The attached patch fixes two typos in the code comments.

--
Vik

Attachment Content-Type Size
typos.patch text/x-diff 1.6 KB

From: Noah Misch <noah(at)leadboat(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-21 03:45:56
Message-ID: 20140321034556.GA3927180@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Mar 08, 2014 at 11:14:30AM +0000, Simon Riggs wrote:
> On 7 March 2014 09:04, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > The right thing to do here is to not push to the extremes. If we mess
> > too much with the ruleutil stuff it will just be buggy. A more
> > considered analysis in a later release is required for a full and
> > complete approach. As I indicated earlier, an 80/20 solution is better
> > for this release.
> >
> > Slimming down the patch, I've removed changes to lock levels for
> > almost all variants. The only lock levels now reduced are those for
> > VALIDATE, plus setting of relation and attribute level options.

Good call.

> The following commands (only) are allowed with
> ShareUpdateExclusiveLock, patch includes doc changes.
>
> ALTER TABLE ... VALIDATE CONSTRAINT constraint_name
> covered by isolation test, plus verified manually with pg_dump

I found a pre-existing bug aggravated by this, which I will shortly report on
a new thread.

> ALTER TABLE ... ALTER COLUMN ... SET STATISTICS
> ALTER TABLE ... ALTER COLUMN ... SET (...)
> ALTER TABLE ... ALTER COLUMN ... RESET (...)
>
> ALTER TABLE ... CLUSTER ON ...
> ALTER TABLE ... SET WITHOUT CLUSTER

A comment at check_index_is_clusterable() still mentions "exclusive lock".

> ALTER TABLE ... SET (...)
> covered by isolation test
>
> ALTER TABLE ... RESET (...)
>
> ALTER INDEX ... SET (...)
> ALTER INDEX ... RESET (...)

See discussion below.

> All other ALTER commands take AccessExclusiveLock

> --- a/doc/src/sgml/mvcc.sgml
> +++ b/doc/src/sgml/mvcc.sgml
> @@ -865,7 +865,8 @@ ERROR: could not serialize access due to read/write dependencies among transact
> <para>
> Acquired by <command>VACUUM</command> (without <option>FULL</option>),
> <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
> - some forms of <command>ALTER TABLE</command>.
> + <command>ALTER TABLE VALIDATE</command> and other
> + <command>ALTER TABLE</command> variants that set options.

ALTER TABLE's documentation covers the details, so the old text sufficed for
here. I find "variants that set options" too vague, considering the nuances
of the actual list of affected forms.

> </para>
> </listitem>
> </varlistentry>
> @@ -951,7 +952,7 @@ ERROR: could not serialize access due to read/write dependencies among transact
> </para>
>
> <para>
> - Acquired by the <command>ALTER TABLE</>, <command>DROP TABLE</>,
> + Acquired by the <command>ALTER TABLE</> for rewriting, <command>DROP TABLE</>,
> <command>TRUNCATE</command>, <command>REINDEX</command>,
> <command>CLUSTER</command>, and <command>VACUUM FULL</command>
> commands.

Forms that rewrite the table are just one class of examples. I would write
"Acquired by DROP TABLE, ..., VACUUM FULL, and some forms of ALTER TABLE."

> --- a/doc/src/sgml/ref/alter_table.sgml
> +++ b/doc/src/sgml/ref/alter_table.sgml

> @@ -420,6 +439,9 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
> index specification from the table. This affects
> future cluster operations that don't specify an index.
> </para>
> + <para>
> + Changing cluster options uses a <literal>SHARE UPDATE EXCLUSIVE</literal> lock.
> + </para>
> </listitem>
> </varlistentry>
>
> @@ -467,6 +489,10 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
> FULL</>, <xref linkend="SQL-CLUSTER"> or one of the forms
> of <command>ALTER TABLE</> that forces a table rewrite.
> </para>
> + <para>
> + Changing storage parameters requires only a
> + <literal>SHARE UPDATE EXCLUSIVE</literal> lock.
> + </para>

Some places say "requires only", while others say "uses". Please adopt one of
those consistently. I somewhat prefer the latter.

> @@ -1075,6 +1105,14 @@ ALTER TABLE distributors ADD CONSTRAINT distfk FOREIGN KEY (address) REFERENCES
> </para>
>
> <para>
> + To add a foreign key constraint to a table minimising impact on other work:

Our documentation has used the "minimize" spelling exclusively.

> --- a/src/backend/catalog/toasting.c
> +++ b/src/backend/catalog/toasting.c

> @@ -52,7 +53,7 @@ static bool needs_toast_table(Relation rel);
> * to end with CommandCounterIncrement if it makes any changes.
> */
> void
> -AlterTableCreateToastTable(Oid relOid, Datum reloptions)
> +AlterTableCreateToastTable(Oid relOid, Datum reloptions, LOCKMODE lockmode)
> {
> Relation rel;
>
> @@ -63,10 +64,10 @@ AlterTableCreateToastTable(Oid relOid, Datum reloptions)
> * concurrent readers of the pg_class tuple won't have visibility issues,
> * so let's be safe.
> */

The comment ending right here is falsified by the change.

> - rel = heap_open(relOid, AccessExclusiveLock);
> + rel = heap_open(relOid, lockmode);

We now request whatever lock our caller has already taken. If that is
AccessExclusiveLock, create_toast_table() could actually add a toast table.
Otherwise, it will either deduce that no change is required or raise an error.

> @@ -161,6 +164,14 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid, Datum reloptio
> return false;
>
> /*
> + * We need to create a toast table. We shouldn't
> + * have got this far if we're in ALTER TABLE unless
> + * we are holding AccessExclusiveLock.
> + */
> + if (lockmode < AccessExclusiveLock)
> + elog(ERROR, "AccessExclusiveLock required to add toast tables.");

Our model of lock levels has not regarded them as being ordered, so I
recommend "lockmode != AccessExclusiveLock". (I don't object to using
incidental order in AlterTableGetLockLevel(), where all the relevant levels
originate in the same function.)

> --- a/src/backend/commands/tablecmds.c
> +++ b/src/backend/commands/tablecmds.c
> @@ -2687,8 +2687,7 @@ AlterTableLookupRelation(AlterTableStmt *stmt, LOCKMODE lockmode)
> * The caller must lock the relation, with an appropriate lock level
> * for the subcommands requested. Any subcommand that needs to rewrite
> * tuples in the table forces the whole command to be executed with
> - * AccessExclusiveLock (actually, that is currently required always, but
> - * we hope to relax it at some point). We pass the lock level down
> + * AccessExclusiveLock. We pass the lock level down

The set of commands needing AccessExclusiveLock is much broader than those
rewriting the table. Consider replacing the quoted paragraph with something
like:

* The caller has locked the relation with "lockmode", which must be at least
* as strong as AlterTableGetLockLevel(stmt->cmds). We pass the lock level
* down so ...

> @@ -2750,35 +2772,29 @@ AlterTableInternal(Oid relid, List *cmds, bool recurse)
> * ALTER TABLE RENAME which is treated as a different statement type T_RenameStmt
> * and does not travel through this section of code and cannot be combined with
> * any of the subcommands given here.
> + *
> + * Note that Hot Standby only knows about AccessExclusiveLocks on the master
> + * so any changes that might affect SELECTs running on standbys need to use
> + * AccessExclusiveLocks even if you think a lesser lock would do, unless you
> + * have a solution for that also.

Out of curiosity, do SELECTs on hot standbys impose known challenges in this
area not shared with local SELECTs?

> + *
> + * Also note that pg_dump uses only an AccessShareLock, meaning that anything
> + * that takes a lock less than AccessExclusiveLock can change object definitions
> + * while pg_dump is running. Be careful to check that the appropriate data is
> + * derived by pg_dump using an MVCC snapshot, rather than syscache lookups,
> + * otherwise we might end up with an inconsistent dump that can't restore.
> + *
> + * Be careful to ensure this function is called for Tables and Indexes only.
> + * It is not currently safe to be called for Views because security_barrier
> + * is listed as an option and so would be allowed to be set at a level lower
> + * than AccessExclusiveLock, which would not be correct.

This statement is accepted and takes only ShareUpdateExclusiveLock:

alter table information_schema.triggers set (security_barrier = true);

I suggest adding a LOCKMODE field to relopt_gen and adding a
reloptions_locklevel() function to determine the required level from an
options list. That puts control of the lock level near the code that
understands the implications for each option. You can then revert the
addition of AlterViewInternal() and some of the utility.c changes.

If I had to pick a reloption most likely to benefit from AccessExclusiveLock,
it would be user_catalog_table rather than security_barrier. I wonder if
output plugins would want a clear moment in the WAL stream -- the commit of
the transaction that sets the reloption -- after which they can assume that
all future records pertaining to the table in question have the needed bits.
If so, that implies AccessExclusiveLock for this reloption, because
heap_page_prune_opt() issues affected WAL records under AccessShareLock.

AT_ReplaceRelOptions could require the highest lock level required by any
option. It usage seems to be limited to CREATE OR REPLACE VIEW, so I would
give it AccessExclusiveLock and be done.

> - * 2. Relcache needs to be internally consistent, so unless we lock the
> - * definition during reads we have no way to guarantee that.

I looked for hazards like this, but I found none in the ALTER forms covered by
this patch. None of them modify multiple catalog rows affecting the same
relcache entry. However, thinking about that did lead me to ponder another
class of hazards. When backends can use one or more relations concurrently
with a DDL operation affecting those relations, those backends can find
themselves running with a subset of the catalog changes made within a
particular DDL operation. Consider VALIDATE CONSTRAINT against an inherited
constraint of an inheritance parent. It validates child table constraints,
modifying one catalog row per table. At COMMIT time, we queue sinval messages
for all affected tables. We add to the queue in atomic groups of
WRITE_QUANTUM (64) messages. Between two such groups joining the queue,
another backend may process the first group of messages. If the original DDL
used AccessExclusiveLock, this is always harmless. The DDL-issuing backend
still holds its lock, which means the inval-accepting backend must not have
the relation open. If the inval-accepting backend later opens the affected
relation, it will first acquire some lock and process the rest of the
invalidations from the DDL operation. When doing DDL under a weaker lock, the
inval-accepting backend might apply half the invalidations and immediately use
them in the context of an open relation. For VALIDATE CONSTRAINT, this means
a backend might briefly recognize only a subset of the inheritance tree
becoming valid. (I did not actually build a test case to confirm this.)

Considering that constraint exclusion is the sole consumer of
convalidated/ccvalid that can run in parallel with VALIDATE CONSTRAINT, I
think this is harmless. I did not find problems of this nature in any ALTER
TABLE forms affected by the patch. Let's just keep it in mind during future
lock level changes.

> --- a/src/backend/utils/adt/ruleutils.c
> +++ b/src/backend/utils/adt/ruleutils.c

> static char *
> pg_get_constraintdef_worker(Oid constraintId, bool fullCommand,
> - int prettyFlags)
> + int prettyFlags, bool useSyscache)
> {
> HeapTuple tup;
> Form_pg_constraint conForm;
> StringInfoData buf;
> + SysScanDesc scandesc;
> + ScanKeyData scankey[1];
> + Relation relation;

gcc 4.2.4 gives me these warnings:

ruleutils.c: In function `pg_get_constraintdef_worker':
ruleutils.c:1318: warning: `relation' may be used uninitialized in this function
ruleutils.c:1316: warning: `scandesc' may be used uninitialized in this function

> +
> + if (useSyscache)
> + tup = SearchSysCache1(CONSTROID, ObjectIdGetDatum(constraintId));
> + else
> + {
> + Snapshot snapshot = RegisterSnapshot(GetTransactionSnapshot());
> + PushActiveSnapshot(snapshot);

There's no need to push the snapshot; registration is enough.

> +
> + relation = heap_open(ConstraintRelationId, AccessShareLock);
> +
> + ScanKeyInit(&scankey[0],
> + ObjectIdAttributeNumber,
> + BTEqualStrategyNumber, F_OIDEQ,
> + ObjectIdGetDatum(constraintId));
> +
> + scandesc = systable_beginscan(relation,
> + ConstraintOidIndexId,
> + true,
> + snapshot,
> + 1,
> + scankey);
> + tup = systable_getnext(scandesc);
> +
> + PopActiveSnapshot();
> + UnregisterSnapshot(snapshot);
> + }

Later code uses SysCacheGetAttr() regardless of how we acquired the tuple.
That works, but it deserves a brief comment mention.

pg_get_constraintdef_mvcc() still does syscache lookups by way of
decompile_column_index_array(), get_constraint_index(), and
deparse_expression_pretty(). It uses MVCC for things that matter for pg_dump
vs. reduced lock levels, but not comprehensively. I recommend not adding a
new function and instead changing pg_get_constraintdef() to use the
transaction snapshot unconditionally. It's hard to imagine an application
author making a reasoned choice between the two. Our in-tree callers, psql
and pg_dump, have no cause to prefer the less-MVCC behavior at any time.

> --- a/src/backend/utils/cache/relcache.c
> +++ b/src/backend/utils/cache/relcache.c
> @@ -161,6 +161,14 @@ static bool eoxact_list_overflowed = false;
> eoxact_list_overflowed = true; \
> } while (0)
>
> +/*
> + * EOXactTupleDescArray stores *TupleDescs that (might) need AtEOXact

Stray asterisk.

> @@ -2312,6 +2334,37 @@ RelationCloseSmgrByOid(Oid relationId)
> RelationCloseSmgr(relation);
> }
>
> +void
> +RememberToFreeTupleDescAtEOX(TupleDesc td)
> +{
> + if (EOXactTupleDescArray == NULL)
> + {
> + MemoryContext oldcxt;
> + oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
> +
> + EOXactTupleDescArray = (TupleDesc *) palloc(16 * sizeof(TupleDesc));
> + EOXactTupleDescArrayLen = 16;
> + NextEOXactTupleDescNum = 0;
> + MemoryContextSwitchTo(oldcxt);
> + }
> + else if (NextEOXactTupleDescNum >= EOXactTupleDescArrayLen)
> + {
> + MemoryContext oldcxt;
> + int32 newlen = EOXactTupleDescArrayLen * 2;
> +
> + oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
> +
> + Assert(EOXactTupleDescArrayLen > 0);
> +
> + EOXactTupleDescArray = (TupleDesc *) repalloc(EOXactTupleDescArray,
> + newlen * sizeof(TupleDesc));
> + EOXactTupleDescArrayLen = newlen;
> + MemoryContextSwitchTo(oldcxt);

No need to switch contexts for repalloc().

> --- /dev/null
> +++ b/src/test/isolation/specs/alter-table-1.spec
> @@ -0,0 +1,34 @@
> +# ALTER TABLE - Add and Validate constraint with concurrent writes
> +#
> +# ADD FOREIGN KEY allows a minimum of ShareRowExclusiveLock,
> +# so we mix writes with it to see what works or waits.
> +# VALIDATE allows a minimum of ShareUpdateExclusiveLock
> +# so we mix reads with it to see what works or waits

This comment is obsolete regarding the ADD FOREIGN KEY lock type.

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-21 16:11:12
Message-ID: CA+U5nMJQ6Af=iRjbk=_wUMKjR=6OQLj+fDNrccM42L=QCQDAaw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 21 March 2014 03:45, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Sat, Mar 08, 2014 at 11:14:30AM +0000, Simon Riggs wrote:
>> On 7 March 2014 09:04, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> > The right thing to do here is to not push to the extremes. If we mess
>> > too much with the ruleutil stuff it will just be buggy. A more
>> > considered analysis in a later release is required for a full and
>> > complete approach. As I indicated earlier, an 80/20 solution is better
>> > for this release.
>> >
>> > Slimming down the patch, I've removed changes to lock levels for
>> > almost all variants. The only lock levels now reduced are those for
>> > VALIDATE, plus setting of relation and attribute level options.
>
> Good call.

Thanks for the review. I'll respond to each point on a later email but
looks nothing much major, apart from the point raised on separate
thread.

>> + * Be careful to ensure this function is called for Tables and Indexes only.
>> + * It is not currently safe to be called for Views because security_barrier
>> + * is listed as an option and so would be allowed to be set at a level lower
>> + * than AccessExclusiveLock, which would not be correct.
>
> This statement is accepted and takes only ShareUpdateExclusiveLock:
>
> alter table information_schema.triggers set (security_barrier = true);

I find it hard to justify why we accept such a statement. Surely its a
bug when the named table turns out to be a view? Presumably ALTER
SEQUENCE and ALTER <other stuff> has checks for the correct object
type? OMG.

> I suggest adding a LOCKMODE field to relopt_gen and adding a
> reloptions_locklevel() function to determine the required level from an
> options list. That puts control of the lock level near the code that
> understands the implications for each option. You can then revert the
> addition of AlterViewInternal() and some of the utility.c changes.

Sure, that's how we code it, but I'm not sure we should introduce that
feature. The above weirdness is not itself justification.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Noah Misch <noah(at)leadboat(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-21 17:49:42
Message-ID: 20140321174942.GA3969106@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 21, 2014 at 04:11:12PM +0000, Simon Riggs wrote:
> On 21 March 2014 03:45, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > On Sat, Mar 08, 2014 at 11:14:30AM +0000, Simon Riggs wrote:

> Thanks for the review. I'll respond to each point on a later email but
> looks nothing much major, apart from the point raised on separate
> thread.

Yep.

> >> + * Be careful to ensure this function is called for Tables and Indexes only.
> >> + * It is not currently safe to be called for Views because security_barrier
> >> + * is listed as an option and so would be allowed to be set at a level lower
> >> + * than AccessExclusiveLock, which would not be correct.
> >
> > This statement is accepted and takes only ShareUpdateExclusiveLock:
> >
> > alter table information_schema.triggers set (security_barrier = true);
>
> I find it hard to justify why we accept such a statement. Surely its a
> bug when the named table turns out to be a view? Presumably ALTER
> SEQUENCE and ALTER <other stuff> has checks for the correct object
> type? OMG.

We've framed ALTER TABLE's relkind leniency as a historic artifact. As a move
toward stricter checks, ALTER TABLE refused to operate on foreign tables in
9.1 and 9.2. 9.3 reversed that course, though. For better or worse, ALTER
TABLE is nearly a union of the relation ALTER possibilities. That choice is
well-entrenched.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-21 18:53:27
Message-ID: CA+U5nMJ0i5qU6ntEEuhm=jDJWPXHcYRRDR1CPybGuZL7x48H8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 21 March 2014 17:49, Noah Misch <noah(at)leadboat(dot)com> wrote:

>> >> + * Be careful to ensure this function is called for Tables and Indexes only.
>> >> + * It is not currently safe to be called for Views because security_barrier
>> >> + * is listed as an option and so would be allowed to be set at a level lower
>> >> + * than AccessExclusiveLock, which would not be correct.
>> >
>> > This statement is accepted and takes only ShareUpdateExclusiveLock:
>> >
>> > alter table information_schema.triggers set (security_barrier = true);
>>
>> I find it hard to justify why we accept such a statement. Surely its a
>> bug when the named table turns out to be a view? Presumably ALTER
>> SEQUENCE and ALTER <other stuff> has checks for the correct object
>> type? OMG.
>
> We've framed ALTER TABLE's relkind leniency as a historic artifact. As a move
> toward stricter checks, ALTER TABLE refused to operate on foreign tables in
> 9.1 and 9.2. 9.3 reversed that course, though. For better or worse, ALTER
> TABLE is nearly a union of the relation ALTER possibilities. That choice is
> well-entrenched.

By "well entrenched", I think you mean undocumented, untested, unintentional?

Do we think anyone *relies* on being able to say the word TABLE when
in fact they mean VIEW or SEQUENCE?

How is that artefact anything but a bug? i.e. is anyone going to stop
me fixing it?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-21 19:11:14
Message-ID: CA+U5nML+Ke5BzdQc40qvSiROa4bXcwJQzwDtPTK2OCAitK5OCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 21 March 2014 03:45, Noah Misch <noah(at)leadboat(dot)com> wrote:

>> + * Note that Hot Standby only knows about AccessExclusiveLocks on the master
>> + * so any changes that might affect SELECTs running on standbys need to use
>> + * AccessExclusiveLocks even if you think a lesser lock would do, unless you
>> + * have a solution for that also.
>
> Out of curiosity, do SELECTs on hot standbys impose known challenges in this
> area not shared with local SELECTs?

No, but locks less than AccessExclusiveLock won't happen at all, so
its a difference that if improperly handled could cause a bug.

Plus I wanted to indicate I'd thought about it.

>> - * 2. Relcache needs to be internally consistent, so unless we lock the
>> - * definition during reads we have no way to guarantee that.
>
> I looked for hazards like this, but I found none in the ALTER forms covered by
> this patch. None of them modify multiple catalog rows affecting the same
> relcache entry. However, thinking about that did lead me to ponder another
> class of hazards. When backends can use one or more relations concurrently
> with a DDL operation affecting those relations, those backends can find
> themselves running with a subset of the catalog changes made within a
> particular DDL operation. Consider VALIDATE CONSTRAINT against an inherited
> constraint of an inheritance parent. It validates child table constraints,
> modifying one catalog row per table. At COMMIT time, we queue sinval messages
> for all affected tables. We add to the queue in atomic groups of
> WRITE_QUANTUM (64) messages. Between two such groups joining the queue,
> another backend may process the first group of messages. If the original DDL
> used AccessExclusiveLock, this is always harmless. The DDL-issuing backend
> still holds its lock, which means the inval-accepting backend must not have
> the relation open. If the inval-accepting backend later opens the affected
> relation, it will first acquire some lock and process the rest of the
> invalidations from the DDL operation. When doing DDL under a weaker lock, the
> inval-accepting backend might apply half the invalidations and immediately use
> them in the context of an open relation. For VALIDATE CONSTRAINT, this means
> a backend might briefly recognize only a subset of the inheritance tree
> becoming valid. (I did not actually build a test case to confirm this.)
>
> Considering that constraint exclusion is the sole consumer of
> convalidated/ccvalid that can run in parallel with VALIDATE CONSTRAINT, I
> think this is harmless. I did not find problems of this nature in any ALTER
> TABLE forms affected by the patch. Let's just keep it in mind during future
> lock level changes.

I'll document

> pg_get_constraintdef_mvcc() still does syscache lookups by way of
> decompile_column_index_array(), get_constraint_index(), and
> deparse_expression_pretty(). It uses MVCC for things that matter for pg_dump
> vs. reduced lock levels, but not comprehensively. I recommend not adding a
> new function and instead changing pg_get_constraintdef() to use the
> transaction snapshot unconditionally.

OK

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Noah Misch <noah(at)leadboat(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-21 20:58:28
Message-ID: 20140321205828.GB3969106@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 21, 2014 at 06:53:27PM +0000, Simon Riggs wrote:
> On 21 March 2014 17:49, Noah Misch <noah(at)leadboat(dot)com> wrote:
>
> >> > alter table information_schema.triggers set (security_barrier = true);
> >>
> >> I find it hard to justify why we accept such a statement. Surely its a
> >> bug when the named table turns out to be a view? Presumably ALTER
> >> SEQUENCE and ALTER <other stuff> has checks for the correct object
> >> type? OMG.
> >
> > We've framed ALTER TABLE's relkind leniency as a historic artifact. As a move
> > toward stricter checks, ALTER TABLE refused to operate on foreign tables in
> > 9.1 and 9.2. 9.3 reversed that course, though. For better or worse, ALTER
> > TABLE is nearly a union of the relation ALTER possibilities. That choice is
> > well-entrenched.
>
> By "well entrenched", I think you mean undocumented, untested, unintentional?

It's deliberate; a -hackers discussion revisits it perhaps once a year. The
ALTER VIEW documentation says:

For historical reasons, ALTER TABLE can be used with views too; but the only
variants of ALTER TABLE that are allowed with views are equivalent to the
ones shown above.

ALTER INDEX and ALTER SEQUENCE say something similar.

> Do we think anyone *relies* on being able to say the word TABLE when
> in fact they mean VIEW or SEQUENCE?

pg_dump emits statements that exercise it:

psql -c 'create view v as select 1 as c; alter view v alter c set default 0;'
pg_dump --table v | grep ALTER

> How is that artefact anything but a bug? i.e. is anyone going to stop
> me fixing it?

It's not the behavior I would choose for a new product, but I can't see
benefits sufficient to overturn previous decisions to keep it.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-21 21:23:30
Message-ID: CA+U5nMJr6vnF-+JweWmvr57qPeBYqTP6XOW4Xg0LYmP=4FyOrg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 21 March 2014 20:58, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Fri, Mar 21, 2014 at 06:53:27PM +0000, Simon Riggs wrote:
>> On 21 March 2014 17:49, Noah Misch <noah(at)leadboat(dot)com> wrote:
>>
>> >> > alter table information_schema.triggers set (security_barrier = true);
>> >>
>> >> I find it hard to justify why we accept such a statement. Surely its a
>> >> bug when the named table turns out to be a view? Presumably ALTER
>> >> SEQUENCE and ALTER <other stuff> has checks for the correct object
>> >> type? OMG.
>> >
>> > We've framed ALTER TABLE's relkind leniency as a historic artifact. As a move
>> > toward stricter checks, ALTER TABLE refused to operate on foreign tables in
>> > 9.1 and 9.2. 9.3 reversed that course, though. For better or worse, ALTER
>> > TABLE is nearly a union of the relation ALTER possibilities. That choice is
>> > well-entrenched.
>>
>> By "well entrenched", I think you mean undocumented, untested, unintentional?
>
> It's deliberate; a -hackers discussion revisits it perhaps once a year. The
> ALTER VIEW documentation says:
>
> For historical reasons, ALTER TABLE can be used with views too; but the only
> variants of ALTER TABLE that are allowed with views are equivalent to the
> ones shown above.
>
> ALTER INDEX and ALTER SEQUENCE say something similar.
>
>> Do we think anyone *relies* on being able to say the word TABLE when
>> in fact they mean VIEW or SEQUENCE?
>
> pg_dump emits statements that exercise it:
>
> psql -c 'create view v as select 1 as c; alter view v alter c set default 0;'
> pg_dump --table v | grep ALTER
>
>> How is that artefact anything but a bug? i.e. is anyone going to stop
>> me fixing it?
>
> It's not the behavior I would choose for a new product, but I can't see
> benefits sufficient to overturn previous decisions to keep it.

Speechless

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndQuadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-21 23:36:28
Message-ID: 29240.1395444988@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On 21 March 2014 20:58, Noah Misch <noah(at)leadboat(dot)com> wrote:
>> It's not the behavior I would choose for a new product, but I can't see
>> benefits sufficient to overturn previous decisions to keep it.

> Speechless

The key argument for not "fixing" this is that it would break existing
pg_dump files. That's a pretty hard argument to overcome, unfortunately,
even if you're willing to blow off the possibility that client
applications might contain similar shortcuts. We still do our best to
read dump files from the 7.0 era (see ConvertTriggerToFK() for one example
of going above and beyond for that); and every so often we do hear of
people trying to get data out of such ancient servers. So even if you
went and fixed pg_dump tomorrow, it'd probably be ten or fifteen years
before people would let you stop reading dumps from existing versions.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-22 07:31:00
Message-ID: CA+U5nMKihfhazYtQFmYSw1a4DHVD5dv91dMh4HPnZ=FZOTGEfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 21 March 2014 23:36, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> On 21 March 2014 20:58, Noah Misch <noah(at)leadboat(dot)com> wrote:
>>> It's not the behavior I would choose for a new product, but I can't see
>>> benefits sufficient to overturn previous decisions to keep it.
>
>> Speechless
>
> The key argument for not "fixing" this is that it would break existing
> pg_dump files. That's a pretty hard argument to overcome, unfortunately,
> even if you're willing to blow off the possibility that client
> applications might contain similar shortcuts. We still do our best to
> read dump files from the 7.0 era (see ConvertTriggerToFK() for one example
> of going above and beyond for that); and every so often we do hear of
> people trying to get data out of such ancient servers. So even if you
> went and fixed pg_dump tomorrow, it'd probably be ten or fifteen years
> before people would let you stop reading dumps from existing versions.

Noah had already convinced me, but thank you for explaining the issue
behind that.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Date: 2014-03-22 07:36:02
Message-ID: CA+U5nM+ToGNx9S0uR+iUuScXvCOYKd+r9r+8tvKfv870xgYdWw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 21 March 2014 16:11, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

>>> + * Be careful to ensure this function is called for Tables and Indexes only.
>>> + * It is not currently safe to be called for Views because security_barrier
>>> + * is listed as an option and so would be allowed to be set at a level lower
>>> + * than AccessExclusiveLock, which would not be correct.
>>
>> This statement is accepted and takes only ShareUpdateExclusiveLock:
>>
>> alter table information_schema.triggers set (security_barrier = true);

>> I suggest adding a LOCKMODE field to relopt_gen and adding a
>> reloptions_locklevel() function to determine the required level from an
>> options list. That puts control of the lock level near the code that
>> understands the implications for each option. You can then revert the
>> addition of AlterViewInternal() and some of the utility.c changes.
>
> Sure, that's how we code it, but I'm not sure we should introduce that
> feature. The above weirdness is not itself justification.

OK, will follow this path. It's a good idea since it encourages
authors of new "options" to consider properly the lock level that will
be required.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Jim Nasby <jim(at)nasby(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE lock strength reduction patch is unsafe
Date: 2014-03-22 16:37:27
Message-ID: 532DBC47.20603@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2/26/14, 9:15 AM, Simon Riggs wrote:
> On 26 February 2014 13:38, Andres Freund<andres(at)2ndquadrant(dot)com> wrote:
>> >Hi,
>> >
>> >On 2014-02-26 07:32:45 +0000, Simon Riggs wrote:
>>>> >> >* This definitely should include isolationtester tests actually
>>>> >> > performing concurrent ALTER TABLEs. All that's currently there is
>>>> >> > tests that the locklevel isn't too high, but not that it actually works.
>>> >>
>>> >>There is no concurrent behaviour here, hence no code that would be
>>> >>exercised by concurrent tests.
>> >
>> >Huh? There's most definitely new concurrent behaviour. Previously no
>> >other backends could have a relation open (and locked) while it got
>> >altered (which then sends out relcache invalidations). That's something
>> >that should be tested.
> It has been. High volume concurrent testing has been performed, per
> Tom's original discussion upthread, but that's not part of the test
> suite.
> For other tests I have no guide as to how to write a set of automated
> regression tests. Anything could cause a failure, so I'd need to write
> an infinite set of tests to prove there is no bug*somewhere*. How
> many tests are required? 0, 1, 3, 30?

ISTM that we don't want hand-written tests here, but rather generated tests that actually hit all potential cases. Obviously we'd never run that as part of normal reqression, but farm animals certainly could.
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net