Re: Concurrent ALTER SEQUENCE RESTART Regression

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Jason Petersen <jason(at)citusdata(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Concurrent ALTER SEQUENCE RESTART Regression
Date: 2017-04-27 05:52:11
Message-ID: 20170427055211.jc5tblpguawbeqjq@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On 2017-04-26 22:07:03 -0400, Peter Eisentraut wrote:
> On 4/26/17 21:12, Andres Freund wrote:
> > I think it's unacceptable to regress with an error message here. I've
> > seen sequence DDL being used while concurrent DML was onging in a number
> > of production use cases, and just starting to error out instead of
> > properly blocking doesn't seem acceptable to me.
>
> It's not clear to me what the use case is here that we are optimizing
> for. The best solution would depend on that. Running concurrent ALTER
> SEQUENCE in a tight loop is probably not it. ;-)

I don't think the use-case actually matters all that much. It's bad
enough that you can get "concurrent update" errors for some forms of DDL
already, and there's plenty enough complaints about it. Adding new
forms of that is completely unacceptable. Just properly locking the
object while doing DDL - which'll be a behavioural change too - seems
like the minimal thing to do.

As far as I can see the issue here is that the locking in
AlterSequence() is plainly broken:

ObjectAddress
AlterSequence(ParseState *pstate, AlterSeqStmt *stmt)
{
...
relid = RangeVarGetRelid(stmt->sequence, AccessShareLock, stmt->missing_ok);
...
/* lock page' buffer and read tuple into new sequence structure */
seqdata = read_seq_tuple(seqrel, &buf, &seqdatatuple);
...
/* Now okay to update the on-disk tuple */
START_CRIT_SECTION();
..
XLogRegisterBuffer(0, buf, REGBUF_WILL_INIT);
...
recptr = XLogInsert(RM_SEQ_ID, XLOG_SEQ_LOG);
...
END_CRIT_SECTION();
...
UnlockReleaseBuffer(buf);
...
CatalogTupleUpdate(rel, &tuple->t_self, tuple);
..

In contrast to <v10, the actual change of the tuple is *not* happening
with the page lock held. But now we do log XLOG_SEQ_LOG, then unlock
the buffer, and then do a CatalogTupleUpdate(). How is that correct?
And if so, why isn't there a honking large comment explaining why it is?

Imagine two of these running concurrently - you might end up with
A:XLogInsert B:XLogInsert B:CatalogTupleUpdate A:CatalogTupleUpdate

that can't be right, no?

- Andres

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2017-04-27 05:58:06 Re: Concurrent ALTER SEQUENCE RESTART Regression
Previous Message Peter Eisentraut 2017-04-27 02:07:03 Re: Concurrent ALTER SEQUENCE RESTART Regression

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-04-27 05:58:06 Re: Concurrent ALTER SEQUENCE RESTART Regression
Previous Message Andrew Borodin 2017-04-27 05:21:44 Re: PG 10 release notes