From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: First-draft release notes for next week's releases
Date: 2014-03-17 18:27:15
Message-ID: 20140317182715.GN16438@awork2.anarazel.de
Lists: pgsql-hackers
On 2014-03-17 14:16:41 -0400, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2014-03-17 14:01:03 -0400, Tom Lane wrote:
> >> IIUC, this case only occurs when using the new-in-9.3 types of
> >> nonexclusive row locks. I'm willing to bet that the number of
> >> applications using those is negligible; so I think it's all right to not
> >> mention that case explicitly, as long as the wording doesn't say that
> >> foreign keys are the *only* cause (which I didn't).
>
> > I actually think the issue could also occur with row locks of other
> > severities (is that the correct term?).
>
> The commit log entry says
>
> We were resetting the tuple's HEAP_HOT_UPDATED flag as well as t_ctid on
> WAL replay of a tuple-lock operation, which is incorrect when the tuple
> is already updated.
>
> Back-patch to 9.3. The clearing of both header elements was there
> previously, but since no update could be present on a tuple that was
> being locked, it was harmless.
>
> which I read to mean that the case can't occur with the types of row
> locks that were allowed pre-9.3.
That's not an unreasonable interpretation of the commit message, but I
don't think it's correct with respect to the code :(
> > but if I see correctly it's also triggerable if a backend waits for an
> > updating transaction to finish and follow_updates = true is passed to
> > heap_lock_tuple(). Which e.g. nodeLockRows.c does...
>
> That sounds backwards. nodeLockRows locks the latest tuple in the chain,
> so it can't be subject to this.
Hm, I don't see anything in the code preventing it; that's the
heap_lock_tuple() call before the EPQ stuff, in ExecLockRows():
    foreach(lc, node->lr_arowMarks)
    {
        ...
        test = heap_lock_tuple(erm->relation, &tuple,
                               estate->es_output_cid,
                               lockmode, erm->noWait, true,
                               &buffer, &hufd);
        ReleaseBuffer(buffer);
        switch (test)
        {
            case HeapTupleSelfUpdated:
        ...
The "true" passed to heap_lock_tuple() there is the follow_updates
parameter. And then in heap_lock_tuple():
    if (require_sleep)
    {
        if (infomask & HEAP_XMAX_IS_MULTI)
        {
            ...
            /* if there are updates, follow the update chain */
            if (follow_updates &&
                !HEAP_XMAX_IS_LOCKED_ONLY(infomask))
            {
                HTSU_Result res;

                res = heap_lock_updated_tuple(relation, tuple, &t_ctid,
                                              GetCurrentTransactionId(),
        ...
        else
        {
            /* wait for regular transaction to end */
            if (nowait)
            {
                if (!ConditionalXactLockTableWait(xwait))
                    ereport(ERROR,
                            (errcode(ERRCODE_LOCK_NOT_AVAILABLE),
                             errmsg("could not obtain lock on row in relation \"%s\"",
                                    RelationGetRelationName(relation))));
            }
            else
                XactLockTableWait(xwait);

            /* if there are updates, follow the update chain */
            if (follow_updates &&
                !HEAP_XMAX_IS_LOCKED_ONLY(infomask))
        ...
    if (RelationNeedsWAL(relation))
    {
        xl_heap_lock xlrec;
        XLogRecPtr   recptr;
        XLogRecData  rdata[2];

        xlrec.target.node = relation->rd_node;
        xlrec.target.tid = tuple->t_self;
        ...
To me that looks sufficient to trigger the bug, because we're issuing a
WAL record about the row that was passed to heap_lock_tuple(), not the
latest one in the ctid chain. When replaying that record, recovery will
reset the t_ctid field, thus breaking the chain.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services