Re: Triggered assertion "!(tp.t_data->t_infomask & HEAP_XMAX_INVALID)" in heap_delete() on HEAD [PATCH]

Lists: pgsql-hackers
From: Florian Pflug <fgp(at)phlo(dot)org>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Triggered assertion "!(tp.t_data->t_infomask & HEAP_XMAX_INVALID)" in heap_delete() on HEAD [PATCH]
Date: 2010-12-14 19:29:09
Message-ID: 483B0E69-1218-4AF7-AA3B-8D6A81E91C23@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi

In the process of re-verifying my serializable lock consistency patch, I ran
the fk_concurrency testsuite against *unpatched* HEAD for comparison.

My build of HEAD had asserts enabled, and I promptly triggered
Assert(!(tp.t_data->t_infomask & HEAP_XMAX_INVALID))
in heap_delete().

The seems wrong, if result was set to HeapTupleUpdated because the tuple was invisible
to the crosscheck snapshot, its xmax may very well be invalid.

Simply removing the assert isn't an option, because right after the assert the tuple's
xmax is copied into update_xmax. Thus the attached patch takes care to set update_xmax
to InvalidTransactionId explicitly in case the update is prevented by the crosscheck snapshot.

heap_update() suffers from the same problem and is treated similarly by the patch.

Note that this patch conflicts with the serializable_lock_consistency patch, since it
changes that assert too, but in a different way.

best regards,
Florian Pflug

Attachment Content-Type Size
fix_assert_xmaxinvalid.v1.patch application/octet-stream 2.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Triggered assertion "!(tp.t_data->t_infomask & HEAP_XMAX_INVALID)" in heap_delete() on HEAD [PATCH]
Date: 2010-12-14 20:18:40
Message-ID: 18238.1292357920@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Florian Pflug <fgp(at)phlo(dot)org> writes:
> In the process of re-verifying my serializable lock consistency patch, I ran
> the fk_concurrency testsuite against *unpatched* HEAD for comparison.

> My build of HEAD had asserts enabled, and I promptly triggered
> Assert(!(tp.t_data->t_infomask & HEAP_XMAX_INVALID))
> in heap_delete().

> The seems wrong, if result was set to HeapTupleUpdated because the tuple was invisible
> to the crosscheck snapshot, its xmax may very well be invalid.

This patch seems certainly wrong. Please provide an actual test case
rather than just asserting we should change this.

regards, tom lane


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Triggered assertion "!(tp.t_data->t_infomask & HEAP_XMAX_INVALID)" in heap_delete() on HEAD [PATCH]
Date: 2010-12-14 20:40:50
Message-ID: 99A6ACCB-A4A3-42EF-9D10-0F5B402A631A@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Dec14, 2010, at 21:18 , Tom Lane wrote:
> Florian Pflug <fgp(at)phlo(dot)org> writes:
>> In the process of re-verifying my serializable lock consistency patch, I ran
>> the fk_concurrency testsuite against *unpatched* HEAD for comparison.
>
>> My build of HEAD had asserts enabled, and I promptly triggered
>> Assert(!(tp.t_data->t_infomask & HEAP_XMAX_INVALID))
>> in heap_delete().
>
>> The seems wrong, if result was set to HeapTupleUpdated because the tuple was invisible
>> to the crosscheck snapshot, its xmax may very well be invalid.
>
> This patch seems certainly wrong. Please provide an actual test case
> rather than just asserting we should change this.

Running my FK concurrency test suite against HEAD as of today with 100 transaction / client triggers this within a few seconds or so. The test suite can be found at https://github.com/fgp/fk_concurrency.

./fk_concurrency.sh <tx/client> native <path to pg> <host or patch to socket>

Could you explain what seems to be wrong with my patch? If you believe that it's impossible for a tuple to be visible under the query's snapshot but invisible to the crosscheck snapshot, unless it was deleted, that's *not* the case! For RI checks in serializable transactions, the *crosscheck* snapshot is the serializable snapshot, while the query's snapshot is obtained with GetLatetsSnapshot(). This is the relevant snippet from ri_trigger.c, ri_PerformCheck():

if (IsolationUsesXactSnapshot() && detectNewRows)
{
CommandCounterIncrement(); /* be sure all my own work is visible */
test_snapshot = GetLatestSnapshot();
crosscheck_snapshot = GetTransactionSnapshot();
}
else
{
/* the default SPI behavior is okay */
test_snapshot = InvalidSnapshot;
crosscheck_snapshot = InvalidSnapshot;
}

best regards,
Florian Pflug


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Triggered assertion "!(tp.t_data->t_infomask & HEAP_XMAX_INVALID)" in heap_delete() on HEAD [PATCH]
Date: 2010-12-14 20:52:30
Message-ID: 18870.1292359950@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Florian Pflug <fgp(at)phlo(dot)org> writes:
> Could you explain what seems to be wrong with my patch?

I'm unconvinced that this is the proper response to whatever the problem
is; and if it is the right response, it seems to still need a good bit
more work. You didn't even update the functions' header comments, let
alone look at their callers to see how they'd be affected by an
InvalidTransactionId result.

regards, tom lane


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Triggered assertion "!(tp.t_data->t_infomask & HEAP_XMAX_INVALID)" in heap_delete() on HEAD [PATCH]
Date: 2010-12-14 21:34:53
Message-ID: 27514990-DCB6-4137-B843-371760F476F2@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Dec14, 2010, at 21:52 , Tom Lane wrote:
> Florian Pflug <fgp(at)phlo(dot)org> writes:
>> Could you explain what seems to be wrong with my patch?
>
> I'm unconvinced that this is the proper response to whatever the problem
> is;
Well, you didn't comment on the part of my previous e-mail that *did*
explain why I believe this is the proper response...

> and if it is the right response, it seems to still need a good bit
> more work. You didn't even update the functions' header comments, let
> alone look at their callers to see how they'd be affected by an
> InvalidTransactionId result.

Well, I hit this while re-verifying the serializable lock consistency stuff
with a current HEAD, so I didn't really want to spend more time on this than
necessary. Especially since that patch replaces the assert in question anyway.

So I moved the assert to a safe place and HeapTupleHeaderGetXmax() with
it, since if the assert fails HeapTupleHeaderGetXmax() will return garbage
anyway (read: a multi-xid instead of an xid in some cases!).

For non-assert-enabled builds, the only effect of the patch is thus to
consistently return InvalidTransactionId if the crosscheck snapshot turns
HeapTupleMayBeUpdated into HeapTupleUpdated. Which certainly seems to be
an improvement over sometimes returning InvalidTransactionId, sometimes
a locker's xid, and sometime's a multi-xid.

best regards,
Florian Pflug


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Triggered assertion "!(tp.t_data->t_infomask & HEAP_XMAX_INVALID)" in heap_delete() on HEAD [PATCH]
Date: 2010-12-15 01:25:41
Message-ID: F39699C0-47DC-4E81-A059-E8990E4F37C0@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Dec14, 2010, at 22:34 , Florian Pflug wrote:
> For non-assert-enabled builds, the only effect of the patch is thus to
> consistently return InvalidTransactionId if the crosscheck snapshot turns
> HeapTupleMayBeUpdated into HeapTupleUpdated. Which certainly seems to be
> an improvement over sometimes returning InvalidTransactionId, sometimes
> a locker's xid, and sometime's a multi-xid.

I've updated the patch to explain that in the comments above heap_update()
and heap_delete(). I've taken a brief look at the callers of these functions,
but fail to see how to improve things there. Things work currently because
crosschecking is only used in serializable mode, while the tuple's xmax is
only required in read committed mode to double-check the tuple chain when
following the ctid pointers. But the API doesn't enforce that at all :-(

I'm not willing to clean that mess up, since the serializable lock consistency
patch changes all these areas, *and* makes the cleanup easier by getting rid
of the crosscheck snapshot entirely.

I still believe that applying this patch now is worth it, for the reasons
already explained, but in the end that's obviously something for a committer
to decide.

Patch with updated comments attached.

best regards,
Florian Pflug

Attachment Content-Type Size
fix_assert_xmaxinvalid.v2.patch application/octet-stream 5.3 KB