Re: serializable lock consistency

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: serializable lock consistency
Date: 2010-12-19 18:57:25
Message-ID: AA7DB036-4DDA-42C6-B5A9-DB6AD51B9EBC@phlo.org
Lists: pgsql-hackers

On Dec19, 2010, at 17:01 , Robert Haas wrote:
> On Sun, Dec 19, 2010 at 9:12 AM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>>> On Sun, Dec 19, 2010 at 4:02 AM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>>>> Note that it's sufficient to check if B can see the effects of the
>>>> *latest* locker of T. If it can see those, it must also see the
>>>> effects of any previous locker. But because of this, B cannot
>>>> distinguish different lock strengths on T - even if A locked T
>>>> in exclusive mode, some transaction A2 may lock T in shared mode
>>>> after A has committed but before B inspects T.
>>>
>>> This seems to point to a rather serious problem, though. If B sees
>>> that the last locker A1 aborted, correctness requires that it roll
>>> back B, because there might have been some other transaction A2 which
>>> locked T and committed before A1 touched it. Implementing that
>>> behavior could lead to a lot of spurious rollbacks, but NOT
>>> implementing it isn't fully correct.
>>
>> Certainly seems serious. How on earth did I manage to miss that one, I
>> wonder :-(
>>
>> If only shared locks are involved, the patch probably works correctly,
>> though, since they don't remove all traces of previous lockers, they
>> merely add themselves to the multi-xid holding the lock. That'd also
>> explain why my concurrency test-suite didn't trip over this. So we may
>> only need to do what you propose for exclusive locks.
>
> I'd be willing to bet, without checking, that if the previous shared
> locker is no longer running by the time the next one comes along, we
> don't create a multixact in the first place.

And you'd have won that bet. I just checked, and this is exactly what
we do.

> And even if it does,
> what about this: exclusive lock, commit, shared lock, abort.

I figured we could solve that by adding the exclusive locker's xid
to the multi-xid if it was >= GlobalXmin.

But...

Playing games with multi-xid lifetimes doesn't help in the case where
T1 locks, T1 commits, T2 updates, T2 aborts, all after T0
took its snapshot but before T0 attempts to delete. :-(

Actually, wait a second...

What we're interested in here is the last non-aborted xmax of the tuple.
If that is invisible to some serializable transaction which tries
to modify the tuple, we raise a serialization error.

In the case of updates, deletes and exclusive locks, we always wait
for the xid(s) (more than one if share-locked) in xmax to either commit or
abort, and thus know their fate. We can therefore track the latest
non-aborted xmax in a single additional field "xlast" if we simply
save the old xmax there before overwriting it *if* that xmax did
actually commit. Otherwise we leave xlast as it was.
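
To make the rule above concrete, here is a minimal sketch in Python
rather than PostgreSQL's C. Everything in it is hypothetical: the
TupleHeader class, the overwrite_xmax helper and the status_of mapping
are illustration only and do not correspond to existing backend code.
It assumes the caller has already waited for the old xmax, so its fate
is known.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class XactStatus(Enum):
    COMMITTED = "committed"
    ABORTED = "aborted"


@dataclass
class TupleHeader:
    xmax: Optional[int] = None   # xid of the current locker/updater, if any
    xlast: Optional[int] = None  # proposed field: latest non-aborted xmax


def overwrite_xmax(tup: TupleHeader, new_xid: int,
                   status_of: Dict[int, XactStatus]) -> None:
    """Install new_xid as xmax; keep the old xmax in xlast only if it committed."""
    old = tup.xmax
    if old is not None and status_of[old] is XactStatus.COMMITTED:
        tup.xlast = old          # old xmax committed: remember it
    # If old xmax aborted (or was unset), xlast is left as it was.
    tup.xmax = new_xid


# Example: xid 10 locks and commits, xid 20 updates and aborts, xid 30 locks.
status: Dict[int, XactStatus] = {}
tup = TupleHeader()
overwrite_xmax(tup, 10, status)
status[10] = XactStatus.COMMITTED
overwrite_xmax(tup, 20, status)
status[20] = XactStatus.ABORTED
overwrite_xmax(tup, 30, status)
```

After the last call, xlast still names the committed xid 10 even though
the aborted xid 20 was the most recent xmax.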

The same works for share locks. If the tuple was previously locked
exclusively, updated or deleted, we proceed as above. If the tuple
was previously share-locked, some other lockers may still be in
progress. If at least one of them has committed, we again store its
xid in xlast. Otherwise, we again leave xlast as it was.
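
A rough sketch of the share-lock case, again in Python with
hypothetical names (xlast_after_share_lock, plain strings for the
member states). Which committed member to store when several have
committed is not spelled out above; taking the last one in list order
is purely an assumption of this sketch.

```python
from typing import Dict, List, Optional


def xlast_after_share_lock(xlast: Optional[int],
                           members: List[int],
                           status_of: Dict[int, str]) -> Optional[int]:
    """Return the new xlast when adding a share lock to a share-locked tuple.

    members are the xids in the existing multi-xid; each maps to
    "committed", "aborted" or "in_progress" in status_of.
    """
    committed = [xid for xid in members if status_of.get(xid) == "committed"]
    if committed:
        # Assumption: list order stands in for "most recent locker".
        return committed[-1]
    # No previous share locker has committed: leave xlast as it was.
    return xlast
```

Since the result is always a single member xid or the old xlast, it is
never a multi-xid.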

Note that xlast is never a multi-xid!

When a serializable transaction updates, deletes or exclusively locks
a tuple, we check if either xmax *or* xlast is invisible. If so,
we raise a serialization error.
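
The check itself might look roughly like this. It is a Python sketch
under stated assumptions, not backend code: needs_serialization_error
is a hypothetical name, the snapshot is modelled as the set of xids
whose effects the transaction can see, and an aborted xmax is skipped
(the intent being that only non-aborted xids count).

```python
from typing import Optional, Set


def needs_serialization_error(xmax: Optional[int],
                              xmax_committed: bool,
                              xlast: Optional[int],
                              visible: Set[int]) -> bool:
    """True if a committed xmax or xlast is invisible to the snapshot."""
    candidates = []
    if xmax is not None and xmax_committed:
        candidates.append(xmax)      # aborted xmax is ignored (assumption)
    if xlast is not None:
        candidates.append(xlast)
    return any(xid not in visible for xid in candidates)


# Scenario from earlier in this mail: T1 locks and commits, T2 updates
# and aborts, all after T0 took its snapshot; then T0 tries to delete.
T1, T2 = 101, 102
visible_to_t0: Set[int] = set()      # T0's snapshot predates both

# With xlast tracking: xmax = T2 (aborted), xlast = T1 (committed).
with_xlast = needs_serialization_error(T2, False, T1, visible_to_t0)

# Looking at xmax alone, the committed locker T1 has left no trace.
xmax_only = needs_serialization_error(T2, False, None, visible_to_t0)
```

In this model with_xlast comes out true and xmax_only false, which is
the difference the extra field is meant to make.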

If we reuse the legacy field xvac to store xlast, we don't get into
trouble with binary upgrades either. We'd need to find a way to deal
with tuples where HEAP_MOVED_IN or HEAP_MOVED_OUT is set, but that
seems manageable.

Does that sound viable?

> As unhappy as I am with the present behavior, this cure sounds like it
> might be worse than the disease.

I agree. Aborting due to conflicts with aborted transactions really
seems like a bad idea - there are probably cases where that would lead
to one spurious abort causing another causing another...

But the solution sketched may be a way out...

best regards,
Florian Pflug

PS: Thanks to anyone who put work into this! I'm extremely sorry that
I didn't catch this earlier, and thus caused unnecessary work on your
end :-(
