Re: BUG #8434: Why does dead lock occur many times ?

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #8434: Why does dead lock occur many times ?
Date: 2013-11-27 16:42:27
Message-ID: 20131127164226.GA14522@eldon.alvh.no-ip.org
Lists: pgsql-bugs

I think I have figured this out. The code was too simplistic: it only
checked whether there was a key-update or not, and that, it seems, is
not sufficiently fine-grained. If we instead think
in terms of an already-acquired MultiXactStatus, and a new
MultiXactStatus corresponding to the lock we're trying to acquire, we
can decide granularly whether each future version of the tuple can be
locked by us or not. And if not, we can decide whether we need to wait
on the transaction holding the lock, or fail if it already committed.
(There is a funny trick here which is that we represent the held lock
with a MultiXactStatus, even if the lock is only a plain Xid and not a
multi. This doesn't seem a problem to me.)
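
To make the decision described above concrete, here is a minimal sketch in C. It is not the code from the attached patch: every name (SketchStatus, sk_check, and so on) is hypothetical, and the treatment of aborted holders and of committed lock-only holders is my assumption rather than something stated in this message. The conflict matrix follows the documented row-level lock conflict rules.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the status values; not the PostgreSQL definitions. */
typedef enum {
    SK_FOR_KEY_SHARE,
    SK_FOR_SHARE,
    SK_FOR_NO_KEY_UPDATE,
    SK_FOR_UPDATE,
    SK_NO_KEY_UPDATE,   /* actual update that leaves key columns alone */
    SK_UPDATE           /* actual update of key columns, or a delete */
} SketchStatus;
#define SK_NSTATUS 6

/* Hypothetical state of the transaction holding the lock. */
typedef enum { SK_XACT_IN_PROGRESS, SK_XACT_COMMITTED, SK_XACT_ABORTED } SketchXactState;

typedef enum { SK_MAY_LOCK, SK_MUST_WAIT, SK_FAIL } SketchVerdict;

/* Lock strength: 0 key share, 1 share, 2 no-key exclusive, 3 exclusive. */
static int sk_strength(SketchStatus s)
{
    static const int strength[SK_NSTATUS] = {0, 1, 2, 3, 2, 3};
    assert((int) s >= 0 && (int) s < SK_NSTATUS);
    return strength[s];
}

/* True when the status describes an update rather than a mere lock. */
static bool sk_is_update(SketchStatus s)
{
    return s >= SK_NO_KEY_UPDATE;
}

/* Row-level lock conflict matrix, indexed [held strength][wanted strength]. */
static bool sk_conflicts(SketchStatus held, SketchStatus wanted)
{
    static const bool conflict[4][4] = {
        /* wanted:  key share, share, no-key excl, excl */
        {false, false, false, true},   /* held: key share */
        {false, false, true,  true},   /* held: share */
        {false, true,  true,  true},   /* held: no-key exclusive */
        {true,  true,  true,  true},   /* held: exclusive */
    };
    return conflict[sk_strength(held)][sk_strength(wanted)];
}

/* Decide, for one tuple version, whether we can lock it, must wait, or fail. */
static SketchVerdict sk_check(SketchStatus held, SketchXactState holder,
                              SketchStatus wanted)
{
    if (holder == SK_XACT_ABORTED)
        return SK_MAY_LOCK;            /* assumption: aborted holders never block */
    if (!sk_conflicts(held, wanted))
        return SK_MAY_LOCK;
    if (holder == SK_XACT_IN_PROGRESS)
        return SK_MUST_WAIT;           /* wait on the transaction holding the lock */
    /* Holder committed: assumption is that only a committed update is fatal,
     * since a lock-only holder released its lock at commit. */
    return sk_is_update(held) ? SK_FAIL : SK_MAY_LOCK;
}
```

For example, sk_check(SK_FOR_KEY_SHARE, SK_XACT_IN_PROGRESS, SK_NO_KEY_UPDATE) yields SK_MAY_LOCK, whereas asking for SK_UPDATE against the same holder yields SK_MUST_WAIT. A plain Xid lock would simply be passed in as the matching status value, as in the trick mentioned above.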

So I propose the attached patch. This doesn't change the behavior
codified in the existing isolation tests, and it fixes my reduction of
your original test case. (I haven't run your original test case yet,
but I soon will.)

Note: I don't quite like that this patch duplicates some code in
compute_new_xmax_infomask() which determines the MultiXactStatus from
the infomask bits. Not sure what a good refactoring is, though, because
that code just issues a WARNING when the LOCK_ONLY bit is set and no
other lockmode is set; whereas the new code issues an ERROR. It seems
hard to justify doing otherwise in either place, though; and doing
something more complicated doesn't seem warranted for such a corner
case.

Note: this patch doesn't apply to master in isolation. You will also
need the patch I mention in
http://www.postgresql.org/message-id/20131125201039.GF6597@eldon.alvh.no-ip.org
which I haven't posted yet. Will do so shortly.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment: granular-chain-following.patch (text/x-diff, 7.6 KB)
