Re: deadlock detected errors

Lists: pgsql-general
From: Clarence Gardner <clarence(at)silcom(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: deadlock detected errors
Date: 2006-09-04 21:24:47
Message-ID: Pine.SUN.4.02.10609042123180.955-100000@beach.silcom.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


I am trying to locate the source of some deadlocks that have started
cropping up recently, with little success, and I have a question regarding
the message that accompanies them.

The message my application gets is like this:
Process 244 waits for ShareLock on transaction 39523645; blocked by
process 74242.
Process 74242 waits for ShareLock on tuple (65,2) of relation 16708 of
database
16386; blocked by process 244.

But in investigating various places in my code that could be relevant, and
executing the statements manually, my deadlock messages always look like
this:
Process 38158 waits for ShareLock on transaction 15691; blocked by
process 38160.
Process 38160 waits for ShareLock on transaction 15698; blocked by process
38158.

The difference I'm asking about is that my artificial ones always involve
waiting on a transaction, while the ones from the real application always
involve one transaction wait and one tuple wait.

The first question is, is there any significance to this difference?

If so, the second question is, what is it? :) In looking at some code in
heapam.c, I see the following comment that sorta kinda sounds like maybe
there are three parties involved, with two of them waiting on a locked
tuple?

* Acquire tuple lock to establish our priority for the tuple.
* LockTuple will release us when we are next-in-line for the tuple.

Thanks for any help,
Clarence


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Clarence Gardner <clarence(at)silcom(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: deadlock detected errors
Date: 2006-09-04 22:39:18
Message-ID: 13852.1157409558@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Clarence Gardner <clarence(at)silcom(dot)com> writes:
> The difference I'm asking about is that my artificial ones always involve
> waiting on a transaction, while the ones from the real application always
> involve one transaction wait and one tuple wait.

> The first question is, is there any significance to this difference?

Yup.

> If so, the second question is, what is it? :) In looking at some code in
> heapam.c, I see the following comment that sorta kinda sounds like maybe
> there are three parties involved, with two of them waiting on a locked
> tuple?

Got it in one. The guy waiting on the tuple-specific lock is second in
line to actually mung the tuple. Whoever is first in line behind the
current tenant will be blocked trying to acquire ShareLock on the
current tenant's transaction ID.

What you appear to have is a situation where two transactions are trying
to lock or update the same two rows in different orders. Without a lot
more info about your application logic, I couldn't guess why this seems
to be associated with having more than two transaction interested in the
same tuple.

Note that the guy looking for ShareLock on the tuple is evidently either
doing SELECT FOR SHARE on this tuple, or trying to install a new tuple
referencing this one as a foreign key (which does SELECT FOR SHARE under
the hood). But he's blocked by someone who's done either SELECT FOR
UPDATE or an actual UPDATE on that tuple.

regards, tom lane