lock on object is already held

From: Daniel Wood <dwood(at)salesforce(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: lock on object is already held
Date: 2013-11-19 04:24:01
Message-ID: CAPweHKfExEsbACRXQTBdu_4QxhHk-Cic_iwmbh5XHo_0Z=Q-1g@mail.gmail.com
Lists: pgsql-hackers

Sorry but I don't know how to respond to an old thread I found on
postgresql.org:
http://www.postgresql.org/message-id/20766.1357318482@sss.pgh.pa.us

I presume this is still an open issue. While working on a new feature I
wrote a stress test for it. After fixing my bugs, I couldn't get rid of:
ERROR: lock RowExclusiveLock on object 16384/16393/0 is already held
In addition, with asserts enabled in lock.c I would see Asserts in
UnGrantLock, LockRelease, LockReleaseAll, etc. In one run everything hung
waiting on SQL locks.

The asserts or hang occurs within seconds of running the stress test.
Before I get into the details I want to see if this is something already
being worked on. I can repro it easily in vanilla 9.3.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Daniel Wood <dwood(at)salesforce(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: lock on object is already held
Date: 2013-11-19 06:31:51
Message-ID: 30981.1384842711@sss.pgh.pa.us
Lists: pgsql-hackers

Daniel Wood <dwood(at)salesforce(dot)com> writes:
> Sorry but I don't know how to respond to an old thread I found on
> postgresql.org:
> http://www.postgresql.org/message-id/20766.1357318482@sss.pgh.pa.us

> I presume this is still an open issue. While working on a new feature I
> wrote a stress test for it. After fixing my bugs, I couldn't get rid of:
> ERROR: lock RowExclusiveLock on object 16384/16393/0 is already held
> In addition, with asserts enabled in lock.c I would see Asserts in
> UnGrantLock, LockRelease, LockReleaseAll, etc. In one run everything hung
> waiting on SQL locks.

> The asserts or hang occurs within seconds of running the stress test.
> Before I get into the details I want to see if this is something already
> being worked on. I can repro it easily in vanilla 9.3.

So let's see the reproducer?

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Daniel Wood <dwood(at)salesforce(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: lock on object is already held
Date: 2013-11-27 01:30:01
Message-ID: 3163.1385515801@sss.pgh.pa.us
Lists: pgsql-hackers

Dan sent me this test case off-list. Here's a somewhat cleaned-up
version:

1. Create a table:
create table lock_bug(f1 int, f2 int);

2. Compile the attached lockbug.c.

3. Run the attached lockbug.sh, with adjustment of parameters to taste.

This spits up in a remarkable variety of ways in 9.3 and HEAD,
especially if you have assertions on.
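
The attached lockbug.c and lockbug.sh are not reproduced in this archive.
Purely as an illustration of the kind of driver being described -- not Dan's
actual test -- the sketch below hammers the lock_bug table from one libpq
connection under a short lock_timeout so that some lock acquisitions fail;
lockbug.sh presumably launches many such clients concurrently. The connection
string, iteration count, timeout, and use of LOCK TABLE here are made-up
illustrative choices, and this sketch only generates concurrent lock traffic
and lock_timeout failures; it is not claimed to reproduce the bug.

/*
 * Illustrative only -- NOT the attached lockbug.c.
 * Build with something like: cc lockstress.c -I$(pg_config --includedir) -lpq
 */
#include <stdio.h>
#include <libpq-fe.h>

int
main(int argc, char **argv)
{
    const char *conninfo = (argc > 1) ? argv[1] : "dbname=postgres";
    PGconn     *conn = PQconnectdb(conninfo);
    int         i;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* Fail lock waits quickly so the error path is exercised repeatedly. */
    PQclear(PQexec(conn, "SET lock_timeout = '20ms'"));

    for (i = 0; i < 100000; i++)
    {
        PGresult   *res;

        PQclear(PQexec(conn, "BEGIN"));

        /* UPDATE takes RowExclusiveLock on the relation ... */
        res = PQexec(conn, "UPDATE lock_bug SET f2 = f2 + 1 WHERE f1 = 1");
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "update: %s", PQerrorMessage(conn));
        PQclear(res);

        /* ... and an occasional EXCLUSIVE lock conflicts with concurrent
         * updaters, so some acquisitions block and then time out. */
        if (i % 10 == 0)
            PQclear(PQexec(conn, "LOCK TABLE lock_bug IN EXCLUSIVE MODE"));

        /* After a lock timeout the transaction is already aborted;
         * COMMIT then just ends it. */
        PQclear(PQexec(conn, "COMMIT"));
    }

    PQfinish(conn);
    return 0;
}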

After some investigation I found the cause: LockRelease() expects that
if a LOCALLOCK object represents a valid lock (nLocks > 0), then
either its lock and proclock fields point to associated shared-memory
entries, or they're NULL. However, it's possible for that to not be
true, because if we fail to acquire a lock, the code leaves around a
LOCALLOCK object pointing at the shared objects we couldn't get lock
on. *These objects are subject to reclamation, because no lock is
actually held*. So if we make another attempt to get the same lock
later in the same transaction, LockAcquire finds and re-uses that
LOCALLOCK object. Pre fast-path locks, it would always recompute the
lock and proclock pointers, so the fact that they might have been stale
wasn't a problem. But the fast-path patch added a code path in which
we could succeed in acquiring a fast-path lock, and then exit without
having done anything with the pointer fields. Now we have something
that looks like a valid locallock but could be pointing at entirely
unrelated shared objects. This leads to all sorts of fun at release
time, with the symptoms depending on exactly what those shared
hashtable entries are being used for at the instant we stomp on them.
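
To make that sequence concrete, here is a toy, self-contained model of the
hazard (this is not PostgreSQL code; the real players are LOCALLOCK, LOCK and
PROCLOCK in lock.c): a local cache entry keeps raw pointers into shared
state, a failed acquisition leaves the entry behind with the pointers set, a
later "fast path" success never refreshes them, and the release path then
scribbles on whatever now occupies that shared slot.

#include <stdio.h>
#include <string.h>

typedef struct { char owner[32]; int granted; } SharedEntry;   /* ~ PROCLOCK */
typedef struct { SharedEntry *shared; int nLocks; } LocalEntry; /* ~ LOCALLOCK */

static SharedEntry shared_table[1];   /* stands in for the shared hash table */

int
main(void)
{
    LocalEntry  ll = {NULL, 0};

    /* 1. Failed acquisition: no lock taken (nLocks stays 0), but the local
     *    entry is left pointing at the shared object we couldn't lock. */
    ll.shared = &shared_table[0];

    /* 2. Nothing holds that shared entry, so it is reclaimed and reused. */
    strcpy(shared_table[0].owner, "some other backend's lock");
    shared_table[0].granted = 1;

    /* 3. Fast-path re-acquisition succeeds without touching ll.shared. */
    ll.nLocks = 1;

    /* 4. Release trusts the stale pointer and stomps on unrelated state. */
    if (ll.nLocks > 0 && ll.shared != NULL)
        ll.shared->granted = 0;

    printf("entry now owned by \"%s\" shows granted=%d\n",
           shared_table[0].owner, shared_table[0].granted);
    return 0;
}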

So the fix is pretty simple:

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index f8dc951..37c605f 100644
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 837,842 ****
--- 844,851 ----
LWLockRelease(MyProc->backendLock);
if (acquired)
{
+ locallock->lock = NULL;
+ locallock->proclock = NULL;
GrantLockLocal(locallock, owner);
return LOCKACQUIRE_OK;
}

although this needs to be backed up with a lot more comments of course.

When I showed this to Dan, he objected that it'd be better if
the failing lock operation had cleaned up the LOCALLOCK instead.
That would be a significantly bigger change though, and I think
it'd net out being less robust. The data structure was designed
to not need such cleanup, because the more stuff you have to do
to clean up after a failure, the more likely it is that you'll
have additional problems during the cleanup, leaving you hosed.
In particular, we'd have to be certain that we could remove the
useless shared objects during the cleanup step, since once the
LOCALLOCK is gone there is nothing else that will remind us to
garbage-collect them at end of transaction.

BTW, although I'm certain that 9.2 has this bug as well, this
test case fails to show a problem in that branch. I've not
looked into why not --- it's probably a timing issue or something.

Comments?

regards, tom lane

Attachment Content-Type Size
lockbug.c text/x-c 701 bytes
lockbug.sh text/x-shellscript 449 bytes

From: Daniel Wood <dwood(at)salesforce(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: lock on object is already held
Date: 2013-11-27 03:41:44
Message-ID: CAPweHKfsGjSzry1y-A4dE-OUWyMqNUFYB0fmJa5APWGdabKiEQ@mail.gmail.com
Lists: pgsql-hackers

Does the original version of my stress test not repro the problem on 9.2?

My primary concern with the fix is that I simply didn't understand what
might happen after a failed lock attempt called CleanUpLock freeing the
PROCLOCK but leaving some LOCALLOCK still pointing at it. As long as
"nLocks == 0" is guarenteed I guess we are OK although LockRelease will
elog a WARNING and LockReleaseAll believes "/* ... we must've run out of
shared memory ... */".

Why does LockAcquireExtended() test for "nLocks == 0" in the "if
(dontWait)" block before calling RemoveLocalLock()? Can nLocks be anything
other than 0 if we've just freed the PROCLOCK in this block of code? If
nLocks is > 0 AND the LOCALLOCK is still pointing at a freed PROCLOCK, can't
we still have a problem? Given that "dontWait==true" seems to be associated
with DDL and other rare things, it might be hard to stress test this case.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Daniel Wood <dwood(at)salesforce(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: lock on object is already held
Date: 2013-11-27 17:30:37
Message-ID: 18886.1385573437@sss.pgh.pa.us
Lists: pgsql-hackers

Daniel Wood <dwood(at)salesforce(dot)com> writes:
> Does the original version of my stress test not repro the problem on 9.2?

[ tries it ... ] No, it doesn't, or at least the MTBF is a couple orders
of magnitude better than on 9.3.

Another odd thing (seen with my short version as well as your original)
is that 9.3/HEAD run the test case enormously faster than 9.2 and 9.1
do. The older versions seem to spend a lot of time sleeping, which
I don't understand.

> Why does LockAcquireExtended() test for "nLocks == 0" in the "if
> (dontWait)" block before calling RemoveLocalLock()?

Looks like a useless test to me --- we wouldn't be here at all if nLocks
had been positive to start with, and there's nothing in between that
could raise the count. On the other hand, removing a LOCALLOCK that
did have positive count would be disastrous. Maybe what would be
more appropriate is an Assert(nLocks == 0) in RemoveLocalLock().
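
For illustration, such an assertion would sit at the top of RemoveLocalLock();
the sketch below abridges the rest of the function and is not the actual
lock.c source:

static void
RemoveLocalLock(LOCALLOCK *locallock)
{
    /* A locallock that is still held locally must never be removed. */
    Assert(locallock->nLocks == 0);

    /* ... forget resource-owner and fast-path bookkeeping ... */

    if (!hash_search(LockMethodLocalHash,
                     (void *) &(locallock->tag),
                     HASH_REMOVE, NULL))
        elog(WARNING, "locallock table corrupted");
}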

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Daniel Wood <dwood(at)salesforce(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: lock on object is already held
Date: 2013-11-28 19:35:23
Message-ID: 17227.1385667323@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Daniel Wood <dwood(at)salesforce(dot)com> writes:
>> Does the original version of my stress test not repro the problem on 9.2?

> [ tries it ... ] No, it doesn't, or at least the MTBF is a couple orders
> of magnitude better than on 9.3.

Oh, of course: the reason the test doesn't fail as given on 9.2 is that
9.2 doesn't have a lock_timeout parameter. (I missed the complaints about
this in the blizzard of other noise the test generates :-(.) The timeout
is critical to exposing the bug because it needs failed lock acquisitions.
Probably unpatched 9.2 would fall over if you used statement_timeout
instead.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Daniel Wood <dwood(at)salesforce(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: lock on object is already held
Date: 2013-11-29 20:13:51
Message-ID: 13298.1385756031@sss.pgh.pa.us
Lists: pgsql-hackers

Daniel Wood <dwood(at)salesforce(dot)com> writes:
> ... Presuming your fix is putting PG_SETMASK(&UnBlockSig)
> immediately before each of the 6 calls to ereport(ERROR,...) I've been
> running the stress test with both this fix and the lock already held fix.

I'm now planning to put it in error cleanup instead, but that's good
enough for proving that the problem is what I thought it was.

> I get plenty of lock timeout errors as expected. However, once in a great
> while I get: sqlcode = -400, sqlstate = 57014, sqlerrmc = canceling
> statement due to user request
> My stress test certainly doesn't do a user cancel. Should this be expected?

I think I see what must be happening there: the lock timeout interrupt is
happening at some point after the lock has been granted, but before
ProcSleep reaches its disable_timeouts call. QueryCancelPending gets set,
and will be honored next time something does CHECK_FOR_INTERRUPTS.
But because ProcSleep told disable_timeouts to clear the LOCK_TIMEOUT
indicator bit, ProcessInterrupts thinks the cancel must've been a plain
user SIGINT, and reports it that way.

What we should probably do about this is change ProcSleep to not clear the
LOCK_TIMEOUT indicator bit, same as we already do in LockErrorCleanup,
which is the less-race-condition-y path out of a lock timeout.
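
For concreteness, a sketch of that kind of change at ProcSleep's
timeout-disable step, using the DisableTimeoutParams/disable_timeouts
interface from timeout.h (a sketch of the idea only, not the committed
patch):

    if (LockTimeout > 0)
    {
        DisableTimeoutParams timeouts[2];

        timeouts[0].id = DEADLOCK_TIMEOUT;
        timeouts[0].keep_indicator = false;
        timeouts[1].id = LOCK_TIMEOUT;
        /* was false: keep the indicator set, as LockErrorCleanup does,
         * so ProcessInterrupts can still report a lock timeout */
        timeouts[1].keep_indicator = true;
        disable_timeouts(timeouts, 2);
    }
    else
        disable_timeout(DEADLOCK_TIMEOUT, false);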

(It would be cooler if the timeout handler had a way to realize that the
lock is already granted, and not issue a query cancel in the first place.
But having a signal handler poking at shared memory state is a little too
scary for my taste.)

It strikes me that this also means that places where we throw away pending
cancels by clearing QueryCancelPending, such as the sigsetjmp stanza in
postgres.c, had better reset the LOCK_TIMEOUT indicator bit. Otherwise,
a thrown-away lock timeout cancel might cause a later SIGINT cancel to be
misreported.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Daniel Wood <dwood(at)salesforce(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: lock on object is already held
Date: 2013-12-01 04:50:42
Message-ID: CAFj8pRAdN9UgWTtwp9ixH09+iD4pY_LB1Wc-haOA-e=G4pzn2Q@mail.gmail.com
Lists: pgsql-hackers

Hello,

we found this issue a year ago -
http://www.postgresql.org/message-id/CAFj8pRAHVzUPfBx+8EY-XHfwBo8bxVu_YNMBAPSDj8W-ARatLA@mail.gmail.com

I tried to simulate this error, but without success - so I prepared a patch
intended to help with identification of this issue. The important part is a
backport of the process startup from 9.2. After applying it we never
detected this issue again.

Regards

Pavel


Attachment Content-Type Size
orphaned_locks.patch text/x-patch 4.7 KB