Re: FATAL: lock AccessShareLock on object 0/1260/0 is already held

From: daveg <daveg(at)sonic(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FATAL: lock AccessShareLock on object 0/1260/0 is already held
Date: 2011-08-22 07:31:31
Message-ID: 20110822073131.GC3363@sonic.net
Lists: pgsql-hackers

On Fri, Aug 12, 2011 at 04:19:37PM -0700, daveg wrote:
>
> This seems to be bug month for my client. Now they are seeing periods
> where all new connections fail immediately with the error:
>
> FATAL: lock AccessShareLock on object 0/1260/0 is already held
>
> This happens on postgresql 8.4.7 on a large (512GB, 32 core) system that has
> been up for months. It started happening sporadically a few days ago. It will
> do this for a period of several minutes to an hour and then go back to
> normal for hours or days.
>
> One complete failing session out of several hundred around that time:
> -----------------
> 2011-08-09 00:01:04.446 8823 [unknown] [unknown] LOG: connection received: host=op05.xxx port=34067
> 2011-08-09 00:01:04.446 8823 c77 apps LOG: connection authorized: user=apps database=c77
> 2011-08-09 00:01:04.449 8823 c77 apps FATAL: lock AccessShareLock on object 0/1260/0 is already held
> ------------------

This is to add additional information to the original report:

For a while this was happening on many different databases in one postgresql
8.4.7 instance on a single large host ('U2' 512GB 64cpu) running RH 5.
That has been quiet for several days, and the newest batches of errors have
happened only on a single database, 'c23', in a postgresql 9.0.4 instance
on a smaller host ('A', 64GB 8cpu) running SuSE 10.2.

No memory errors or other misbehaviour have been seen on either of these
hosts in recent months.

The original error was:

lock AccessShareLock on object 0/1260/0 is already held

which is for the shared catalog pg_authid (relation OID 1260). The recent errors are:

lock AccessShareLock on object 16403/2615/0 is already held

which is for pg_namespace in database c23.
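
To make the mapping from error text to catalog explicit, here is a minimal Python sketch that pulls the three numbers out of the message. It assumes they are the lock tag fields for a relation lock, that is, database OID (0 for a shared catalog), relation OID, and an unused third field. The function name and the small OID table are made up for illustration; the OIDs listed are the ones from the stock catalog headers (1260 is pg_authid, 1262 is pg_database, 2615 is pg_namespace), and anything else should be looked up in pg_class.

```python
import re

# Catalog OIDs as assigned in the stock PostgreSQL catalog headers.
# Illustrative subset only; look up other values in pg_class.
KNOWN_CATALOGS = {
    1259: "pg_class",
    1260: "pg_authid",
    1262: "pg_database",
    2615: "pg_namespace",
}

LOCK_ERROR = re.compile(
    r"lock (?P<mode>\w+) on object "
    r"(?P<db>\d+)/(?P<rel>\d+)/(?P<sub>\d+) is already held"
)


def decode_lock_error(line):
    """Return the decoded lock tag from a log line, or None if no match.

    Assumes the three numbers are database OID / relation OID / unused,
    which is the layout of a relation lock tag.
    """
    m = LOCK_ERROR.search(line)
    if m is None:
        return None
    db_oid = int(m.group("db"))
    rel_oid = int(m.group("rel"))
    return {
        "mode": m.group("mode"),
        "database_oid": db_oid,
        "shared": db_oid == 0,  # database OID 0 means a cluster-wide catalog
        "relation_oid": rel_oid,
        "relation": KNOWN_CATALOGS.get(rel_oid, "unknown"),
    }
```

Feeding it the two messages above gives a shared relation with OID 1260 for the first and relation 2615 in database 16403 for the second.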

All of the original and most of the recent batches of errors came immediately
after connecting to a database and being authorized, that is, before any
statements were attempted. However, some of the most recent are on the first
"query" statement. That is, after logging in and doing things like "set
transaction ...", the first select would hit this error.

The errors sometimes come in clusters, which suggests something shared
by multiple processes. For example, here are the times for the errors
on c23 on August 20:

20 07:14:12.722

20 16:05:07.798
20 16:05:07.808

20 16:05:10.519

20 16:07:07.726
20 16:07:08.722
20 16:07:09.734
20 16:07:10.656

20 16:07:25.436

20 16:22:23.983
20 16:22:24.014
20 16:22:24.335
20 16:22:24.409
20 16:22:24.477
20 16:22:24.499
20 16:22:24.516

20 16:30:58.210

20 16:31:15.261
20 16:31:15.296
20 16:31:15.324
20 16:31:15.348

20 18:06:16.515

20 18:06:49.198
20 18:06:49.204

20 18:06:51.444

20 21:03:05.940
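
The grouping above can be reproduced mechanically. The following is a rough Python sketch, not a tool that was actually used here: it starts a new cluster whenever the gap to the previous error exceeds a threshold. The 2 second threshold, the function names, and the assumed year and month (2011-08) are all illustrative choices.

```python
from datetime import datetime, timedelta


def parse_stamp(stamp):
    """Parse 'DD HH:MM:SS.mmm'; the year and month are assumed to be 2011-08."""
    return datetime.strptime("2011-08-" + stamp, "%Y-%m-%d %H:%M:%S.%f")


def cluster_times(stamps, max_gap=timedelta(seconds=2)):
    """Group timestamps into clusters.

    A timestamp joins the current cluster when it is within max_gap of the
    previous timestamp; otherwise it starts a new cluster.
    """
    clusters = []
    for ts in sorted(stamps):
        if clusters and ts - clusters[-1][-1] <= max_gap:
            clusters[-1].append(ts)
        else:
            clusters.append([ts])
    return clusters


if __name__ == "__main__":
    sample = ["20 16:05:07.798", "20 16:05:07.808", "20 16:05:10.519",
              "20 16:07:07.726", "20 16:07:08.722", "20 16:07:09.734",
              "20 16:07:10.656"]
    for group in cluster_times([parse_stamp(s) for s in sample]):
        print(len(group), group[0].time(), "to", group[-1].time())
```

Comparing cluster start times against other activity in the logs (checkpoints, autovacuum, connection bursts) would be the next step; a larger or smaller threshold only changes how finely the bursts are split.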

So far I've got:

- affects system tables
- happens very soon after process startup
- in 8.4.7 and 9.0.4
- not likely to be hardware or OS related
- happens in clusters for periods of a few seconds to many minutes

I'll work on printing the LOCK and LOCALLOCK when it happens, but it's
hard to get downtime to pick up new builds. Any other ideas on getting to
the bottom of this?

Thanks

-dg

--
David Gould daveg(at)sonic(dot)net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.
