Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held

Lists: pgsql-bugs
From: Michael Milligan <milli(at)acmeps(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-08-29 20:05:51
Message-ID: 48B8569F.20900@acmeps.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Uh oh. This is a first (for me). I got this error on a transaction, it
did not crash the server but did abort the transaction (of course).

ERROR: lock AccessShareLock on object 16385/16467/0 is already held

What was I doing? A large transaction where I was pushing about 20
million rows into two tables using prepared transactions. I have been
doing this for some time (over a year) on 8.2.6 without any problems.

I may be able to reproduce, but the transaction takes a long time to
complete... about a day. I'm running it again now.

There was no information in the log. What else can I do to get debug
information for you, provided it's reproducable...

Some db details:

db=> select version();
version
-------------------------------------------------------------------------------------------------
PostgreSQL 8.3.3 on amd64-portbld-freebsd6.3, compiled by GCC cc (GCC)
3.4.6 [FreeBSD] 20060305
(1 row)

This instance was also compiled with the following changes
(pg_config_manual.h):

diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index 354a3d3..20c40c4 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -23,7 +23,7 @@
*
* Changing BLCKSZ requires an initdb.
*/
-#define BLCKSZ 8192
+#define BLCKSZ 32768

/*
* RELSEG_SIZE is the maximum number of blocks allowed in one disk
@@ -69,7 +69,7 @@
*
* Changing this requires an initdb.
*/
-#define NAMEDATALEN 64
+#define NAMEDATALEN 128

/*
* Maximum number of arguments to a function.

This goes along with a UFS block size of 32768.

Regards,
Mike

--
Michael Milligan -> milli(at)acmeps(dot)com


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Michael Milligan <milli(at)acmeps(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-08-29 20:13:21
Message-ID: 48B85861.9090200@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Michael Milligan wrote:
> Uh oh. This is a first (for me). I got this error on a transaction, it
> did not crash the server but did abort the transaction (of course).
>
> ERROR: lock AccessShareLock on object 16385/16467/0 is already held
>
> What was I doing? A large transaction where I was pushing about 20
> million rows into two tables using prepared transactions. I have been
> doing this for some time (over a year) on 8.2.6 without any problems.

Could you provide more details on the transaction? At what point did it
fail, during the insertss, or at commit perhaps? Are you really using
prepared transactions, aka two-phase commit, or did you mean prepared
statements?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Milligan <milli(at)acmeps(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-08-29 20:28:59
Message-ID: 10315.1220041739@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Michael Milligan <milli(at)acmeps(dot)com> writes:
> Uh oh. This is a first (for me). I got this error on a transaction, it
> did not crash the server but did abort the transaction (of course).

> ERROR: lock AccessShareLock on object 16385/16467/0 is already held

Huh, that shouldn't happen. What object is that? The 16385 should be
a database OID, and the 16467 is most likely a table's OID within that
database.

> What was I doing? A large transaction where I was pushing about 20
> million rows into two tables using prepared transactions. I have been
> doing this for some time (over a year) on 8.2.6 without any problems.

> I may be able to reproduce, but the transaction takes a long time to
> complete... about a day. I'm running it again now.

I wonder if it's possible that you overflowed nLocks in the local lock
table. There are a lot of other things that likely would break first,
though, so that doesn't seem very probable.

Is this transaction longer than any comparable one you ever did before?

regards, tom lane


From: Michael Milligan <milli(at)acmeps(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-08-29 20:52:07
Message-ID: 48B86177.6030207@acmeps.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Heikki Linnakangas wrote:
> Michael Milligan wrote:
>> Uh oh. This is a first (for me). I got this error on a transaction, it
>> did not crash the server but did abort the transaction (of course).
>>
>> ERROR: lock AccessShareLock on object 16385/16467/0 is already held
>>
>> What was I doing? A large transaction where I was pushing about 20
>> million rows into two tables using prepared transactions. I have been
>> doing this for some time (over a year) on 8.2.6 without any problems.
>
> Could you provide more details on the transaction? At what point did it
> fail, during the insertss, or at commit perhaps? Are you really using
> prepared transactions, aka two-phase commit, or did you mean prepared
> statements?

Yes, mistyped there, prepared _statements_. It failed during an
EXECUTE, so during a prepared INSERT statement.

Regards,
Mike

--
Michael Milligan -> milli(at)acmeps(dot)com


From: Michael Milligan <milli(at)acmeps(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-08-29 20:57:55
Message-ID: 48B862D3.5070000@acmeps.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:
> Michael Milligan <milli(at)acmeps(dot)com> writes:
>> Uh oh. This is a first (for me). I got this error on a transaction, it
>> did not crash the server but did abort the transaction (of course).
>
>> ERROR: lock AccessShareLock on object 16385/16467/0 is already held
>
> Huh, that shouldn't happen. What object is that? The 16385 should be
> a database OID, and the 16467 is most likely a table's OID within that
> database.
>
>> What was I doing? A large transaction where I was pushing about 20
>> million rows into two tables using prepared transactions. I have been
>> doing this for some time (over a year) on 8.2.6 without any problems.
>
>> I may be able to reproduce, but the transaction takes a long time to
>> complete... about a day. I'm running it again now.
>
> I wonder if it's possible that you overflowed nLocks in the local lock
> table. There are a lot of other things that likely would break first,
> though, so that doesn't seem very probable.
>
> Is this transaction longer than any comparable one you ever did before?

For 8.3.3, yes. This is only the second such large transaction I've
done on 8.3.3, the first one I did was around 6.7 million rows and it
worked.

And a correction, the transaction that caused this error was inserting
13.5 million rows and failed near the end.

Regards,
Mike

--
Michael Milligan -> milli(at)acmeps(dot)com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Milligan <milli(at)acmeps(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-08-29 21:30:13
Message-ID: 15941.1220045413@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Michael Milligan <milli(at)acmeps(dot)com> writes:
> Tom Lane wrote:
>> Huh, that shouldn't happen. What object is that? The 16385 should be
>> a database OID, and the 16467 is most likely a table's OID within that
>> database.

Please answer the above question.

> And a correction, the transaction that caused this error was inserting
> 13.5 million rows and failed near the end.

Once you've determined which table the error message is talking about,
please show us what the transaction does with that table.

regards, tom lane


From: Michael Milligan <milli(at)acmeps(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-08-29 22:44:22
Message-ID: 48B87BC6.5010901@acmeps.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:
> Michael Milligan <milli(at)acmeps(dot)com> writes:
>> Tom Lane wrote:
>>> Huh, that shouldn't happen. What object is that? The 16385 should be
>>> a database OID, and the 16467 is most likely a table's OID within that
>>> database.
>
> Please answer the above question.

16385 is the database (db) and 16467 is the table (email) the prepared
inserts are executed against. That table (email) is an inherited table
definition for partitioned tables that are named by date ranges. E.g.,

CREATE TABLE email_2008_week27 ( PRIMARY KEY (id), CHECK ( ts >=
'2008-06-30'::timestamp AND ts < '2008-07-07'::timestamp ) ) INHERITS
(email)

>
>> And a correction, the transaction that caused this error was inserting
>> 13.5 million rows and failed near the end.
>
> Once you've determined which table the error message is talking about,
> please show us what the transaction does with that table.

You mean like:

BEGIN;
PREPARE msg (...) INSERT INTO email VALUES (...);
EXECUTE msg (...)
EXECUTE msg (...)
EXECUTE msg (...)
EXECUTE msg (...)
... repeated millions of times
COMMIT;

Regards,
Mike

--
Michael Milligan -> milli(at)acmeps(dot)com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Milligan <milli(at)acmeps(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-08-30 00:28:06
Message-ID: 28948.1220056086@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Michael Milligan <milli(at)acmeps(dot)com> writes:
> Tom Lane wrote:
>> Once you've determined which table the error message is talking about,
>> please show us what the transaction does with that table.

> You mean like:

> BEGIN;
> PREPARE msg (...) INSERT INTO email VALUES (...);
> EXECUTE msg (...)
> EXECUTE msg (...)
> EXECUTE msg (...)
> EXECUTE msg (...)
> ... repeated millions of times
> COMMIT;

What I'm wondering about is the sequence of operations that are executed
per row. Could it be long enough that the email table is being touched
by more than 2 billion separate SQL operations within the transaction?

regards, tom lane


From: Michael Milligan <milli(at)acmeps(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-08-30 00:48:05
Message-ID: 48B898C5.50100@acmeps.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:
> Michael Milligan <milli(at)acmeps(dot)com> writes:
>> Tom Lane wrote:
>>> Once you've determined which table the error message is talking about,
>>> please show us what the transaction does with that table.
>
>> You mean like:
>
>> BEGIN;
>> PREPARE msg (...) INSERT INTO email VALUES (...);
>> EXECUTE msg (...)
>> EXECUTE msg (...)
>> EXECUTE msg (...)
>> EXECUTE msg (...)
>> ... repeated millions of times
>> COMMIT;
>
> What I'm wondering about is the sequence of operations that are executed
> per row. Could it be long enough that the email table is being touched
> by more than 2 billion separate SQL operations within the transaction?

I can't imagine that, but then again, I'm not a PostgreSQL internals
expert either. The partitioned email tables have indexes on them, but
in this part of the transaction, all I'm doing is EXECUTEs where the
prepared statement is INSERTing a single row. Nothing else fancy with
the email table is going on. There are other things going on with
values being stored to TEMP tables within the same transaction, would
those count toward the 2 billion?

I ran the transaction again and it failed again, so it is reproducable.
I'm going to try it with a much smaller dataset and see what happens.

Regards,
Mike

--
Michael Milligan -> milli(at)acmeps(dot)com


From: Michael Milligan <milli(at)acmeps(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-08-30 01:11:29
Message-ID: 48B89E41.6040300@acmeps.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:
>
> What I'm wondering about is the sequence of operations that are executed
> per row. Could it be long enough that the email table is being touched
> by more than 2 billion separate SQL operations within the transaction?

FWIW, I've used the exact same code against PG 8.2.6 and have half a
dozen similar transactions that inserted more than 13.5 million rows,
with the largest transaction at a little over 25 million rows inserted
into the email table.

Regards,
Mike

--
Michael Milligan -> milli(at)acmeps(dot)com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Milligan <milli(at)acmeps(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-08-30 01:59:42
Message-ID: 6040.1220061582@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Michael Milligan <milli(at)acmeps(dot)com> writes:
> FWIW, I've used the exact same code against PG 8.2.6 and have half a
> dozen similar transactions that inserted more than 13.5 million rows,
> with the largest transaction at a little over 25 million rows inserted
> into the email table.

Hmph. That seems to eliminate the overflow theory, because 8.2 has
essentially the same lock-counting code as 8.3. Unless 8.3 is taking
out the lock a heckuva lot more than 8.2 did, but I can't think of a
reason for that to happen.

Now that we know you can reproduce it, we should think about how to get
some information out. Are you in a position to build a locally modified
Postgres? I could send you a patch to make that particular error report
dump out more information about the lock state, but a patch won't do you
any good if you aren't able to build from source.

regards, tom lane


From: Michael Milligan <milli(at)acmeps(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-08-30 02:38:11
Message-ID: 48B8B293.9040409@acmeps.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:
> Michael Milligan <milli(at)acmeps(dot)com> writes:
>> FWIW, I've used the exact same code against PG 8.2.6 and have half a
>> dozen similar transactions that inserted more than 13.5 million rows,
>> with the largest transaction at a little over 25 million rows inserted
>> into the email table.
>
> Hmph. That seems to eliminate the overflow theory, because 8.2 has
> essentially the same lock-counting code as 8.3. Unless 8.3 is taking
> out the lock a heckuva lot more than 8.2 did, but I can't think of a
> reason for that to happen.
>
> Now that we know you can reproduce it, we should think about how to get
> some information out. Are you in a position to build a locally modified
> Postgres? I could send you a patch to make that particular error report
> dump out more information about the lock state, but a patch won't do you
> any good if you aren't able to build from source.

I can rebuild with a patch, sure. I've got a spare machine I can mirror
the db over to (same hardware) to debug this.

Regards,
Mike

--
Michael Milligan -> milli(at)acmeps(dot)com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Milligan <milli(at)acmeps(dot)com>
Cc: pgsql-bugs(at)postgreSQL(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-09-02 16:52:58
Message-ID: 15583.1220374378@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

[ reincluding the mailing list ]

Michael Milligan <milli(at)acmeps(dot)com> writes:
> Okay, it reproduces and surprise surprise nLocks does overflow...

> ERROR: lock AccessShareLock on object 16385/16467/0 is already held
> lock(0x87408a028) id(16385,16467,0,0,0,1) grantMask(a) waitMask(0)
> req(2,0,1,0,0,0,0,0)=3 grant(1,0,1,0,0,0,0,0)=2 wait(0)
> proclock(0x8743a7508) lock(0x87408a028) method(1) proc(0x8743aada8) hold(a)
> locallock(0xb29c78) nLocks(-2147483648) nOwners(2) mOwners(8)

Hah. Okay, that shows that we'd never have reproduced it with a small
test case.

> So I'm guessing this is not a bug? Just that for some reason 8.3.3 is
> grabbing more locks than 8.2.6 does and I have to adjust my batch
> processes to break things up into chunks of less than 10 million per
> transaction... :-/

Well, I'm wondering exactly why 8.3 is taking more locks than 8.2 does,
because AFAICS it shouldn't. Could you extract a complete example of
what your code does per tuple? I would like to run a small test case
and just see where the LockAcquire calls come from in each version.
(Or, if you prefer, you can get out gdb and do the legwork yourself...)
It's true that breaking up the transaction is likely to be the best
short-term answer for you, but I think there might be a bug here.

In any case, now that we know that nLocks overflow is actually possible
within real-world transaction lengths, it'd behoove us to do something
about that in 8.4 or beyond.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Milligan <milli(at)acmeps(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-09-07 13:06:02
Message-ID: 200809071306.m87D62I29017@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:
> [ reincluding the mailing list ]
>
> Michael Milligan <milli(at)acmeps(dot)com> writes:
> > Okay, it reproduces and surprise surprise nLocks does overflow...
>
> > ERROR: lock AccessShareLock on object 16385/16467/0 is already held
> > lock(0x87408a028) id(16385,16467,0,0,0,1) grantMask(a) waitMask(0)
> > req(2,0,1,0,0,0,0,0)=3 grant(1,0,1,0,0,0,0,0)=2 wait(0)
> > proclock(0x8743a7508) lock(0x87408a028) method(1) proc(0x8743aada8) hold(a)
> > locallock(0xb29c78) nLocks(-2147483648) nOwners(2) mOwners(8)
>
> Hah. Okay, that shows that we'd never have reproduced it with a small
> test case.
>
> > So I'm guessing this is not a bug? Just that for some reason 8.3.3 is
> > grabbing more locks than 8.2.6 does and I have to adjust my batch
> > processes to break things up into chunks of less than 10 million per
> > transaction... :-/
>
> Well, I'm wondering exactly why 8.3 is taking more locks than 8.2 does,
> because AFAICS it shouldn't. Could you extract a complete example of
> what your code does per tuple? I would like to run a small test case
> and just see where the LockAcquire calls come from in each version.
> (Or, if you prefer, you can get out gdb and do the legwork yourself...)
> It's true that breaking up the transaction is likely to be the best
> short-term answer for you, but I think there might be a bug here.
>
> In any case, now that we know that nLocks overflow is actually possible
> within real-world transaction lengths, it'd behoove us to do something
> about that in 8.4 or beyond.

Is this a TODO?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Michael Milligan <milli(at)acmeps(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-09-07 14:40:14
Message-ID: 13490.1220798414@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Tom Lane wrote:
>> In any case, now that we know that nLocks overflow is actually possible
>> within real-world transaction lengths, it'd behoove us to do something
>> about that in 8.4 or beyond.

> Is this a TODO?

Yes, although I'm still waiting for more info from Michael.

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Milligan <milli(at)acmeps(dot)com>, pgsql-bugs(at)postgreSQL(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-12-29 14:35:52
Message-ID: 20081229143552.GF4545@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:
> [ reincluding the mailing list ]
>
> Michael Milligan <milli(at)acmeps(dot)com> writes:
> > Okay, it reproduces and surprise surprise nLocks does overflow...
>
> > ERROR: lock AccessShareLock on object 16385/16467/0 is already held
> > lock(0x87408a028) id(16385,16467,0,0,0,1) grantMask(a) waitMask(0)
> > req(2,0,1,0,0,0,0,0)=3 grant(1,0,1,0,0,0,0,0)=2 wait(0)
> > proclock(0x8743a7508) lock(0x87408a028) method(1) proc(0x8743aada8) hold(a)
> > locallock(0xb29c78) nLocks(-2147483648) nOwners(2) mOwners(8)
>
> Hah. Okay, that shows that we'd never have reproduced it with a small
> test case.

This hasn't been fixed yet, has it? Do you have any ideas on how to
actually fix the problem? I wonder if it looks like enlarging nLocks,
or you're intending to attempt to reduce the number of locks taken.

A customer of ours started hitting this bug too, last week.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Michael Milligan <milli(at)acmeps(dot)com>, pgsql-bugs(at)postgreSQL(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-12-29 15:06:18
Message-ID: 7452.1230563178@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Tom Lane wrote:
>> Michael Milligan <milli(at)acmeps(dot)com> writes:
>>> Okay, it reproduces and surprise surprise nLocks does overflow...
>>
>> Hah. Okay, that shows that we'd never have reproduced it with a small
>> test case.

> This hasn't been fixed yet, has it?

Well, it has been addressed anyway...

2008-09-15 21:56 tgl

* src/include/storage/: lock.h (REL8_1_STABLE), lock.h
(REL8_3_STABLE), lock.h (REL8_0_STABLE), lock.h (REL8_2_STABLE),
lock.h: Widen the nLocks counts in local lock tables from int to
int64. This forestalls potential overflow when the same table (or
other object, but usually tables) is accessed by very many
successive queries within a single transaction. Per report from
Michael Milligan.

Back-patch to 8.0, which is as far back as the patch conveniently
applies. There have been no reports of overflow in pre-8.3
releases, but clearly the risk existed all along. (Michael's
report suggests that 8.3 may consume lock counts faster than prior
releases, but with no test case to look at it's hard to be sure
about that. Widening the counts seems a good future-proofing
measure in any event.)

> A customer of ours started hitting this bug too, last week.

Tut tut, customer not on latest point release?

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Milligan <milli(at)acmeps(dot)com>, pgsql-bugs(at)postgreSQL(dot)org
Subject: Re: PG 8.3.3 - ERROR: lock AccessShareLock on object 16385/16467/0 is already held
Date: 2008-12-29 15:16:24
Message-ID: 20081229151624.GH4545@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:

> 2008-09-15 21:56 tgl
>
> * src/include/storage/: lock.h (REL8_1_STABLE), lock.h
> (REL8_3_STABLE), lock.h (REL8_0_STABLE), lock.h (REL8_2_STABLE),
> lock.h: Widen the nLocks counts in local lock tables from int to
> int64.

Doh, I didn't see it because I only checked the change log for lock.c,
not lock.h :-(

> > A customer of ours started hitting this bug too, last week.
>
> Tut tut, customer not on latest point release?

Wouldn't be a first for them (we try hard, they don't necessarily heed
advice).

Thanks!

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.