Re: Postgres DB crashing

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alan Hodgson <ahodgson(at)simkin(dot)ca>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Postgres DB crashing
Date: 2013-06-23 17:28:05
Message-ID: 19003.1372008485@sss.pgh.pa.us
Lists: pgadmin-support pgsql-general

Alan Hodgson <ahodgson(at)simkin(dot)ca> writes:
> On Thursday, June 20, 2013 07:52:21 AM Merlin Moncure wrote:
>> OP needs to explore use of connection pooler, in particular pgbouncer.
>> Anyways none of this explains why the server is actually crashing.

> It might be hitting file descriptor limits. I didn't dig into the earlier part
> of this thread much, though.

The disturbing part of the original report was this:

>>> 2013-06-11 16:54:14 GMT [22226]: [1-1]PANIC: stuck spinlock (0x2aaab54279d4) detected at bufmgr.c:1239

which implies that something was holding a buffer header spinlock for an
unreasonably long time (roughly 2 minutes, when no operation that holds
such a lock should take more than a few nanoseconds). But if you were
running a load test that absolutely mashed the machine into the ground,
as the OP seems to have been doing, maybe that could happen --- perhaps
some unlucky backend got interrupted and then swapped out during the
narrow window where it held such a lock, and the machine was too
overloaded to give that process any more cycles for a very long time.

As has been noted already, this test setup seems to have overloaded the
machine by at least two orders of magnitude compared to useful settings
for the available hardware. The "stuck spinlock" error would only come
out if a lock had been held for quite a lot more than two orders of
magnitude more time than expected, though. So I'm not entirely sure
that I buy this theory; but it's hard to see another one. (I discount
the obvious other theory that there's a software bug, because I just
looked through 9.2's bufmgr.c very carefully, and there are no code
paths where it fails to release a buffer header lock within a very few
instructions from where it took the lock.)
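
[Illustration added in editing, not part of the original message.] The
orders-of-magnitude argument above can be checked with a rough sketch. This is
a simplified model of a waiter that sleeps with randomized, growing delays and
gives up after a fixed number of sleeps. The constants, the function names,
and the 10 ns hold time are assumptions chosen for illustration; they are not
taken from the PostgreSQL source.

```python
import math
import random


def simulated_wait_seconds(num_delays=1000, min_delay_ms=1, max_delay_ms=1000,
                           rng=None):
    """Total time slept before a waiter would report a stuck lock.

    Hypothetical model: the delay starts at min_delay_ms, grows by a random
    factor between 1x and 2x after each sleep, and wraps back to min_delay_ms
    once it exceeds max_delay_ms. All constants are assumed values.
    """
    rng = rng or random.Random(0)
    delay_ms = min_delay_ms
    total_ms = 0
    for _ in range(num_delays):
        total_ms += delay_ms
        # Grow the delay by a random fraction of itself (rounded).
        delay_ms += int(delay_ms * rng.random() + 0.5)
        if delay_ms > max_delay_ms:
            delay_ms = min_delay_ms
    return total_ms / 1000.0


def orders_of_magnitude(observed_seconds, expected_seconds):
    """How many powers of ten separate the observed and expected hold times."""
    return math.log10(observed_seconds / expected_seconds)


if __name__ == "__main__":
    print("modelled wait before giving up (s):", simulated_wait_seconds())
    # Roughly 2 minutes observed versus an assumed 10 ns expected hold time.
    print("orders of magnitude:", orders_of_magnitude(120.0, 10e-9))
```

With those assumed figures the gap is about ten orders of magnitude, far more
than the roughly two orders of magnitude of overload, which is why overload
alone is not a fully satisfying explanation.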

regards, tom lane
