Re: Use a signal to trigger a memory context dump?

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Use a signal to trigger a memory context dump?
Date: 2014-06-24 05:21:53
Message-ID: 20140624052153.GA1241113@tornado.leadboat.com
Lists: pgsql-hackers

+1 for having an API better than GDB to make a process emit a memory usage
dump. This is my top non-crash reason for using GDB in production.

On Mon, Jun 23, 2014 at 07:21:22PM +0200, Andres Freund wrote:
> On 2014-06-23 10:07:36 -0700, Tom Lane wrote:
> > Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > > I wonder if it'd make sense to allow a signal to trigger a memory
> > > context dump? I and others more than once had the need to examine memory
> > > usage on production systems and using gdb isn't always realistic.
> > > I wonder if we could install a signal handler for some unused signal
> > > (e.g. SIGPWR) to dump memory.

SIGPWR is not widely available. Apart from SIGUSR1 and SIGUSR2, using a
portable signal risks colliding with the standard use thereof.

> > > I'd also considered adding a SQL function that uses the SIGUSR1 signal
> > > multiplexing for the purpose but that's not necessarily nice if you have
> > > to investigate while SQL access isn't yet possible. There's also the
> > > problem that not all possibly interesting processes use the sigusr1
> > > signal multiplexing.

I don't know whether cases where SQL access is unavailable are worth
supporting. If they are, one way to cover them without leaning on unportable
or already-used signals is to define SIGUSR2 as a second multiplexer that uses
files instead of shared memory. You'd send the signal with something like
this:

: >$PGDATA/procsig/$targetpid.memdump
kill -USR2 $targetpid

(This would probably require first converting the existing autovacuum use of
SIGUSR2 to the shared memory procsig mechanism.)
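
For illustration only, the receiving side of such a file-based multiplexer
might look roughly like the sketch below. Nothing here is existing PostgreSQL
code: the function names, the flag, and the "<pid>.memdump" lookup are all
hypothetical, and the actual memory context dump is replaced by a message on
stderr. The handler only sets a flag; the request file is examined later from
a safe point, in the spirit of a CHECK_FOR_INTERRUPTS-style check.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical flag: set by the handler, examined only at a safe point. */
static volatile sig_atomic_t usr2_pending = 0;

/* Signal handler: do nothing except record that SIGUSR2 arrived. */
static void
handle_sigusr2(int signo)
{
    (void) signo;
    usr2_pending = 1;
}

/* Install the SIGUSR2 handler; returns 0 on success, -1 on failure. */
static int
install_usr2_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = handle_sigusr2;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGUSR2, &sa, NULL);
}

/*
 * Hypothetical safe-point check.  If SIGUSR2 has arrived, look for
 * "<procsig_dir>/<pid>.memdump" and consume it by unlinking it.
 * Returns 1 if a memdump request was found and consumed, 0 otherwise.
 */
static int
process_file_requests(const char *procsig_dir, pid_t pid)
{
    char path[1024];

    assert(procsig_dir != NULL);

    if (!usr2_pending)
        return 0;
    usr2_pending = 0;

    snprintf(path, sizeof(path), "%s/%ld.memdump", procsig_dir, (long) pid);
    if (unlink(path) != 0)
        return 0;       /* no request file; the signal meant something else */

    /* A real backend would dump its memory contexts here (assumption). */
    fprintf(stderr, "memory dump requested for pid %ld\n", (long) pid);
    return 1;
}
```

Unlinking the file both tests for the request and consumes it, so a repeated
signal without a new file does nothing. Other request types could be added as
further file suffixes checked in the same function.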

> > The closest approximation that I think would be reasonable is to
> > set a flag that would be noticed by the next CHECK_FOR_INTERRUPTS
> > macro. So you're already buying into the assumption that the process
> > executes CHECK_FOR_INTERRUPTS fairly often. Which probably means
> > that assuming it's using the standard sigusr1 handler isn't a big
> > extra limitation.

If it's acceptable to require SQL access and exclude would-be target processes
that detach from shared memory, I favor an approach using the shared memory
SIGUSR1 multiplexer. Bringing all the processes that do use shared memory
into agreement about the use of SIGUSR1 feels like a valuable step forward.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
