Re: Re: Proposed Windows-specific change: Enable crash dumps (like core files)

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Dave Page <dpage(at)pgadmin(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Proposed Windows-specific change: Enable crash dumps (like core files)
Date: 2010-11-22 11:37:39
Message-ID: AANLkTikUvW2dKWZhd28CVKf1NJUF7C4X968_Ari7kDMm@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 5, 2010 at 17:21, Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> wrote:
> OK, it's pretty much ready for proper testing now. If a few people are happy
> with the results and think it's a good idea I'll chuck it in the commitfest
> app.
>
> As built, the crash dump handler works with a stock PostgreSQL 9.0 (release
> build) as shipped in EDB's installer. Just drop crashdump.dll in lib\,
> optionally pop the dbghelp.dll redist in bin\, add 'crashdump' to
> shared_preload_libraries, and crash some backends however you feel like
> doing so.
>
> The current build of crashdump.dll for release versions of PostgreSQL 9.0 on
> 32-bit Windows is here:
>
>  http://www.postnewspapers.com.au/~craig/webfiles/crashdump.dll
>
> If folks are happy with how this works, all it needs is:
>
> - Consideration of whether elog should be used or not. I'm inclined to
>  suggest using elog to tell the user about the dump, but only after
>  the crash dump has been written successfully.
>
> - Comments on whether this should be left as a loadable module, or
>  integerated into core so it loads early in backend startup. The latter
>  will permit crash dumping of early backend startup problems, and will
>  have tiny overhead because there's no DLL to find and load. OTOH, it's
>  harder to pull out if somehow something breaks.
>
>  I'd want to start with loadable module in shared_preload_libraries
>  and if that proves useful, only then consider integrating in core.
>  I'm way too bad a programmer to want my code anywhere near Pg's core
>  without plenty of real world testing.
>
> - (Maybe) a method to configure the crash dump type. I'm not too sure
>  it's necessary given the set of dump flags I've landed up with,
>  so I'd leave this be unless it proves to be necessary in real-world
>  testing.
>
Finally getting to looking at this one - sorry about the very long delay.

I agree with Heikki's earlier comment that it's better to have this
included in the backend - but that's obviously not going to happen for
already-released versions. I'd therefor advocate a contrib module for
existing versions, and then in core for 9.1+.

We should then have an option to turn it off (on by default). But we
don't need to pay the overhead on every backend startup - we could
just map the value into the parameter shared memory block, and require
a full postmaster restart in order to change it.

Do we want to backpatch it into contrib/? Adding a new module there
seems kind of wrong - probably better to keep the source separate and
just publish the DLL files for people who do debugging?

Looking at the code:
* Why do we need to look at differnt versions of dbghelp.dll? Can't we
always use the one with Windows? And in that case, can't we just use
compile-time linking, do we need to bother with DLL loading at all?

* Per your comment about using elog() - a good idea in that case is to
use write_stderr(). That will send the output to stderr if there is
one, and otherwise the eventlog (when running as service).

* And yes, in the crash handler, we should *definitely* not use elog().

* On Unix, the core file is dropped in the database directory, we
don't have a separate directory for crashdumps. If we want to be
consistent, we should do that here too. I do think that storing them
in a directory like "crashdumps" is better, but I just wanted to raise
the comment.

* However, when storing it in crashdumps, I think the code would need
to create that directory if it does not exist, doesn't it?

* Right now, we overwrite old crashdumps. It is well known that PIDs
are recycled pretty quickly on Windows - should we perhaps dump as
postgres-<pid>-<sequence>.mdmp when there is a filename conflict?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2010-11-22 11:54:55 Re: Latches with weak memory ordering (Re: max_wal_senders must die)
Previous Message Valentine Gogichashvili 2010-11-22 11:21:28 Re: final patch - plpgsql: for-in-array