Re: bgworker crashed or not?

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Antonin Houska <antonin(dot)houska(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
Subject: Re: bgworker crashed or not?
Date: 2014-05-07 20:55:42
Message-ID: 536A9DCE.7060304@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 07/05/14 22:32, Robert Haas wrote:
> On Tue, May 6, 2014 at 8:25 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
> I've committed the portion of your patch which does this, with pretty
> extensive changes. I moved the function which resets the crash times
> to bgworker.c, renamed it, and made it just reset all of the crash
> times unconditionally; there didn't seem to be any point in skipping
> the irrelevant ones, so it seemed best to keep things simple. I also
> moved the call from reaper() where you had it to
> PostmasterStateMachine(), because the old placement did not work for
> me when I carried out the following steps:
>
> 1. Kill a background worker with a 60-second restart time using
> SIGTERM, so that it does exit(1).
> 2. Before it restarts, kill a regular backend with SIGQUIT.
>
> Let me know if you think I've got it wrong.
>

No I think it's fine, I didn't try that combination and just wanted to
put it as deep in the call as possible.

>>> (2) If a shmem-connected backend fails to release the deadman switch
>>> or exits with an exit code other than 0 or 1, we crash-and-restart. A
>>> non-shmem-connected backend never causes a crash-and-restart.
>>
>> +1
>
> I did this as a separate commit,
> e2ce9aa27bf20eff2d991d0267a15ea5f7024cd7, just moving the check for
> ReleasePostmasterChildSlot inside the if statement. Then I realized
> that was bogus, so I just pushed
> eee6cf1f337aa488a20e9111df446cdad770e645 to fix that. Hopefully it's
> OK now.

The fixed one looks ok to me.

>
>>> (3) When a background worker exits without triggering a
>>> crash-and-restart, an exit code of precisely 0 causes the worker to be
>>> unregistered; any other exit code has no special effect, so
>>> bgw_restart_time controls.
>>
>> +1
>
> This isn't done yet.
>

Unless I am missing something this change was included in every patch I
sent - setting rw->rw_terminate = true; in CleanupBackgroundWorker for
zero exit code + comment changes. Or do you have objections to this
approach?

Anyway missing parts attached.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
unregister-bgworker-on-exit0-v4.diff text/x-patch 3.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2014-05-07 21:00:57 Re: jsonb existence queries are misimplemented by jsonb_ops
Previous Message Jeff Janes 2014-05-07 20:51:57 Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers