Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)
Date: 2011-10-15 09:31:35
Message-ID: 956A880E-A75E-42DE-9C2E-21FB542E04EE@phlo.org
Lists: pgsql-hackers

On Oct11, 2011, at 09:21 , Magnus Hagander wrote:
> On Tue, Oct 11, 2011 at 03:29, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>> On Oct10, 2011, at 21:25 , Magnus Hagander wrote:
>>> On Thu, Oct 6, 2011 at 23:46, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>>>> It'd be nice to generally terminate a backend if the client vanishes, but so
>>>> far I haven't had any bright ideas. Using FASYNC and F_SETOWN unfortunately
>>>> sends a signal *every time* the fd becomes readable or writeable, not only on
>>>> EOF. Doing select() in CHECK_FOR_INTERRUPTS seems far too expensive. We could
>>>> make the postmaster keep the fds around even after forking a backend, and
>>>> make it watch for broken connections using select(). But with a large max_backends
>>>> setting, we'd risk running out of fds in the postmaster...
>>>
>>> Ugh. Yeah. But at least catching it and terminating it when we *do*
>>> notice it's down would certainly make sense...
>>
>> I'll try to put together a patch that sets a flag if we discover a broken
>> connection in pq_flush, and tests that flag in CHECK_FOR_INTERRUPTS. Unless you
>> wanna, of course.
>
> Please do, I won't have time to even think about it until after
> pgconf.eu anyway ;)

Ok, here's a first cut.

I've based this on how query cancellation due to recovery conflicts works:
internal_flush() sets QueryCancelPending and ClientConnectionLostPending.

If QueryCancelPending is set, CHECK_FOR_INTERRUPTS checks
ClientConnectionLostPending, and if it's set it does ereport(FATAL).
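To make the flow concrete, here is a rough standalone sketch of the two-flag
idea. It is not the attached patch: only the names QueryCancelPending and
ClientConnectionLostPending come from the description above. The sketch_*
functions, fatal_reported and report_fatal_connection_lost() are hypothetical
stand-ins for internal_flush(), CHECK_FOR_INTERRUPTS and ereport(FATAL), and
clearing the flags once they are consumed is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Flag names as used in the message; everything else here is a stand-in. */
volatile sig_atomic_t QueryCancelPending = 0;
volatile sig_atomic_t ClientConnectionLostPending = 0;

/* Hypothetical replacement for ereport(FATAL): record the event, don't exit. */
bool fatal_reported = false;

void report_fatal_connection_lost(void)
{
    fatal_reported = true;
}

/*
 * Stand-in for internal_flush(): try to write the whole buffer. On a hard
 * write failure it does not error out itself; it only raises both flags so
 * the problem is handled at the next safe interrupt check.
 */
int sketch_internal_flush(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);

    while (len > 0)
    {
        ssize_t n = write(fd, buf, len);

        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal, just retry */

        if (n <= 0)
        {
            ClientConnectionLostPending = 1;
            QueryCancelPending = 1;
            return -1;
        }

        buf += n;
        len -= (size_t) n;
    }
    return 0;
}

/*
 * Stand-in for CHECK_FOR_INTERRUPTS: the connection-lost flag is only
 * examined when a cancel is pending, as described above. Clearing the
 * flags here is an assumption of this sketch.
 */
void sketch_check_for_interrupts(void)
{
    if (QueryCancelPending)
    {
        QueryCancelPending = 0;
        if (ClientConnectionLostPending)
        {
            ClientConnectionLostPending = 0;
            report_fatal_connection_lost();
        }
        /* an ordinary query cancel would be handled here */
    }
}
```

Calling sketch_internal_flush() on an invalid descriptor (for example -1) makes
write() fail, so both flags are raised, and the next call to
sketch_check_for_interrupts() reports the lost connection.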

I've only done light testing so far - basically the only case I've tested is
killing pg_basebackup while it's waiting for all required WAL to be archived.

best regards,
Florian Pflug

Attachment Content-Type Size
pg.discon_cancel.v1.patch application/octet-stream 2.7 KB
