Re: walreceiver is uninterruptible on win32

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Joe Conway <mail(at)joeconway(dot)com>
Subject: Re: walreceiver is uninterruptible on win32
Date: 2010-04-15 03:13:29
Message-ID: p2h3f0b79eb1004142013heb708f31v52d7f20220f4f1df@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 14, 2010 at 11:15 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> http://archives.postgresql.org/pgsql-hackers/2010-04/msg00077.php
>>> As for the code itself, don't we need a CHECK_FOR_INTERRUPTS in there
>>> for it to be actually useful?
>>
>> Since the backend version of select() receives any incoming signals
>> on Win32, CHECK_FOR_INTERRUPTS seems not to be needed in the loop,
>> at least in walreceiver. No? The patch doesn't put it in there, and
>> I was able to interrupt walreceiver executing libpqrcv_PQexec() when
>> I tested the patch on Win32.
>
> It will call the signal handler, yes. Normally, the signal handler
> just sets a flag somewhere, which needs to be checked with
> CHECK_FOR_INTERRUPTS.
>
> From how I read the walreceiver.c code (which I'm not that familiar
> with), the signal handlers call ProcessWalRcvInterrupts() which in
> turn has CHECK_FOR_INTERRUPTS in it, and this is where it ends up
> being called.

Yes. While establishing replication connection (i.e., executing
walrcv_connect function), the SIGTERM signal handler directly calls
ProcessWalRcvInterrupts() which does CHECK_FOR_INTERRUPTS() and
elog(FATAL).

>>> The patch also seems confused about whether it's intending to be a
>>> general-purpose solution or not.  You can maybe take some shortcuts
>>> if it's only going to be for walreceiver, but if you're going to put
>>> it in dblink it is *not* acceptable to take shortcuts like not
>>> processing errors completely.
>>
>> The patch still takes some shortcuts since we decided to postpone
>> the fix for dblink to 9.1 or later.
>
> Given those shortcuts, can't we simplify it even further like
> attached?

No, we need to repeat PQgetResult() at least two times. The first call
of it reads the RowDescription, DataRow and CommandComplete messages
and switches the state to PGASYNC_READY. The second one reads the
ReadyForQuery message and switches the state to PGASYNC_IDLE. So if we
don't repeat it, libpqrcv_PQexec() would end in a half-finished state.

> (If nothing else, your code did PQclear() on an
> uninitialized pointer - must've been pure luck that it worked)

PQclear(NULL) might be called in my patch, but this is not a problem
since PQclear() does nothing if the specified PGresult argument is NULL.

> Looking at the call-sites, there are bugs now - if PQexec() returns
> NULL, we don't deal with it. It also doesn't always free the result
> properly. I've added checks for that.

As Tom pointed out in another post, we don't need to treat the
"result is NULL" case as special. OTOH, as you pointed out, I
forgot calling PQclear() when the second call of libpqrcv_PQexec()
in libpqrcv_connect() fails. I added it to the patch. Thanks!

> Finally, I've updated some of the comments.

Thanks a lot! I applied that improvements to the patch.

I attached the revised patch.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment Content-Type Size
libpqrcv_PQexec_v3.patch application/octet-stream 5.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Smith 2010-04-15 03:22:54 Re: Rogue TODO list created
Previous Message Robert Haas 2010-04-15 02:55:31 Re: Thoughts on pg_hba.conf rejection