Bug in walreceiver

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Bug in walreceiver
Date: 2011-01-13 08:28:50
Message-ID: AANLkTiny0vRmAT+TFd09wh2b_9gNU1RvT_gkmCzZzH__@mail.gmail.com
Lists: pgsql-hackers

Hi,

When the master shuts down or crashes, there seems to be a
case where walreceiver exits without flushing WAL that it
has already written. This might lead the startup process to
replay unflushed WAL and break the write-ahead logging rule.

walreceiver.c
> /* Wait a while for data to arrive */
> if (walrcv_receive(NAPTIME_PER_CYCLE, &type, &buf, &len))
> {
>     /* Accept the received data, and process it */
>     XLogWalRcvProcessMsg(type, buf, len);
>
>     /* Receive any more data we can without sleeping */
>     while (walrcv_receive(0, &type, &buf, &len))
>         XLogWalRcvProcessMsg(type, buf, len);
>
>     /*
>      * If we've written some records, flush them to disk and let the
>      * startup process know about them.
>      */
>     XLogWalRcvFlush();
> }

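The problematic case happens when the second walrcv_receive
call (the one in the inner while loop) emits an ERROR. Control
then never reaches XLogWalRcvFlush(), so the WAL received by the
first walrcv_receive call has been written but is not guaranteed
to have been flushed yet.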

The attached patch ensures that all WAL received is flushed to
disk before walreceiver exits. This patch should be backported
to 9.0, I think.
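
To make the failure mode concrete, here is a minimal standalone sketch in C.
It is not the attached patch and does not use PostgreSQL's real structures:
every name prefixed with sketch_ or Sketch is hypothetical, and an error is
modelled as a flag, whereas the real walrcv_receive reports it through
ereport. The flush_on_exit parameter only illustrates the intended
behaviour of flushing whatever has been written before the receiver exits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the receiver's WAL positions. */
typedef struct {
    size_t written;   /* bytes written, but possibly not yet durable */
    size_t flushed;   /* bytes known to be flushed to disk */
} SketchWal;

/* Hypothetical connection that fails on a chosen receive call. */
typedef struct {
    const size_t *msgs;   /* sizes of the messages the master will send */
    size_t nmsgs;
    size_t next;          /* index of the next message to deliver */
    size_t fail_at;       /* receive call index that reports an error */
} SketchConn;

/* Models walrcv_receive: true if data arrived, *error set on failure. */
static bool sketch_receive(SketchConn *c, size_t *len, bool *error)
{
    *error = false;
    if (c->next == c->fail_at) {
        *error = true;            /* e.g. the master went away */
        return false;
    }
    if (c->next >= c->nmsgs)
        return false;             /* nothing more to read right now */
    *len = c->msgs[c->next++];
    return true;
}

static void sketch_write(SketchWal *w, size_t len) { w->written += len; }
static void sketch_flush(SketchWal *w) { w->flushed = w->written; }

/* Bytes the startup process could see that are not yet durable. */
static size_t sketch_unflushed(const SketchWal *w)
{
    assert(w->flushed <= w->written);
    return w->written - w->flushed;
}

/*
 * One pass of the receive loop quoted above. Returns false when the
 * receiver has to exit because of an error. With flush_on_exit set,
 * the exit path flushes first, which is the behaviour proposed here.
 */
static bool sketch_cycle(SketchConn *c, SketchWal *w, bool flush_on_exit)
{
    size_t len = 0;
    bool error = false;

    if (sketch_receive(c, &len, &error)) {
        sketch_write(w, len);

        /* Receive any more data we can without sleeping */
        while (sketch_receive(c, &len, &error))
            sketch_write(w, len);

        if (!error) {
            sketch_flush(w);      /* normal path: flush what we wrote */
            return true;
        }
    }

    if (error) {
        if (flush_on_exit)
            sketch_flush(w);      /* flush before exiting */
        return false;
    }
    return true;                  /* timed out with no data; keep looping */
}
```

With fail_at set to 1, the first receive succeeds and the second fails, so
calling sketch_cycle with flush_on_exit = false leaves sketch_unflushed()
greater than zero, while flush_on_exit = true leaves it at zero.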

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment: flush_before_walreceiver_exit_v1.patch (application/octet-stream, 498 bytes)
