Re: Stefan's bug (was: max_standby_delay considered harmful)

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Stefan's bug (was: max_standby_delay considered harmful)
Date: 2010-05-20 02:22:13
Message-ID: AANLkTin7RSrOlV-1NJxCtCJDniRU7YyOvP9sVxFGpAEl@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 19, 2010 at 10:03 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, May 19, 2010 at 8:49 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On Wed, 2010-05-19 at 08:21 -0400, Tom Lane wrote:
>>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> > On Wed, May 19, 2010 at 1:47 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> >> Yes, but I prefer XLogCtl->SharedRecoveryInProgress, which is almost the
>>> >> same indicator as the boolean you suggested. Thoughts?
>>>
>>> > It feels cleaner and simpler to me to use the information that the
>>> > postmaster already collects rather than having it take locks and check
>>> > shared memory, but I might be wrong.  Why do you prefer doing it that
>>> > way?
>>>
>>> The postmaster must absolutely not take locks (once there are competing
>>> processes).  This is non-negotiable from a system robustness standpoint.
>>
>> Masao has not proposed this; in fact, his proposal was to deliberately
>> avoid doing so.
>>
>> I proposed using the state recorded in xlog.c rather than attempting to
>> duplicate that with a second boolean in postmaster because that seems
>> likely to be more buggy.
>
> Well then how are we reading XLogCtl?

In my patch, xlog.c reads XLogCtl directly, without taking any lock, since
no other processes should be running when CancelBackup() is called.

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 8975,8980 **** CancelBackup(void)
--- 8975,8987 ----
  {
      struct stat stat_buf;

+     /*
+      * During recovery, we don't rename the "backup_label" file since
+      * it might be required for subsequent recovery.
+      */
+     if (XLogCtl->SharedRecoveryInProgress)
+         return;
+
      /* if the file is not there, return */
      if (stat(BACKUP_LABEL_FILE, &stat_buf) < 0)
          return;
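
To make the control flow easier to try outside the backend, here is a
minimal standalone sketch of the same guard. It is not the patch itself:
SketchXLogCtl, CancelResult and cancel_backup_sketch() are hypothetical
names invented for illustration, the file names are passed in as
parameters, and the final rename step is only an assumption about what
the rest of CancelBackup() does. The flag is read with a plain load and no
lock, which is only reasonable under the stated assumption that no other
process is running at that point.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical stand-in for the shared control struct kept in xlog.c. */
typedef struct
{
    bool SharedRecoveryInProgress;
} SketchXLogCtl;

/* Hypothetical result codes so a caller can see which path was taken. */
typedef enum
{
    CANCEL_SKIPPED_IN_RECOVERY,
    CANCEL_NO_LABEL_FILE,
    CANCEL_RENAMED,
    CANCEL_RENAME_FAILED
} CancelResult;

static CancelResult
cancel_backup_sketch(const SketchXLogCtl *ctl,
                     const char *label, const char *label_old)
{
    struct stat stat_buf;

    /*
     * Plain unlocked read. Assumed safe only because no competing
     * process exists when this is called.
     */
    if (ctl->SharedRecoveryInProgress)
        return CANCEL_SKIPPED_IN_RECOVERY;

    /* if the file is not there, there is nothing to cancel */
    if (stat(label, &stat_buf) < 0)
        return CANCEL_NO_LABEL_FILE;

    /* assumed remainder of the function: move the label file aside */
    if (rename(label, label_old) != 0)
        return CANCEL_RENAME_FAILED;

    return CANCEL_RENAMED;
}
```

With the flag set, the function returns before touching the file system at
all, so the label file survives for subsequent recovery.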

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
