Re: warning message in standby

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: warning message in standby
Date: 2010-06-15 03:35:40
Message-ID: AANLkTimUMIsCEqnQttugwbqReo7qugQIy8kH-J_o_5lc@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 14, 2010 at 10:35 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Tue, Jun 15, 2010 at 12:09 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> The testing that I have been doing while we've been discussing this
>> reveals that you are correct.  I set up an HS/SR master and slave
>> (running on the same machine), ran pgbench on the master, and then
>> started randomly sending SIGSEGV to one of the master's backends.  It
>> seems that complaints about the WAL are possible on both master and
>> slave.  Here are a couple from the slave:
>>
>> LOG:  unexpected pageaddr 0/89B7A000 in log file 0, segment 152, offset 12034048
>> WARNING:  there is no contrecord flag in log file 0, segment 136, offset 2523136
>> LOG:  invalid magic number 0000 in log file 0, segment 136, offset 2531328
>>
>> The slave reconnects and then things get better.  So I think your idea
>> of retrying once and then panicking is probably best.
>
> AFAIR, in the previous discussion, some people thought that it's better to
> keep the standby open for read-only queries even if an error is found.
> Panicking would be undesirable for them.
>
> On the other hand, I like immediate-panicking. And I don't want the standby
> to retry reconnecting to the master infinitely.
>
> To cover all the use cases, how about introducing a new parameter specifying
> the maximum number of times to retry reconnecting? If we like the
> retry-once-then-panicking idea, we can set the parameter to one. If we'd like
> to keep the standby open indefinitely, we can set it to a very large value
> (or -1 meaning infinite retrying). If we think that immediate-panicking is
> the best, we can set it to zero. Thoughts?

I think that would be overkill. If the user wants to use the slave to
answer queries even though recovery can't continue, they can always
remove recovery.conf.

It seems pretty optimistic to assume that this will be useful very
often, though. Most people will need their slave to be up to date
(otherwise why bother with Hot Standby in the first place?). I feel
pretty safe in predicting that if the master and slave get out of
sync, most administrators are going to want to take a new base backup
immediately.
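
For concreteness, the retry-limit semantics proposed in the quoted message (0 = panic immediately, 1 = retry once and then panic, -1 = retry forever) could be sketched roughly as below. This is only an illustration of the decision rule, not PostgreSQL code: the names `max_retries`, `retries_done`, `Action`, and `next_action` are hypothetical, and no such parameter exists in the server as discussed here.

```python
from enum import Enum


class Action(Enum):
    RETRY = "retry"   # reconnect to the master and try streaming again
    PANIC = "panic"   # give up and shut the standby down


def next_action(retries_done: int, max_retries: int) -> Action:
    """Decide what the standby does after hitting a WAL error.

    retries_done: reconnect attempts already made for this error.
    max_retries:  hypothetical setting; -1 means retry without limit,
                  0 means panic immediately, N means retry at most N times.
    """
    if max_retries < 0:
        # Negative value: keep retrying indefinitely.
        return Action.RETRY
    if retries_done < max_retries:
        return Action.RETRY
    return Action.PANIC
```

With `max_retries = 1` this gives the retry-once-then-panic behavior: the first error yields `Action.RETRY`, and a second error after that retry yields `Action.PANIC`.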

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
