Re: warning message in standby

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: warning message in standby
Date: 2010-06-11 13:46:24
Message-ID: AANLkTimjHH6D5ajVJFqdyaicEZmey7F6yyJXMhp05NLI@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 11, 2010 at 9:43 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Thu, 2010-06-10 at 19:01 +0300, Heikki Linnakangas wrote:
>> >
>> > What "warning message" are we talking about?  All the error cases I can
>> > think of in WAL-application are ERROR, or likely even PANIC.
>>
>> We're talking about a corrupt record (incorrect CRC, incorrect backlink
>> etc.), not errors within redo functions. During crash recovery, a
>> corrupt record means you've reached end of WAL. In standby mode, when
>> streaming WAL from master, that shouldn't happen, and it's not clear
>> what to do if it does. PANIC is not a good idea, at least if the server
>> uses hot standby, because that only makes the situation worse from an
>> availability point of view. So we log the error as a WARNING, and keep
>> retrying. It's unlikely that the problem will just go away, but we keep
>> retrying anyway in the hope that it does. However, it seems that we're
>> too aggressive with the retries.
>
> If my streaming replication stops working, I want to know about it as
> soon as possible. WARNING just doesn't cut it.
>
> This needs some better thought.
>
> If we PANIC, then surely it will PANIC again when we restart unless we
> do something. So we can't do that. But we need to do something better
> than
>
> WARNING there is a bug that will likely cause major data loss
> HINT you'll be sacked if you miss this message

+1. I was making this same argument (less eloquently) upthread.

I particularly like the errhint().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
