Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave
Date: 2013-01-17 16:42:25
Message-ID: 20130117164225.GC22844@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2013-01-17 18:33:42 +0200, Heikki Linnakangas wrote:
> On 17.01.2013 17:42, Andres Freund wrote:
> >Ok, the attached patch seems to fix a) and b). c) above is bogus, as
> >explained in a comment in the patch. I also noticed that the TLI check
> >didn't mark the last source as failed.
>
> This looks fragile:
>
> > /*
> > * We only end up here without a message when XLogPageRead() failed
> > * - in that case we already logged something.
> > * In StandbyMode that only happens if we have been triggered, so
> > * we shouldn't loop anymore in that case.
> > */
> > if (errormsg == NULL)
> > break;
>
> I don't like relying on the presence of an error message to control logic
> like that. Should we throw in an explicit CheckForStandbyTrigger() check in
> the condition of that loop?

I agree, I wasn't too happy about that either. But in some sense its
only propagating state from XLogReadPage which already has dealt with
the error and decided its ok.
Its the solution closest to what happened in the old implementation,
thats why I thought it would be halfway to acceptable.

Adding the CheckForStandbyTrigger() in the condition would mean
promotion would happen before all the available records are processed
and it would increase the amount of stat()s tremendously.
So I don't really like that either.

Don't really have a good idea :(

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2013-01-17 16:44:55 Re: Latex longtable format
Previous Message Heikki Linnakangas 2013-01-17 16:33:42 Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave