Re: Patch for fail-back without fresh backup

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Sawada Masahiko'" <sawada(dot)mshk(at)gmail(dot)com>
Cc: "'Amit Langote'" <amitlangote09(at)gmail(dot)com>, "'Samrat Revagade'" <revagade(dot)samrat(at)gmail(dot)com>, "'PostgreSQL-development'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch for fail-back without fresh backup
Date: 2013-07-02 05:45:33
Message-ID: 01d201ce76e7$5f49d440$1ddd7cc0$@kapila@huawei.com
Lists: pgsql-hackers

On Friday, June 28, 2013 10:41 AM Sawada Masahiko wrote:
> On Wed, Jun 26, 2013 at 1:40 PM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
> wrote:
> > On Tuesday, June 25, 2013 10:23 AM Amit Langote wrote:
> >> Hi,
> >>
> >> >
> >> >> So our proposal on this problem is that we must ensure that master should
> >> >> not make any file system level changes without confirming that the
> >> >> corresponding WAL record is replicated to the standby.
> >> >
> >> > How will you take care of extra WAL on old master during recovery. If it
> >> > plays the WAL which has not reached new-master, it can be a problem.
> >> >
> >>
> >> I am trying to understand how there would be extra WAL on old master
> >> that it would replay and cause inconsistency. Consider how I am
> >> picturing it and correct me if I am wrong.
> >>
> >> 1) Master crashes. So a failback standby becomes new master forking the
> >> WAL.
> >> 2) Old master is restarted as a standby (now with this patch, without
> >> a new base backup).
> >> 3) It would try to replay all the WAL it has available and later
> >> connect to the new master also following the timeline switch (the
> >> switch might happen using archived WAL and timeline history file OR
> >> the new switch-over-streaming-replication-connection as of 9.3,
> >> right?)
> >>
> >> * in (3), when the new standby/old master is replaying WAL, from where
> >> is it picking the WAL?
> > Yes, this is the point which can lead to inconsistency, new standby/old master
> > will replay WAL after the last successful checkpoint, for which he get info from
> > control file. It is picking WAL from the location where it was
> > logged when it was active (pg_xlog).
> >
> >> Does it first replay all the WAL in pg_xlog
> >> before archive? Should we make it check for a timeline history file in
> >> archive before it starts replaying any WAL?
> >
> > I have really not thought what is best solution for problem.
> >
> >> * And, would the new master, before forking the WAL, replay all the
> >> WAL that is necessary to come to state (of data directory) that the
> >> old master was just before it crashed?
> >
> > I don't think new master has any correlation with old master's data directory,
> > Rather it will replay the WAL it has received/flushed before start acting as master.
> when old master fail over, WAL which ahead of new master might be
> broken data. so that when user want to dump from old master, there is
> possible to fail dump.
> it is just idea, we extend parameter which is used in recovery.conf
> like 'follow_master_force'. this parameter accepts 'on' and 'off', is
> effective only when standby_mode is set to on.
>
> if both parameters 'follow_master_force' and 'standby_mode' is set to
> 'on',
> 1. when standby server starts and starts to recovery, standby server
> skip to apply WAL which is in pg_xlog, and request WAL from latest
> checkpoint LSN to master server.
> 2. master server receives LSN which is standby server latest
> checkpoint, and compare between LSN of standby and LSN of master
> latest checkpoint. if those LSN match, master will send WAL from
> latest checkpoint LSN. if not, master will inform standby that failed.
> 3. standby will fork WAL, and apply WAL which is sent from master
> continuity.
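
For readers following the thread, the three steps quoted above can be sketched roughly as below. This is only an illustration in Python, not PostgreSQL code: `follow_master_force` is the parameter proposed in the quoted mail (it does not exist in any release), and `WalRecord`, `parse_lsn`, `master_handle_request` and `standby_start` are hypothetical names invented for the sketch. What should happen when only one of the two parameters is 'on' is not stated in the proposal; the sketch assumes the standby then simply replays its local pg_xlog.

```python
from dataclasses import dataclass
from typing import List, Optional


def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'hi/lo' in hex (e.g. '0/3000028') to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


@dataclass
class WalRecord:
    lsn: int       # start position of the record
    payload: str   # stand-in for the record contents


def master_handle_request(standby_ckpt: int, master_ckpt: int,
                          master_wal: List[WalRecord]) -> Optional[List[WalRecord]]:
    """Step 2: the master compares the two latest-checkpoint LSNs.

    Returns the WAL from the checkpoint onwards if they match, or None to
    tell the standby that the request failed.
    """
    if standby_ckpt != master_ckpt:
        return None
    return [rec for rec in master_wal if rec.lsn >= master_ckpt]


def standby_start(follow_master_force: bool, standby_mode: bool,
                  local_wal: List[WalRecord], standby_ckpt: int,
                  master_ckpt: int, master_wal: List[WalRecord]) -> List[WalRecord]:
    """Steps 1 and 3: decide which WAL the restarted old master applies."""
    if not (follow_master_force and standby_mode):
        # Assumption: without both settings, replay local pg_xlog as usual.
        return list(local_wal)
    # Step 1: skip local pg_xlog and ask the master, starting at our checkpoint.
    streamed = master_handle_request(standby_ckpt, master_ckpt, master_wal)
    if streamed is None:
        raise RuntimeError("master rejected request: checkpoint LSNs differ")
    # Step 3: apply only what the master sent.
    return streamed
```

In this sketch the WAL that the old master wrote but never replicated is never applied when both settings are on, which is the point of the proposal; whether a checkpoint-LSN equality test is a sufficient safety check is exactly what the rest of the thread questions.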

Please consider whether this solution has the same problem as the one mentioned by Robert Haas in the mail below:
http://www.postgresql.org/message-id/CA+TgmoY4j+p7JY69ry8GpOSMMdZNYqU6dtiONPrcxaVG+SPByg@mail.gmail.com

> in this approach, user who want to dump from old master will set 'off'
> to follow_master_force and standby_mode, and gets the dump of old
> master after master started. OTOH, user who want to starts replication
> force will set 'on' to both parameter.

I think that before going into a solution for this problem, others should confirm whether such a problem
needs to be resolved as part of this patch.

I have seen that Simon Riggs is a reviewer of this patch, and he hasn't mentioned his views on this problem.
So I think it's not worth inventing a solution.

Rather, I think that once all other issues are resolved for this patch, we can check with the committer at the end
whether he thinks this problem needs to be solved as a separate patch.

With Regards,
Amit Kapila.
