Re: Inconsistent DB data in Streaming Replication

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Fujii Masao'" <masao(dot)fujii(at)gmail(dot)com>, <sthomas(at)optionshouse(dot)com>
Cc: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'Samrat Revagade'" <revagade(dot)samrat(at)gmail(dot)com>, "'Hannu Krosing'" <hannu(at)2ndquadrant(dot)com>, "'PostgreSQL-development'" <pgsql-hackers(at)postgresql(dot)org>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'Andres Freund'" <andres(at)2ndquadrant(dot)com>
Subject: Re: Inconsistent DB data in Streaming Replication
Date: 2013-04-11 07:09:01
Message-ID: 004c01ce3683$729276a0$57b763e0$@kapila@huawei.com
Lists: pgsql-hackers

On Wednesday, April 10, 2013 10:31 PM Fujii Masao wrote:
> On Thu, Apr 11, 2013 at 1:44 AM, Shaun Thomas
> <sthomas(at)optionshouse(dot)com> wrote:
> > On 04/10/2013 11:40 AM, Fujii Masao wrote:
> >
> >> Strange. If this is really true, shared disk failover solution is
> >> fundamentally broken because the standby needs to start up with the
> >> shared "corrupted" database at the failover.
> >
> >
> > How so? Shared disk doesn't use replication. The point I was trying to make
> > is that replication requires synchronization between two disparate servers,
> > and verifying they have exactly the same data is a non-trivial exercise.
> > Even a single transaction after a failover (effectively) negates the old
> > server because there's no easy "catch up" mechanism yet.
>
> Hmm... ISTM what Samrat is proposing can resolve the problem. That is,
> if we can think that any data page which has not been replicated to the
> standby is not written in the master, new standby (i.e., old master) can
> safely catch up with new master (i.e., old standby). In this approach, of
> course, new standby might have some WAL records which new master doesn't
> have, so before starting up new standby, we need to remove all the WAL
> files in new standby and retrieve any WAL files from new master. But,
> what's the problem in his approach?

Consider the case where the old master crashed while flushing a data page.
The page on disk may then be torn, so you would need a full-page image from
the new master. However, a checkpoint on the new master may already have
removed or recycled the WAL files from that timeline, in which case it will
be difficult to get the full-page image. The user could fall back on the WAL
archive for that, but I don't think it will be straightforward.
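
To make the concern concrete, here is a rough sketch in Python (a toy model,
not PostgreSQL code) of the lookup the old master would effectively have to
do. It assumes the default 16 MB WAL segment size and the 9.3-style segment
file naming. The names locate_fpi_segment, fpi_lsn, pg_xlog_files and
archive_files are hypothetical and exist only for this illustration.

```python
from enum import Enum

# Assumptions: default 16 MB WAL segments and 9.3-style segment file naming.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE  # 256 with 16 MB segments


def wal_segment_name(timeline, lsn):
    """Return the 24-hex-digit name of the segment containing byte position lsn."""
    segno = lsn // WAL_SEGMENT_SIZE
    return "%08X%08X%08X" % (
        timeline,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )


class FpiSource(Enum):
    PG_XLOG = "still present in the new master's pg_xlog"
    ARCHIVE = "only available from the WAL archive"
    UNAVAILABLE = "recycled and not archived"


def locate_fpi_segment(timeline, fpi_lsn, pg_xlog_files, archive_files):
    """Hypothetical helper: find where the segment holding the needed
    full-page image can still be fetched from.

    fpi_lsn is the (assumed known) position of the WAL record that carries
    the full-page image; the two *_files arguments are sets of file names.
    """
    name = wal_segment_name(timeline, fpi_lsn)
    if name in pg_xlog_files:
        return name, FpiSource.PG_XLOG
    if name in archive_files:
        return name, FpiSource.ARCHIVE
    return name, FpiSource.UNAVAILABLE


if __name__ == "__main__":
    # A checkpoint on the new master has already recycled segment ...02.
    pg_xlog = {"000000010000000000000005", "000000010000000000000006"}
    archive = {"000000010000000000000002"}
    name, source = locate_fpi_segment(1, 0x2000000, pg_xlog, archive)
    print(name, "->", source.value)
```

The UNAVAILABLE and ARCHIVE branches are the ones I am worried about: once
the segment is gone from pg_xlog, the old master depends on an archive being
configured and reachable.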

One more point: what will the new behavior be when there are two transactions,
one with synchronous_commit = off and the other with it on?

With Regards,
Amit Kapila.
