Re: Patch for fail-back without fresh backup

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Samrat Revagade <revagade(dot)samrat(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Date: 2013-10-08 09:37:02
Message-ID: CABOikdP=2ZfcFKaNFe2f_HJs60qkdaokbV0mmdBCpAxRAUKrXg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 8, 2013 at 2:33 PM, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com> wrote:

> On Fri, Oct 4, 2013 at 4:32 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> >
> I attached the v12 patch which have modified based on above suggestions.
>

There are still some parts of this design/patch which I am concerned about.

1. The design couples the synchronous standby and the failback safe standby
rather tightly. IIRC this is based on the feedback you received early on,
so my apologies for raising it again so late.
a. The GUC synchronous_standby_names is used to name synchronous as well as
failback safe standbys. I don't know if that will confuse users.
b. synchronous_commit's value will also control whether a sync/async
failback safe standby waits for remote write or flush. Is that reasonable?
Or should there be a different way to configure the failback safe standby's
WAL safety?

2. With the current design/implementation, a user can't configure a
synchronous and an async failback safe standby at the same time. I think we
discussed this earlier and there was agreement on this limitation. I just
wanted to get that confirmed again.

3. SyncRepReleaseWaiters() does not know whether it's waking up backends
waiting for sync rep or failback safe rep. Is that OK? For example, I
found that the elog() message the function emits to announce the next
takeover may look bad. Since changing synchronous_transfer requires a server
restart, we can teach SyncRepReleaseWaiters() to look at that parameter to
figure out whether the standby is a sync and/or failback safe standby.

4. The documentation still needs more work to clearly explain the use case.

5. Have we done any sort of stress testing of the patch? If there is a
bug, data corruption on the master can go unnoticed. So IMHO we need
many crash recovery tests to ensure that the patch is functionally correct.
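
To make point 3 a bit more concrete, here is a rough standalone sketch (not
code from the patch) of how the function could derive the standby's role
from synchronous_transfer before it builds its log message. The enum, its
three values, and the helper names are all hypothetical and only for
illustration; the one thing taken from the discussion is that the parameter
cannot change without a restart, so reading it at this point would be safe.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical modes for the synchronous_transfer parameter. The real
 * patch may name or split these differently.
 */
typedef enum
{
    XFER_COMMIT,        /* assumed: plain synchronous replication only */
    XFER_DATA_FLUSH,    /* assumed: failback safe replication only */
    XFER_ALL            /* assumed: both at once */
} SyncTransferMode;

/* Is the standby acting as a synchronous standby under this mode? */
static bool
standby_is_sync(SyncTransferMode mode)
{
    return mode == XFER_COMMIT || mode == XFER_ALL;
}

/* Is the standby acting as a failback safe standby under this mode? */
static bool
standby_is_failback_safe(SyncTransferMode mode)
{
    return mode == XFER_DATA_FLUSH || mode == XFER_ALL;
}

/*
 * Pick the wording for the takeover message so that it matches what the
 * standby actually is, instead of always saying "synchronous".
 */
static const char *
standby_role_label(SyncTransferMode mode)
{
    bool        sync = standby_is_sync(mode);
    bool        failback = standby_is_failback_safe(mode);

    /* Every mode in this sketch implies at least one of the two roles. */
    assert(sync || failback);

    if (sync && failback)
        return "synchronous and failback safe";
    if (sync)
        return "synchronous";
    return "failback safe";
}
```

SyncRepReleaseWaiters() could then pass the returned label into its
existing message rather than hard-coding the word "synchronous".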

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
