Re: Timeout and Synch Rep

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Timeout and Synch Rep
Date: 2010-10-08 13:30:58
Message-ID: AANLkTikPjD6ji461ckcgfG859KuzNMeQ9TAauwPAeat3@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 8, 2010 at 4:50 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> In my effort to make the discussion around the design decisions of synch
> rep less opaque, I'm starting a separate thread about what has developed
> into one of the more contentious issues.
>
> I'm going to champion timeouts because I plan to use them.  In fact, I
> plan to deploy synch rep with a timeout if it's available within 2 weeks
> of 9.1 being released.  Without a timeout (i.e. "wait forever" is the
> only mode), that project will probably never use synch rep.
>
> Let me give you my use-case so that you can understand why I want a timeout.
>
> Client is a telecommunications service provider.  They have a primary
> server and a failover server for data updates.  They also have two async
> slaves on older machines for reporting purposes.   The failover
> currently does NOT accept any queries in order to keep it as current as
> possible.
>
> They would like the failover to be synchronous so that they can
> guarantee no data loss in the event of a master failure.  However, zero
> data loss is less important to them than uptime ... they have a five-nines
> SLA with their clients, and the hardware on the master is very good.
>
> So, if something happens to the standby, and it cannot return an ack in
> 30 seconds, they would like it to degrade to asynch mode.  At that
> point, they would also like to trigger a nagios alert which will wake up
> the sysadmin with flashing red lights.  Once he has resolved the
> problem, he would like to promote the now-asynch standby back to synch
> standby.
>
> Yes, this means that, in the event of a standby failure, they have a
> window where any failure on the master will mean data loss.  The user
> regards this risk as acceptable, given that both the master and the
> failover are located in the same data center in any case, so there is
> always a risk of a sufficient disaster wiping out all data back to the
> daily backup.

This explains very well why some systems require the timeout.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
