Re: postmaster recovery and automatic restart suppression

From: Greg Stark <greg(dot)stark(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Harald (NSN - DE/Munich) Kolb" <harald(dot)kolb(at)nsn(dot)com>, ext Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Greg Stark <stark(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, "Thoralf (NSN - FI/Helsinki) Czichy" <thoralf(dot)czichy(at)nsn(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postmaster recovery and automatic restart suppression
Date: 2009-06-09 20:03:01
Message-ID: CD358425-972D-4B23-940F-89999230CB48@enterprisedb.com
Lists: pgsql-hackers

Not really, since once you fail over you may as well stop the rebuild:
you'll have to restore the whole database anyway. Moreover, wouldn't
that have to be a manual decision?

The closest thing to a use case I can come up with would be if you run
a very large cluster with hundreds of read-only replicas. If one has
problems, you would rather the load balancer notice and take it out of
rotation immediately than have it flap and continue to cause problems.

Even there it would be dicey, since a software bug could easily cause
all your replicas to start misbehaving simultaneously. It would suck
to see them all shut down one by one...

--
Greg

On 9 Jun 2009, at 20:53, "Kevin Grittner"
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

> "Kolb, Harald (NSN - DE/Munich)" <harald(dot)kolb(at)nsn(dot)com> wrote:
>>> From: ext Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
>
>>> Mechanism should exist to support useful policy. I don't believe
>>> that the proposed switch has any real-world usefulness.
>
>> There are some good reasons why a switchover could be an appropriate
>> means in case the DB is facing troubles. It may be that the root
>> cause is not the DB itself, but used resources or other things
>> which are going crazy and hit the DB first.
>
> Would an example of this be that one drive in a RAID has gone bad and
> the hot spare rebuild has been triggered, leading to poor performance
> for a while? Is that the sort of issue where you see value?
>
> -Kevin
