Re: Scheduled maintenance affecting gitmaster

From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Stefan Kaltenbrunner <Stefan(at)kaltenbrunner(dot)cc>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Scheduled maintenance affecting gitmaster
Date: 2011-02-14 11:01:53
Message-ID: AANLkTikofZ75ZqT3CpLt3otz1ozbzs248O59CYCF8HxK@mail.gmail.com
Lists: pgsql-hackers

2011/2/14 Magnus Hagander <magnus(at)hagander(dot)net>:
> On Mon, Feb 14, 2011 at 11:46, Cédric Villemain
> <cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
>> 2011/2/14 Stefan Kaltenbrunner <Stefan(at)kaltenbrunner(dot)cc>:
>>> On 02/14/2011 10:09 AM, Magnus Hagander wrote:
>>>> On Mon, Feb 14, 2011 at 07:13, Stefan Kaltenbrunner
>>>> <stefan(at)kaltenbrunner(dot)cc> wrote:
>>>>> On 02/14/2011 01:27 AM, Tom Lane wrote:
>>>>>>
>>>>>> Magnus Hagander<magnus(at)hagander(dot)net>  writes:
>>>>>>>
>>>>>>> Unfortunately, one of the worst-case scenarios appears to have
>>>>>>> happened - a machine did not come back up after a reboot.
>>>>>>> ...
>>>>>>> We'll get back to you with more information as soon as we have it.
>>>>>>
>>>>>> I didn't see any followup to this?
>>>>>
>>>>> yeah - the hosting company managed to reboot the box for us which brought it
>>>>> back to life in the middle of the night (with both magnus and me asleep).
>>>>
>>>> Indeed. But the good news is that once it came back up, the VM with
>>>> the git server started ok :-)
>>>>
>>>>
>>>>>> gitmaster seems to be responding as of now, is it safe to push?
>>>>>
>>>>> yes it is - however we will need to schedule another maintenance window soon
>>>>> to finish the stuff we actually wanted to do.
>>>>
>>>> So, after some discussion with Stefan, we (well, I guess I) decided we
>>>> should just go ahead and declare the maintenance window not closed
>>>> yet, and finish off the upgrade right now :-) Given that the majority
>>>> of our commits don't happen now, we'll hopefully have it done by the
>>>> time the US folks wake up again.
>>>>
>>>> So, maintenance window again, starting now, and we'll let you know as
>>>> soon as we're done. And we're definitely hoping for the machine to
>>>> come back up properly this time :-)
>>>
>>> and it did not... We are trying to figure out what the actual problem
>>> here really is, because it seems to boot just fine when power-cycled, just
>>> not with a software-initiated reboot.
>>> We will notify once we have more information...
>>>
>>
>> Does it make sense to get some console link or ipmi set up for those
>> crucial parts of the infrastructure ?
>
> These are production servers; of course they are equipped with remote consoles.
>
> However, these consoles are only accessible from the hosting company's
> internal company network or VPN, so we cannot access them directly.
>
> It is something we are discussing with them...

ok. Not the top priority here, I believe, but this kind of crisis
period usually helps (to get it set up quickly, while the topic is hot).

Thank you for your time and effort spent,
--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
