pg 9.0, streaming replication, fail over and fail back strategies

Lists: pgsql-general
From: "Kyle R(dot) Burton" <kyle(dot)burton(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: pg 9.0, streaming replication, fail over and fail back strategies
Date: 2010-08-09 22:10:47
Message-ID: AANLkTikY_w=mD=s+zE5memdBFWiYJWuMNrG422X-+LC-@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hello,

I'm new to the list and not even sure if this is the right place to be
posting this...

I've worked through the documentation for postgres 9.0 (beta2) and
have successfully set up a master and hot slave configured with
streaming replication (and xlog shipping).  That configuration seems
to be correctly updating the slave and the slave accepts read queries
and shows up to date table data (based on testing by hand with some
DDL and insert queries).

Now that I have that successfully configured, I have manually
performed a fail over by stopping the master, moving a virtual IP
address from the master to the slave, and touched the trigger file on
the slave.  This worked as expected and the former slave promoted
itself to being a full read/write master.

I went through the process of failing back manually by dumping the
database on the slave, restoring it on the master, moving the VIP back
and renaming the recovery.done back to recovery.conf.  This took some
time and required several steps, but was also successful.

After I had moved the VIP from the master to the slave, I had to
restart (not just reload) the postgres daemon to get it to start
listening on the new ip address (it was previously listening to
another IP [10.x.x.y] on the same NIC [eth0]).  I have the
listen_addresses configured to listen on both an internal (10.x.x.y)
address as well as the vip (10.x.x.z), but the interface on the slave
did not have this ip address at the time Postgres was started (so I'm
not all that surprised it didn't bind to that address on becoming the
master).

Is there any way to get PostgreSQL to bind to a new ip address and
interface without actually shutting it down?  If it could, would I
need to break all the current (read only) client connections to get
them to reconnect and have the ability to write?  (am I confused about
this?)

I've set up corosync (part of linux-ha) to manage the VIP, but so far
not to manage postgres itself.  I've set up postgres to be managed
manually (start and stop).

Now that the master+slave configuration is up and running again, I'm
looking for advice on how to monitor for faults: I can fail over
manually, which is fine for now.  What aspects of the postgres system
should be monitored to watch for faults and what are the kinds of
faults that should lead to a fail over?  The machine crashing (OS/HW)
is an obvious one, which will be recognized by corosync and I can
script the initiation of failover (including using ipmi to power down
the master).

Thank you for your time.

Kyle Burton

--
Twitter: @kyleburton
Blog: http://asymmetrical-view.com/
Fun: http://snapclean.me/


From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: "Kyle R(dot) Burton" <kyle(dot)burton(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg 9.0, streaming replication, fail over and fail back strategies
Date: 2010-08-09 22:17:14
Message-ID: AANLkTinHBiSwV2sOPxtANHT7UbogjCpf81tDw5vvb-nz@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Mon, Aug 9, 2010 at 4:10 PM, Kyle R. Burton <kyle(dot)burton(at)gmail(dot)com> wrote:
> Hello,
> After I had moved the VIP from the master to the slave, I had to
> restart (not just reload) the postgres daemon to get it to start

Not surprising as you say.

> Is there any way to get PostgreSQL to bind to a new ip address and
> interface without actually shutting it down?  If it could, would I
> need to break all the current (read only) client connections to get
> them to reconnect and have the ability to write?  (am I confused about
> this?)

I wonder if you could have pg on a steady ip and use iptables to
forward traffic there after a failover...


From: "Kyle R(dot) Burton" <kyle(dot)burton(at)gmail(dot)com>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg 9.0, streaming replication, fail over and fail back strategies
Date: 2010-08-09 23:32:58
Message-ID: AANLkTin7v4V8YvVpX4Lzc0Jwq+wi9tWb8heW68wxMUoi@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

>> Is there any way to get PostgreSQL to bind to a new ip address and
>> interface without actually shutting it down?  If it could, would I
>> need to break all the current (read only) client connections to get
>> them to reconnect and have the ability to write?  (am I confused about
>> this?)
>
> I wonder if you could have pg on a steady ip and use iptables to
> forward traffic there after a failover...

That is an excellent suggestion! It just didn't occur to me. I've
tried googling how to forward a port and am not having much success
(rinetd worked, but I feel like I should be able to get iptables to
work - do you have any pointers I could follow?)

Thanks again, this will most likely sove my vip binding issue.

Kyle

--
Twitter: @kyleburton
Blog: http://asymmetrical-view.com/
Fun: http://snapclean.me/


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: "Kyle R(dot) Burton" <kyle(dot)burton(at)gmail(dot)com>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: pg 9.0, streaming replication, fail over and fail back strategies
Date: 2010-08-10 01:33:30
Message-ID: 1281404010.4820.1.camel@jdavis-ux.asterdata.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Mon, 2010-08-09 at 19:32 -0400, Kyle R. Burton wrote:
> That is an excellent suggestion! It just didn't occur to me. I've
> tried googling how to forward a port and am not having much success
> (rinetd worked, but I feel like I should be able to get iptables to
> work - do you have any pointers I could follow?)

This is the link that I found:

http://tldp.org/HOWTO/IP-Masquerade-HOWTO/forwarders.html

Regards,
Jeff Davis


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: "Kyle R(dot) Burton" <kyle(dot)burton(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg 9.0, streaming replication, fail over and fail back strategies
Date: 2010-08-10 01:41:13
Message-ID: AANLkTimQU2nugb9d7obTF7sZ-04VUwOnSBviHSD3=pGx@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Tue, Aug 10, 2010 at 7:10 AM, Kyle R. Burton <kyle(dot)burton(at)gmail(dot)com> wrote:
> Is there any way to get PostgreSQL to bind to a new ip address and
> interface without actually shutting it down?  If it could, would I
> need to break all the current (read only) client connections to get
> them to reconnect and have the ability to write?  (am I confused about
> this?)

What about setting listen_addresses to '*'? If so, you would be able to
connect to new master as soon as VIP has been moved to it.

> Now that the master+slave configuration is up and running again, I'm
> looking for advice on how to monitor for faults: I can fail over
> manually, which is fine for now.  What aspects of the postgres system
> should be monitored to watch for faults and what are the kinds of
> faults that should lead to a fail over?  The machine crashing (OS/HW)
> is an obvious one, which will be recognized by corosync and I can
> script the initiation of failover (including using ipmi to power down
> the master).

Probably the crash of the postgres and corosync process, the trouble
of VIP, and network outage between the master and the client should be
monitored, I think. Since any of them prevents the master from running
queries from the client, we should cause a failover.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Vick Khera <vivek(at)khera(dot)org>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: pg 9.0, streaming replication, fail over and fail back strategies
Date: 2010-08-18 15:37:49
Message-ID: AANLkTinUvjtDzYXB3NHgEyYaHoOE5mXzgBChWrTS00dB@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Mon, Aug 9, 2010 at 6:10 PM, Kyle R. Burton <kyle(dot)burton(at)gmail(dot)com> wrote:
> Is there any way to get PostgreSQL to bind to a new ip address and
> interface without actually shutting it down?  If it could, would I
> need to break all the current (read only) client connections to get
> them to reconnect and have the ability to write?  (am I confused about
> this?)

What if you make the virtual IP instead be a shared IP using the CARP
protocol, and have only one host be the master at any given time.


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: "Kyle R(dot) Burton" <kyle(dot)burton(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg 9.0, streaming replication, fail over and fail back strategies
Date: 2010-08-18 21:02:30
Message-ID: AANLkTinVD96fCSEvOu-W_gVbNZg6Gt7O_v+HkWt1iy2r@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Mon, Aug 9, 2010 at 6:10 PM, Kyle R. Burton <kyle(dot)burton(at)gmail(dot)com> wrote:
> Hello,
>
> I'm new to the list and not even sure if this is the right place to be
> posting this...
>
> I've worked through the documentation for postgres 9.0 (beta2) and
> have successfully set up a master and hot slave configured with
> streaming replication (and xlog shipping).  That configuration seems
> to be correctly updating the slave and the slave accepts read queries
> and shows up to date table data (based on testing by hand with some
> DDL and insert queries).
>
> Now that I have that successfully configured, I have manually
> performed a fail over by stopping the master, moving a virtual IP
> address from the master to the slave, and touched the trigger file on
> the slave.  This worked as expected and the former slave promoted
> itself to being a full read/write master.
>
> I went through the process of failing back manually by dumping the
> database on the slave, restoring it on the master, moving the VIP back
> and renaming the recovery.done back to recovery.conf.  This took some
> time and required several steps, but was also successful.
>
> After I had moved the VIP from the master to the slave, I had to
> restart (not just reload) the postgres daemon to get it to start
> listening on the new ip address (it was previously listening to
> another IP [10.x.x.y] on the same NIC [eth0]).  I have the
> listen_addresses configured to listen on both an internal (10.x.x.y)
> address as well as the vip (10.x.x.z), but the interface on the slave
> did not have this ip address at the time Postgres was started (so I'm
> not all that surprised it didn't bind to that address on becoming the
> master).
>
> Is there any way to get PostgreSQL to bind to a new ip address and
> interface without actually shutting it down?  If it could, would I
> need to break all the current (read only) client connections to get
> them to reconnect and have the ability to write?  (am I confused about
> this?)

hm. I wonder if you could implement a solution around pgbouncer to do this...

merlin