From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Florian Pflug <fgp(dot)phlo(dot)org(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Backup history file should be replicated in Streaming Replication?
Date: 2009-12-18 20:15:38
Message-ID: A5420689-4CCF-475C-83B4-671416E95F0B@hi-media.com
Lists: pgsql-hackers

Hi,

On 18 Dec 2009, at 19:21, Heikki Linnakangas wrote:
> On Fri, Dec 18, 2009 at 12:22 PM, Florian Pflug <fgp(dot)phlo(dot)org(at)gmail(dot)com> wrote:
>>> I'd prefer it if the slave could automatically fetch a new base backup if it
>>> falls behind too far to catch up with the available logs. That way, old logs
>>> don't start piling up on the server if a slave goes offline for a long time.

Well, a while ago I did propose considering a state machine with clear transitions for exactly this kind of problem, and I think my remarks still apply:
http://www.mail-archive.com/pgsql-hackers(at)postgresql(dot)org/msg131511.html

Sorry for the non-archives.postgresql.org link; I couldn't find the mail there.

> Yeah, for small databases, it's probably a better tradeoff. The problem
> with keeping WAL around in the master indefinitely is that you will
> eventually run out of disk space if the standby disappears for too long.

I'd vote for having a setting on the master for how long to keep WALs around. If the slave loses sync and then comes back, either you still have the required WALs and you're back to catchup, or you don't and you're back to the base/init dance.
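
Something along these lines in postgresql.conf, say (the parameter names here are made up for the sake of argument):

    # made-up GUCs: how much WAL to keep around for disconnected slaves
    wal_keep_time = '2d'          # recycle WAL segments older than this
    wal_keep_size = '10GB'        # ... or once they use this much disk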

You might want a control on the slave, though, requiring explicit DBA action before it goes back to taking a base backup from the master, since that backup could just as well come from a nightly PITR backup rather than from the live server.
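
A made-up knob on the slave side could then look like this, in recovery.conf:

    # made-up: if the master no longer has the WAL we need, stop and
    # wait for the DBA rather than silently re-fetching a base backup,
    # which might better come from last night's PITR backup
    base_backup_refetch = manual      # 'auto' to allow it unattended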

>> but it's almost certainly much harder
>> to implement. In particular, there's no hard and fast rule for
>> figuring out when you've dropped so far behind that resnapping the
>> whole thing is faster than replaying the WAL bit by bit.
>
> I'd imagine that you take a new base backup only if you have to, ie. the
> old WAL files the slave needs have already been deleted from the master.

Well, consider that a slave can be in one of these states: base, init, setup, catchup, sync. What you just said then reduces to stating which transitions you can make without resorting to a base backup, and I don't see many of those once the last sync point is no longer available on the master.
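
To make that concrete, here's a minimal sketch in C of the transitions I have in mind; the enum, the function and the rule are all invented for the example, none of this exists in the tree:

    #include <stdbool.h>

    /* Hypothetical slave states, as listed above. */
    typedef enum SlaveState
    {
        SLAVE_BASE,                 /* taking a base backup from the master */
        SLAVE_INIT,                 /* restoring that base backup */
        SLAVE_SETUP,                /* connecting, agreeing on a start point */
        SLAVE_CATCHUP,              /* replaying WAL, still behind */
        SLAVE_SYNC                  /* applying WAL as it is produced */
    } SlaveState;

    /*
     * Pick the next state.  The one interesting rule: as soon as the
     * WAL we need is gone from the master, every state falls back to
     * BASE, because there is no shortcut around a new base backup.
     */
    static SlaveState
    next_state(SlaveState cur, bool wal_available_on_master)
    {
        if (!wal_available_on_master)
            return SLAVE_BASE;

        switch (cur)
        {
            case SLAVE_BASE:    return SLAVE_INIT;
            case SLAVE_INIT:    return SLAVE_SETUP;
            case SLAVE_SETUP:   return SLAVE_CATCHUP;
            case SLAVE_CATCHUP: return SLAVE_SYNC;
            case SLAVE_SYNC:    return SLAVE_SYNC;
        }
        return cur;                 /* unreachable */
    }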

>> I think (as I did/do with Hot Standby) that the most important thing
>> here is to get to a point where we have a reasonably good feature that
>> is of some use, and commit it. It will probably have some annoying
>> limitations; we can remove those later. I have a feel that what we
>> have right now is going to be non-robust in the face of network
>> breaks, but that is a problem that can be fixed by a future patch.
>
> Agreed. About a year ago, I was vocal about not relying on the file
> based shipping, but I don't have a problem with relying on it as an
> intermediate step, until we add the other options. It's robust as it is,
> if you set up WAL archiving.

What I'd like is for a slave never to pretend it's in sync, or soon to be, when it clearly isn't. In the asynchronous case we can live with that. As soon as we're talking synchronous, though, you really want the master to skip any not-in-sync slave at COMMIT. To be even clearer: a slave that is not in sync is NOT a slave as far as synchronous replication is concerned.
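
Hand-waving in C again, reusing the SlaveState enum from the sketch above; SlaveInfo and wait_for_ack() are equally made up:

    /* One entry per connected slave (invented for illustration). */
    typedef struct SlaveInfo
    {
        SlaveState  state;          /* see the enum above */
        /* connection details elided */
    } SlaveInfo;

    extern void wait_for_ack(SlaveInfo *slave);     /* made up */

    /* At COMMIT, wait only for the slaves that are actually in sync. */
    static void
    wait_for_sync_slaves(SlaveInfo *slaves, int nslaves)
    {
        int         i;

        for (i = 0; i < nslaves; i++)
        {
            /* Not in sync?  Then it's not a slave, as far as we care. */
            if (slaves[i].state != SLAVE_SYNC)
                continue;
            wait_for_ack(&slaves[i]);
        }
    }

That way a slave stuck in catchup never holds a COMMIT hostage, and never gives the illusion of durability it can't provide.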

Regards,
--
dim
