Re: Enable WAL archiving even in standby

From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enable WAL archiving even in standby
Date: 2014-08-19 14:13:41
Message-ID: 0387E3AC1E6D46299D06D6877F728962@maumau
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

From: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
> I'd propose the attached WIP patch which allows us to enable WAL archiving
> even in standby. The patch adds "always" as the valid value of
> archive_mode.
> If it's set to "always", the archiver is started when the server is in
> standby
> mode and all the WAL files that walreceiver wrote to the disk are archived
> by
> using archive_command. Then, even after the server is promoted to master,
> the archiver keeps archiving WAL files. The patch doesn't change the
> meanings
> of the setting values "on" and "off" of archive_mode.
>
> I think that this feature is useful for the case, e.g., where large
> database
> needs to be replicated between remote servers. Imagine the situation where
> the replicated database gets corrupted completely in the remote standby.
> How should we address this problematic situation and restart the standby?
>
> One approach is to take a fresh backup from the master and restore it onto
> the standby. But since the database is large and there is long distance
> between two servers, this approach might take a surprisingly long time.
>
> Another approach is to restore the backup which was taken from the standby
> before. But most of many WAL files which the backup needs might exist only
> in the master (because WAL archiving cannot be enabled in the standby) and
> they need to be transfered from the master to the standby via
> long-distance
> network. So I think that this approach also would take a fairly long time.
> To shorten that time, you may think that archive_command in the master can
> be set so that it transfers WAL files from the master to the standby's
> archival storage. I agree that this setting can accelerate the database
> restore
> process. But this causes every WAL files to be transfered between remote
> servers twice (one is by streaming replication, another is by
> archive_command),
> and which is a waste of network bandwidth.

Great. This is exactly what I hoped for disaster recovery, although I
haven't looked at the patch yet.

> Back to the patch. If archive_mode is set to "always", archive_command is
> always used to archive WAL files even during recovery. Do we need to
> separate
> the command into two for master and standby, respectively? We can add
> something like standby_archive_command parameter which is used to archive
> only WAL files walreceiver writes. The other WAL files are archived by
> archive_command. I'm not sure if it's really worth separating the command
> that way. Is there any use case?

I don't see any reason to separate parameters. I want the spec simple.

> I've not included the update of document in the patch yet. If we agree to
> support this feature, I will do the remaining work.

Could you consider adding a new section for disaster recovery that describes
concrete parameter settings (e.g. how do we discard old archive WAL files
after taking a base backup from standby, because backup label file is not
created?). Good luck!

Regards
MauMau

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2014-08-19 14:33:29 Re: WAL format and API changes (9.5)
Previous Message Kevin Grittner 2014-08-19 13:57:53 Re: BUG #11208: Refresh Materialized View Concurrently bug using user Postgres