Re: [RFC] What should we do for reliable WAL archiving?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] What should we do for reliable WAL archiving?
Date: 2014-03-17 01:20:14
Message-ID: CA+Tgmoa4u1nVpkXj6zpy5OUHhNVabLa5uYZsFA4sD__EzfvV1g@mail.gmail.com
Lists: pgsql-hackers

On Sun, Mar 16, 2014 at 6:23 AM, MauMau <maumau307(at)gmail(dot)com> wrote:
> The PostgreSQL documentation describes cp (on UNIX/Linux) or copy (on
> Windows) as an example for archive_command. However, cp/copy does not sync
> the copied data to disk. As a result, the completed WAL segments would be
> lost in the following sequence:
>
> 1. A WAL segment fills up.
>
> 2. The archiver process archives the just filled WAL segment using
> archive_command. That is, cp/copy reads the WAL segment file from pg_xlog/
> and writes to the archive area. At this point, the WAL file is not
> persisted to the archive area yet, because cp/copy doesn't sync the writes.
>
> 3. The checkpoint processing removes the WAL segment file from pg_xlog/.
>
> 4. The OS crashes. The filled WAL segment doesn't exist anywhere any more.
>
> Considering PostgreSQL's reputation for reliability and its widespread use
> in enterprise systems, I think something should be done. Could you give me
> your opinions on the right direction? Although the doc hedges by saying
> "(This is an example, not a recommendation, and might not work on all
> platforms.)", it seems from the pgsql-xxx mailing lists that many people are
> following this example.
>
> * Improve the example in the documentation.
> But what command can we use to reliably sync just one file?
>
> * Provide some command, say pg_copy, which copies a file synchronously by
> using fsync(), and describe it in the doc with something like "for simple
> use cases, you can use pg_copy as the standard reliable copy command."

+1. This won't obviate the need for tools to manage replication, but
it would make it possible to get the simplest case right without
guessing.
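
For illustration only, here is a minimal sketch of what a synchronous copy
along these lines might do. It is not the proposed pg_copy; the names
sync_copy and main are hypothetical, and it assumes a POSIX system where a
directory can be opened and passed to fsync() and where the archive
filesystem supports hard links. The copy is written to a temporary file in
the destination directory, flushed with fsync(), linked into place, and then
the directory entry is flushed as well.

```python
import os
import sys
import tempfile


def sync_copy(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst, returning only after the data has been fsync'ed.

    Hypothetical helper for illustration. Raises FileExistsError rather
    than overwriting an existing destination file.
    """
    dst_dir = os.path.dirname(os.path.abspath(dst))

    # Write to a temporary file in the same directory so that a crash
    # mid-copy never leaves a partial file under the final name.
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir, prefix=".sync_copy.")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            while True:
                chunk = inp.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
            out.flush()
            os.fsync(out.fileno())  # force the file contents to disk

        # link() fails with FileExistsError if dst already exists, so an
        # already-archived segment is never silently replaced.
        os.link(tmp_path, dst)
    finally:
        os.unlink(tmp_path)

    # Flush the directory so the new name itself survives an OS crash
    # (POSIX assumption; this does not work on Windows).
    dir_fd = os.open(dst_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def main(argv):
    """Return 0 on success and nonzero on failure, as archive_command expects."""
    if len(argv) != 2:
        print("usage: sync_copy SRC DST", file=sys.stderr)
        return 2
    try:
        sync_copy(argv[0], argv[1])
    except OSError as exc:
        print("sync_copy: %s" % exc, file=sys.stderr)
        return 1
    return 0
```

Wrapped in a small script that calls main(sys.argv[1:]) and exits with its
return value, something like this could stand in for cp in archive_command,
with %p as the source and the archive path ending in %f as the destination.
The nonzero exit on failure matters because the archiver only treats a
segment as archived when the command returns zero.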

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
