Allowing multiple concurrent base backups

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Allowing multiple concurrent base backups
Date: 2011-01-11 18:17:20
Message-ID: 4D2C9EB0.1040106@enterprisedb.com
Lists: pgsql-hackers

Now that we have a basic over-the-wire base backup capability in
walsender, it would be nice to allow taking multiple base backups at the
same time. It might not seem very useful at first, but it makes it
easier to set up standbys for small databases. At the moment, if you
want to set up two standbys, you have to either take a single base
backup and distribute it to both standbys, or somehow coordinate them so
that they don't try to take the base backup at the same time. Also, you don't
want initializing a standby to conflict with a nightly backup cron script.

So, this patch modifies the internal do_pg_start/stop_backup functions
to add a second mode of operation. In the traditional mode, a
backup_label file is created in the data directory, where it's backed up
along with all other files. In the new mode, the backup label file is
returned to the caller, and the caller is responsible for including it in
the backup. The code in replication/basebackup.c includes it in the tar
file that's streamed to the client, as "backup_label".

To make that safe, I've changed forcePageWrites into an integer.
Whenever a backup is started, it's incremented, and when one ends, it's
decremented. When forcePageWrites == 0, there's no backup in progress.
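
As a rough illustration of that counting scheme (not the code from the
patch), the sketch below keeps a counter in a hypothetical BackupState
struct. The struct and the helper names backup_started, backup_ended and
backup_in_progress are made up for this example, and the locking that
shared state would need in the server is left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the shared state. In the server this would
 * live in shared memory and be protected by a lock; both are omitted here.
 */
typedef struct BackupState
{
    int force_page_writes;  /* number of backups currently in progress */
} BackupState;

/* Called when any backup starts: one more backup needs full-page writes. */
static void
backup_started(BackupState *state)
{
    state->force_page_writes++;
}

/* Called when a backup ends: must be paired with an earlier start. */
static void
backup_ended(BackupState *state)
{
    assert(state->force_page_writes > 0);
    state->force_page_writes--;
}

/* Full-page writes stay forced as long as at least one backup is running. */
static bool
backup_in_progress(const BackupState *state)
{
    return state->force_page_writes > 0;
}
```

With a plain boolean, the first backup to finish would switch forced
page writes off while another backup was still running; a counter avoids
that.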

The user-visible pg_start_backup() function is not changed. You can only
have one backup started that way in progress at a time. But streaming base
backups can run at the same time as a backup started with the traditional
pg_start_backup().

I implemented this in two ways, and can't decide which I like better:

1. The contents of the backup label file are returned to the caller of
do_pg_start_backup() as a palloc'd string.

2. do_pg_start_backup() creates a temporary file that the backup label
is written to (instead of "backup_label").

Implementation 1 changes more code, as pg_start/stop_backup() need to be
changed to write/read from memory instead of file, but the result isn't
any more complicated. Nevertheless, I somehow feel more comfortable with 2.
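
To make the difference between the two interfaces concrete, here is a
minimal standalone sketch, not taken from either patch. The function names
label_as_string and label_as_tmpfile are hypothetical, malloc stands in for
palloc, and the label contents are reduced to a single placeholder line.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Approach 1 (sketch): build the label text in memory and hand it to the
 * caller, who owns the buffer and must free it. malloc stands in for palloc.
 */
static char *
label_as_string(const char *label_name)
{
    const char *fmt = "LABEL: %s\n";    /* simplified placeholder contents */
    int         len;
    char       *buf;

    assert(label_name != NULL);
    len = snprintf(NULL, 0, fmt, label_name);
    if (len < 0)
        return NULL;
    buf = malloc((size_t) len + 1);
    if (buf == NULL)
        return NULL;
    snprintf(buf, (size_t) len + 1, fmt, label_name);
    return buf;
}

/*
 * Approach 2 (sketch): write the same text to a temporary file and hand
 * back the open stream, rewound so the caller can read it from the start.
 */
static FILE *
label_as_tmpfile(const char *label_name)
{
    FILE *fp;

    assert(label_name != NULL);
    fp = tmpfile();
    if (fp == NULL)
        return NULL;
    fprintf(fp, "LABEL: %s\n", label_name);
    rewind(fp);
    return fp;
}
```

In the first variant the caller frees the string when done; in the second
the caller reads the file and closes it. Either way, the fixed
"backup_label" path in the data directory is no longer shared between
concurrent backups.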

Patches for both approaches attached. They're also available in my
github repository at git(at)github(dot)com:hlinnaka/postgres.git.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
multiple_inprogress_backups1.patch text/x-diff 18.6 KB
multiple_inprogress_backups2.patch text/x-diff 14.8 KB
