Re: Incrementally Updated Backups and restartpoints

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Incrementally Updated Backups and restartpoints
Date: 2010-01-13 11:36:02
Message-ID: 4B4DB022.4070209@enterprisedb.com
Lists: pgsql-docs pgsql-hackers

Our documentation suggests that you can take a base backup of a warm
standby server while it's running:

> If we take a backup of the standby server's data directory while it is processing logs shipped from the primary, we will be able to reload that data and restart the standby's recovery process from the last restart point. We no longer need to keep WAL files from before the restart point. If we need to recover, it will be faster to recover from the incrementally updated backup than from the original base backup.

That doesn't seem safe. If the server makes a new restartpoint while the
backup is running, and pg_control is backed up after the new
restartpoint is made, recovery will restart from the new restartpoint.
That is wrong; recovery needs to restart at the restartpoint that was
most recent when the backup started. This is basically the same issue we
have solved in master with the backup_label file.

I wonder if it would be enough to document that pg_control must be
backed up first?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Incrementally Updated Backups and restartpoints
Date: 2010-01-13 11:57:37
Message-ID: 3f0b79eb1001130357n6abc8466q6556daf2ab3c3fe2@mail.gmail.com
Lists: pgsql-docs pgsql-hackers

On Wed, Jan 13, 2010 at 8:36 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Our documentation suggests that you can take a base backup of a warm
> standby server while it's running:
>
>> If we take a backup of the standby server's data directory while it is processing logs shipped from the primary, we will be able to reload that data and restart the standby's recovery process from the last restart point. We no longer need to keep WAL files from before the restart point. If we need to recover, it will be faster to recover from the incrementally updated backup than from the original base backup.
>
> That doesn't seem safe. If the server makes a new restartpoint while the
> backup is running, and pg_control is backed up after the new
> restartpoint is made, recovery will restart from the new restartpoint.
> That is wrong; recovery needs to restart at the restartpoint that was
> most recent when the backup started. This is basically the same issue we
> have solved in master with the backup_label file.

Right.

> I wonder if it would be enough to document that pg_control must be
> backed up first?

Probably not. Archive recovery from such a base backup would always
fail at the end of recovery because there is no backup-end record,
i.e., pg_stop_backup() is not executed in that case.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Incrementally Updated Backups and restartpoints
Date: 2010-01-13 12:34:05
Message-ID: 4B4DBDBD.10304@enterprisedb.com
Lists: pgsql-docs pgsql-hackers

Fujii Masao wrote:
> On Wed, Jan 13, 2010 at 8:36 PM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> I wonder if it would be enough to document that pg_control must be
>> backed up first?
>
> Probably not. Archive recovery from such a base backup would always
> fail at the end of recovery because there is no backup-end record,
> i.e., pg_stop_backup() is not executed in that case.

No, that's not an issue. We only wait for the backup-end record if we
haven't seen it yet since we started recovery from the base backup.
Assuming the standby had already reached that point before the new
backup from the standby started, backupStartLoc is zero in the control file.
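
For illustration, the rule described above can be written as a toy model. This is only a sketch of the behaviour stated in this message, not the server's code; the function name, the argument names and the tuple representation of a WAL location are all hypothetical.

```python
# Toy model only: names are hypothetical and do not mirror the server source.
ZERO_LOC = (0, 0)  # stands in for an unset backup start location


def waits_for_backup_end(backup_start_loc, seen_backup_end):
    """Return True if recovery must still wait for a backup-end record.

    backup_start_loc: (hi, lo) tuple as read from the control file.
    seen_backup_end:  whether a backup-end record was already replayed.
    """
    # A zero start location means no wait is pending at all.
    return backup_start_loc != ZERO_LOC and not seen_backup_end
```

With a zero start location the function returns False, which is the case a backup taken from a standby that has already passed the backup-end record would fall into.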

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Incrementally Updated Backups and restartpoints
Date: 2010-01-13 22:13:05
Message-ID: 3f0b79eb1001131413v4ccc8ac1jf25a1f75cf71d5ce@mail.gmail.com
Lists: pgsql-docs pgsql-hackers

On Wed, Jan 13, 2010 at 9:34 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> No, that's not an issue. We only wait for the backup-end record if we
> haven't seen it yet since we started recovery from the base backup.
> Assuming the standby had reached that point already before the new
> backup from the standby started, backupStartLoc is zero in the control file.

OK. Should that assumption be documented?

Also, when we start an archive recovery from a backup taken from the standby,
we seem to reach a safe starting point before the database has actually become
consistent, because backupStartLoc is zero. Isn't this an issue?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Incrementally Updated Backups and restartpoints
Date: 2010-03-04 12:00:33
Message-ID: 3f0b79eb1003040400w41ced642ua37a0031850c3a5@mail.gmail.com
Lists: pgsql-docs pgsql-hackers

Hi,

I thought of this issue again because a related question arrived:
http://archives.postgresql.org/pgsql-admin/2010-03/msg00036.php

On Thu, Jan 14, 2010 at 7:13 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Wed, Jan 13, 2010 at 9:34 PM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> No, that's not an issue. We only wait for the backup-end record if we
>> haven't seen it yet since we started recovery from the base backup.
>> Assuming the standby had reached that point already before the new
>> backup from the standby started, backupStartLoc is zero in the control file.
>
> OK. Should that assumption be documented?

That comment of mine was meaningless. Sorry for the noise.

> Also, when we start an archive recovery from a backup taken from the standby,
> we seem to reach a safe starting point before the database has actually become
> consistent, because backupStartLoc is zero. Isn't this an issue?

This issue still seems to happen. Should it be fixed for 9.0, or is
a note in the documentation enough for 9.0? I'm leaning
towards the latter.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-docs(at)postgresql(dot)org
Subject: Re: [HACKERS] Incrementally Updated Backups and restartpoints
Date: 2010-03-26 13:21:42
Message-ID: 3f0b79eb1003260621y7ebcf2c5s49bd16caf6b12b73@mail.gmail.com
Lists: pgsql-docs pgsql-hackers

On Thu, Mar 4, 2010 at 9:00 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> Also, when we start an archive recovery from a backup taken from the standby,
>> we seem to reach a safe starting point before the database has actually become
>> consistent, because backupStartLoc is zero. Isn't this an issue?
>
> This issue still seems to happen. Should it be fixed for 9.0, or is
> a note in the documentation enough for 9.0? I'm leaning
> towards the latter.

I'm thinking of adding something like the following to the section
"25.6. Incrementally Updated Backups". Thoughts?

The pg_control file must be backed up first.
This avoids the problem that we might fail to restore a consistent
database state because recovery starts from a restart point that is
later than the start of the backup.
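
As a rough sketch of the ordering requirement above, a backup script could sort its file list so that the control file is copied before anything else. The helper below is hypothetical and not part of any PostgreSQL tool; it assumes paths are given relative to the data directory, where the control file lives at global/pg_control.

```python
# Hypothetical helper, not a PostgreSQL tool: it only reorders a file list.
PG_CONTROL = "global/pg_control"  # control file path relative to the data directory


def backup_order(relative_paths):
    """Return the paths with pg_control first and the rest in their original order."""
    control = [p for p in relative_paths if p == PG_CONTROL]
    rest = [p for p in relative_paths if p != PG_CONTROL]
    return control + rest
```

The copy step itself is left out on purpose; the point is only that pg_control is read before any restartpoint made during the backup can update it.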

When recovering from the incrementally updated backup, the server
can begin accepting connections and complete the recovery successfully
before the database has become consistent. To avoid these problems,
you must check whether the database has become consistent by comparing
the progress of the recovery with the backup ending WAL location,
both before your users try to connect to the server and when archive
recovery ends. To make that possible, the backup ending WAL location
must be recorded in advance by calling the pg_last_xlog_replay_location
function at the end of the backup. The progress of the recovery is also
obtained from the pg_last_xlog_replay_location function.
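
The comparison described above could be sketched as follows. This is a minimal example that assumes both values are the text returned by pg_last_xlog_replay_location, two hexadecimal numbers separated by a slash such as 0/3000020. The function names are hypothetical, and fetching the values from the server is left out.

```python
def parse_wal_location(text):
    """Parse a WAL location such as '0/3000020' into an (int, int) tuple."""
    hi, lo = text.strip().split("/")
    return (int(hi, 16), int(lo, 16))


def is_consistent(replay_location, backup_end_location):
    """Return True once replay has reached the recorded backup ending location.

    Tuples compare element by element, so the high part is compared first
    and the low part only breaks ties.
    """
    return parse_wal_location(replay_location) >= parse_wal_location(backup_end_location)
```

For instance, is_consistent("0/2FFFFFF", "0/3000020") is False, so users should not yet be allowed to connect, while is_consistent("0/4000000", "0/3000020") is True.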

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center