Re: pg_start_backup without checkpoint patch (a part of Synch Rep)

Lists: pgsql-hackers
From: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: pg_start_backup without checkpoint patch (a part of Synch Rep)
Date: 2008-12-27 12:28:52
Message-ID: 3f0b79eb0812270428y1f6f2422ve940caab53d63142@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Attached is the self-contained patch to skip checkpoint at pg_start_backup.
This is a part of Synch Rep patches, and was discussed in the following
thread.
http://archives.postgresql.org/message-id/3f0b79eb0812240710j7e613f3atfd6b6fc27403546e@mail.gmail.com

In Synch Rep, we basically have to get a fresh backup, to make the failed
server catch up with the primary after failover. But getting an online-backup
is expensive operation because of pg_start_backup's checkpoint and data
copying. Especially since pg_start_backup is performed as one transaction,
its checkpoint might not only increase I/O traffic but also cause
long-transaction which prevents VACUUM and HOT. This patch makes
pg_start_backup skip a checkpoint if possible, and works out the above
problem.

Specifically, pg_start_backup uses the last checkpoint instead of doing a
new checkpoint if full_page_writes = on since the last checkpoint, which
guarantees that all the full-pages required for PITR are written.

There is one constraint: XLR_BKP_REMOVABLE flag cannot be marked
correctly, because we cannot judge whether the full-pages (generated
before pg_start_backup) are removal or not. AFAIK, this flag is only used
by pglesslog. Should we add a new option to specify whether checkpoint
is skippable, for the backward compatibility?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment Content-Type Size
bkp_without_ckpt_v3.patch text/x-patch 25.4 KB

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_start_backup without checkpoint patch (a part of Synch Rep)
Date: 2008-12-29 09:08:23
Message-ID: 49589387.6040605@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Fujii Masao wrote:
> Attached is the self-contained patch to skip checkpoint at pg_start_backup.
> This is a part of Synch Rep patches, and was discussed in the following
> thread.
> http://archives.postgresql.org/message-id/3f0b79eb0812240710j7e613f3atfd6b6fc27403546e@mail.gmail.com

I'm not convinced that this is necessary for the replication patch. It
is an orthogonal, new feature, and should be considered for 8.5, IMHO.

> Specifically, pg_start_backup uses the last checkpoint instead of doing a
> new checkpoint if full_page_writes = on since the last checkpoint, which
> guarantees that all the full-pages required for PITR are written.

That assumes that the DBA has kept all the WAL segments that have been
archived since last checkpoint. So this would no longer be safe:

1. rm <archivedir>/*
2. SELECT pg_start_backup();
3. tar cvzf backup.tar.gz <datadir>
4. SELECT pg_stop_backup();

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_start_backup without checkpoint patch (a part of Synch Rep)
Date: 2008-12-29 17:40:35
Message-ID: 3f0b79eb0812290940i4712f92cv63ce89fdb533b671@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Mon, Dec 29, 2008 at 6:08 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Fujii Masao wrote:
>>
>> Attached is the self-contained patch to skip checkpoint at
>> pg_start_backup.
>> This is a part of Synch Rep patches, and was discussed in the following
>> thread.
>>
>> http://archives.postgresql.org/message-id/3f0b79eb0812240710j7e613f3atfd6b6fc27403546e@mail.gmail.com
>
> I'm not convinced that this is necessary for the replication patch. It is an
> orthogonal, new feature, and should be considered for 8.5, IMHO.

Synch Rep forces online-backup in some scenes (e.g. catchup after failover),
so I've argued from the beginning that the cost of it should be reduced.
I think this patch is one of the solutions to that problem.

>
>> Specifically, pg_start_backup uses the last checkpoint instead of doing a
>> new checkpoint if full_page_writes = on since the last checkpoint, which
>> guarantees that all the full-pages required for PITR are written.
>
> That assumes that the DBA has kept all the WAL segments that have been
> archived since last checkpoint. So this would no longer be safe:
>
> 1. rm <archivedir>/*
> 2. SELECT pg_start_backup();
> 3. tar cvzf backup.tar.gz <datadir>
> 4. SELECT pg_stop_backup();

Umm... I can't believe that there is the DBA to carry out such unsafe
operation. If disk crash occurs between 1 and 2, a database might
not recover *regardless of this patch* because of some missing xlogs.
I don't think that the patch itself is unsafe. In this patch, the DBA
can still judge safely which file can be removed, from a backup history
file or the return value of pg_start_backup.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_start_backup without checkpoint patch (a part of Synch Rep)
Date: 2008-12-29 18:30:46
Message-ID: 49591756.9070603@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Fujii Masao wrote:
> On Mon, Dec 29, 2008 at 6:08 PM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> Fujii Masao wrote:
>>> Attached is the self-contained patch to skip checkpoint at
>>> pg_start_backup.
>>> This is a part of Synch Rep patches, and was discussed in the following
>>> thread.
>>>
>>> http://archives.postgresql.org/message-id/3f0b79eb0812240710j7e613f3atfd6b6fc27403546e@mail.gmail.com
>> I'm not convinced that this is necessary for the replication patch. It is an
>> orthogonal, new feature, and should be considered for 8.5, IMHO.
>
> Synch Rep forces online-backup in some scenes (e.g. catchup after failover),
> so I've argued from the beginning that the cost of it should be reduced.

Sure, but it's just an optimization. Synch rep still works fine without it.

>>> Specifically, pg_start_backup uses the last checkpoint instead of doing a
>>> new checkpoint if full_page_writes = on since the last checkpoint, which
>>> guarantees that all the full-pages required for PITR are written.
>> That assumes that the DBA has kept all the WAL segments that have been
>> archived since last checkpoint. So this would no longer be safe:
>>
>> 1. rm <archivedir>/*
>> 2. SELECT pg_start_backup();
>> 3. tar cvzf backup.tar.gz <datadir>
>> 4. SELECT pg_stop_backup();
>
> Umm... I can't believe that there is the DBA to carry out such unsafe
> operation. If disk crash occurs between 1 and 2, a database might
> not recover *regardless of this patch* because of some missing xlogs.

I use the above rm+start+tar+stop procedure all the time, when testing PITR.

You might be taking a one-off online backup, in which case you don't
care about the history. See section 24.3.5.1 on "Standalone hot
backups":
http://www.postgresql.org/docs/8.3/static/continuous-archiving.html#BACKUP-BASE-BACKUP

> I don't think that the patch itself is unsafe. In this patch, the DBA
> can still judge safely which file can be removed, from a backup history
> file or the return value of pg_start_backup.

Sure, it could be addressed with appropriate documentation. You'd
effectively get the current behavior by doing a manual CHECKPOINT just
before pg_start_backup(), for example. It does still make it harder to
use the feature correctly, though.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com