Re: Online base backup from the hot-standby

Lists: pgsql-hackers
From: Jun Ishiduka <ishizuka(dot)jun(at)po(dot)ntts(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Online base backup from the hot-standby
Date: 2011-05-27 06:09:30
Message-ID: 201105270609.p4R694Lo010621@ccmds32.silk.ntts.co.jp

Hi

I would like to develop a feature for 'Online base backup from the
hot standby' for PostgreSQL 9.2.

Todo : Allow hot file system backups on standby servers
(http://wiki.postgresql.org/wiki/Todo)

[GOAL]
* Allow pg_basebackup to be run against a hot-standby server to take
an online base backup.
- In PostgreSQL 9.1, pg_basebackup can be run only against the
primary server.
- But the physical copy (etc.) performed by pg_basebackup raises
the load on the primary server.
- Therefore, this feature is necessary.

[Problem]
(The following problems arise when a hot standby takes an online
base backup in the same way as pg_basebackup does against the
primary server.)
* pg_start_backup() and pg_stop_backup() can't be executed on the
hot-standby server.
- The hot standby can't insert a backup-end record into the WAL
files and can't perform a CHECKPOINT,
- because a hot standby can't write anything to the WAL files.
* The hot standby can't send WAL files to the archive server.
- When pg_stop_backup() is executed on the primary server,
it waits until the WAL has been sent to the archive server,
but the hot standby can't do that.

[Policy]
(I will build this according to the following policy.)
* This feature doesn't affect the primary server.
- I won't adopt the approach in which "the hot standby requests the
primary to execute pg_basebackup", because I am considering the
case where many standbys are connected to one primary.

[Approach]
* When pg_basebackup is executed against the hot-standby server, the
standby performs a RESTARTPOINT instead of a CHECKPOINT.
backup_label is built from the result of the RESTARTPOINT and is sent
to the designated backup server over the pg_basebackup connection.
* Instead of inserting a backup-end record, the hot standby writes the
backup-end position into the backup history file and sends that file
to the designated backup server over the pg_basebackup connection.
- In 9.1, the startup process learns the backup-end position only
from the backup-end record. In addition to that logic, the startup
process will be able to learn the backup-end position from the
backup history file.
As a result, the startup process can recover reliably even
without a backup-end record.
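
The recovery-side logic described in [Approach] could be sketched roughly
as follows. This is only an illustration in Python, not PostgreSQL code:
the function names (parse_lsn, backup_end_position, reached_consistency)
and the idea of passing both sources as optional arguments are assumptions
made for the example. WAL locations are assumed to be in the usual
"xlogid/xrecoff" hexadecimal text form.

```python
from typing import Optional, Tuple

Lsn = Tuple[int, int]  # (xlogid, xrecoff), compared lexicographically


def parse_lsn(text: str) -> Lsn:
    """Parse a WAL location such as '0/20000A0' into a comparable tuple."""
    hi, lo = text.strip().split("/")
    return int(hi, 16), int(lo, 16)


def backup_end_position(end_record: Optional[str],
                        history_file: Optional[str]) -> Optional[Lsn]:
    """Return the backup-end position from whichever source is available.

    end_record   -- location taken from a backup-end WAL record (9.1 logic)
    history_file -- location taken from the backup history file (proposed)
    """
    for source in (end_record, history_file):
        if source is not None:
            return parse_lsn(source)
    return None


def reached_consistency(replayed: str, end: Optional[Lsn]) -> bool:
    """Recovery is consistent once replay has reached the backup end."""
    return end is not None and parse_lsn(replayed) >= end
```

In this sketch the backup-end record is preferred when both sources exist,
so the existing 9.1 behaviour is unchanged and the history file only acts
as a fallback.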

[Precondition]
(As a result of the Policy and Approach above, the following
restrictions apply.)
* The WAL immediately after the start of the backup must contain
full-page writes. But the Approach above can't always satisfy
this restriction, because full_page_writes on the primary
might be 'off'.
The same applies when the standby replays WAL from which full-page
writes have been removed by pg_lesslog.
* Because recovery starts from the last CHECKPOINT, it takes longer.
* I have not yet thought of a new mechanism to take the place of
waiting for the WAL to be sent to the archive server.

[Working Step]
STEP1: Make the startup process acquire the backup-end position not
only from the backup-end record but also from the backup history file.
* Allow the startup process to acquire the backup-end position
from the backup history file.
* When pg_basebackup is executed, the backup history file is
sent to the designated backup server.

STEP2: Allow pg_start_backup() and pg_stop_backup() to be executed
on the hot-standby server.

[Action until the first CommitFest (on June 15)]
I will create a patch for STEP1.
(The patch should be able to resolve a problem with Omnipitr-backup-slave.)
(The problem with Omnipitr-backup-slave:
http://archives.postgresql.org/pgsql-hackers/2011-03/msg01490.php)
* STEP2 is scheduled for the next CommitFest (on September 15).

--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka(dot)jun(at)po(dot)ntts(dot)co(dot)jp
--------------------------------------------


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Jun Ishiduka <ishizuka(dot)jun(at)po(dot)ntts(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Online base backup from the hot-standby
Date: 2011-05-27 10:37:54
Message-ID: 4DDF7F02.2050004@enterprisedb.com

On 27.05.2011 09:09, Jun Ishiduka wrote:
> STEP1: Make startup process to acquire backup-end-position from
> not only backup-end record but also backup-history-file .
> * startup process allows to acquire backup-end-position
> from backup-history-file .
> * When pg_basebackup is executed , backup-history-file is
> sent to the designated backup server .

I don't much like that approach. The standby would need to be able to
write the backup history file to the archive at the end of backup, and
we'd have to reintroduce the code to fetch it from archive and, when
streaming, from the master. At the moment, the archiver doesn't even run
in the standby.

I think we'll need to write the end-of-backup location somewhere in the
base backup instead. pg_stop_backup() already returns it, the client
just needs to store it somewhere with the base backup. So I'm thinking
that the procedure for taking a base backup from slave would look
something like this:

1. psql postgres -c "SELECT pg_start_backup('label')"
2. tar cvzf basebackup.tar.gz $PGDATA
3. psql postgres -c "SELECT pg_stop_backup()" > backup_end_location
4. (keep backup_end_location alongside basebackup.tar.gz)

Or, we can just document that the control file must be backed up *last*,
so that the minimum recovery point in the control file serves the same
purposes as the end-of-backup location.
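
To illustrate the "control file last" alternative, here is a minimal
sketch of how a backup script might order the files it copies. It is only
an assumption about how such a tool could be written, not an existing
PostgreSQL utility; the function name backup_order is hypothetical, and
the only PostgreSQL-specific detail relied on is that the control file
lives at global/pg_control inside the data directory.

```python
from typing import Iterable, List

CONTROL_FILE = "global/pg_control"  # path relative to the data directory


def backup_order(paths: Iterable[str]) -> List[str]:
    """Return the paths in their original order, control file moved last."""
    paths = list(paths)
    rest = [p for p in paths if p != CONTROL_FILE]
    last = [p for p in paths if p == CONTROL_FILE]
    return rest + last
```

With this ordering, the minimum recovery point stored in the copied
control file is at least as new as the point at which every other file
was copied.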

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Jun Ishiduka <ishizuka(dot)jun(at)po(dot)ntts(dot)co(dot)jp>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Online base backup from the hot-standby
Date: 2011-05-31 04:46:36
Message-ID: 201105310446.p4V4kDUt014440@ccmds32.silk.ntts.co.jp

> I don't much like that approach. The standby would need to be able to
> write the backup history file to the archive at the end of backup, and
> we'd have to reintroduce the code to fetch it from archive and, when
> streaming, from the master. At the moment, the archiver doesn't even run
> in the standby.

Could you explain the reason for "The standby would need to be able to
write the backup history file to the archive at the end of backup"?
(I'd like to know why writing it "only to pg_xlog" is wrong.)

Because "Cascade replication" has been proposed, I don't want to
implement this feature in a way where the standby asks the primary
server to execute the backup.

(The "Cascade replication" proposal:
http://archives.postgresql.org/pgsql-hackers/2011-05/msg01150.php)

--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka(dot)jun(at)po(dot)ntts(dot)co(dot)jp
--------------------------------------------


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Jun Ishiduka <ishizuka(dot)jun(at)po(dot)ntts(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Online base backup from the hot-standby
Date: 2011-05-31 06:52:49
Message-ID: 4DE49041.6040007@enterprisedb.com

On 31.05.2011 07:46, Jun Ishiduka wrote:
>> I don't much like that approach. The standby would need to be able to
>> write the backup history file to the archive at the end of backup, and
>> we'd have to reintroduce the code to fetch it from archive and, when
>> streaming, from the master. At the moment, the archiver doesn't even run
>> in the standby.
>
> Please teach the reason for "The standby would need to be able to write
> the backup history file to the archive at the end of backup" .
> (I'd like to know why "to only pg_xlog" is wrong .)

If the backup history file is not archived, the postgres process won't
find it when you try to restore from the base backup. The new server has
no access to the standby's pg_xlog directory.

> Because there is the opinion of "Cascade replication" , I don't want to
> realize the function with the method which the standby requests to execute
> it on the primary server .
>
> (The opinion of "Cascade replication":
> http://archives.postgresql.org/pgsql-hackers/2011-05/msg01150.php)

I don't see how this helps.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Jun Ishiduka <ishizuka(dot)jun(at)po(dot)ntts(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Online base backup from the hot-standby
Date: 2011-05-31 08:48:13
Message-ID: BANLkTim_cnbWMBOt-UO0z+q4352SXJWd_A@mail.gmail.com

2011/5/31 Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>:
> On 31.05.2011 07:46, Jun Ishiduka wrote:
>>> I don't much like that approach. The standby would need to be able to
>>> write the backup history file to the archive at the end of backup, and
>>> we'd have to reintroduce the code to fetch it from archive and, when
>>> streaming, from the master. At the moment, the archiver doesn't even run
>>> in the standby.
>>
>> Please teach the reason for "The standby would need to be able to write
>> the backup history file to the archive at the end of backup" .
>> (I'd like to know why "to only pg_xlog" is wrong .)
>
> If the backup history file is not archived, the postgres process won't
> find it when you try to restore from the base backup. The new server has
> no access to the standby's pg_xlog directory.

Right. If we take a base backup from the standby without using
pg_basebackup, there is no way to share the backup history file between
the standby and the new server, so an idea like the one you suggested
would be required.

OTOH, if we use pg_basebackup, I think it makes sense for pg_basebackup
to transfer the backup history file from the standby to the new server
and put it in the base backup (pg_xlog?), so that the new server reads
it from the base backup rather than from the archive.
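
As a rough illustration of the new server reading the end location from a
backup history file shipped inside the base backup, a sketch in Python
might look like this. It assumes the file keeps a "STOP WAL LOCATION:"
line like the one pg_stop_backup() writes on a primary; the function name
and the sample contents (locations, file names, label) are made up for
the example.

```python
import re
from typing import Optional

# Matches e.g. "STOP WAL LOCATION: 0/20000A0 (file 0000...0002)"
STOP_LINE = re.compile(
    r"^STOP WAL LOCATION: ([0-9A-Fa-f]+/[0-9A-Fa-f]+)", re.MULTILINE
)


def stop_wal_location(history_text: str) -> Optional[str]:
    """Return the backup-end location from history file text, if present."""
    match = STOP_LINE.search(history_text)
    return match.group(1) if match else None


# Made-up sample contents, for illustration only.
SAMPLE = """\
START WAL LOCATION: 0/2000020 (file 000000010000000000000002)
STOP WAL LOCATION: 0/20000A0 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/2000058
LABEL: example
"""
```

Calling stop_wal_location(SAMPLE) returns "0/20000A0"; a file without the
line yields None, in which case recovery would have to fall back to the
backup-end record.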

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Jun Ishiduka <ishizuka(dot)jun(at)po(dot)ntts(dot)co(dot)jp>
To: heikki(dot)linnakangas(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Online base backup from the hot-standby
Date: 2011-06-03 01:58:21
Message-ID: 201106030158.p531w2SU008799@ccmds32.silk.ntts.co.jp

>>> I don't much like that approach. The standby would need to be able to
>>> write the backup history file to the archive at the end of backup, and
>>> we'd have to reintroduce the code to fetch it from archive and, when
>>> streaming, from the master. At the moment, the archiver doesn't even run
>>> in the standby.
>>
>> Please teach the reason for "The standby would need to be able to write
>> the backup history file to the archive at the end of backup" .
>> (I'd like to know why "to only pg_xlog" is wrong .)
>
> If the backup history file is not archived, the postgres process won't
> find it when you try to restore from the base backup. The new server has
> no access to the standby's pg_xlog directory.

Thanks for the answer.
But when pg_basebackup is executed against the standby server, it sends
the backup history file to pg_xlog on the new server (= the backup
server), and I was going to write the patch with that logic.
So I think the situation described above would not occur.

>> Because there is the opinion of "Cascade replication" , I don't want to
>> realize the function with the method which the standby requests to execute
>> it on the primary server .
>>
>> (The opinion of "Cascade replication":
>> http://archives.postgresql.org/pgsql-hackers/2011-05/msg01150.php)
>
>I don't see how this helps.

Hypothesis:
* Online base backup is implemented in a way where the standby
requests the "primary server" to execute it.
* "Cascade replication" has been developed, and users are using it.
(Ex. Primary -- Standby1 -- Standby2)

Situation:
(1) Standby2 executes pg_basebackup.
(2) Then Standby2 accesses Standby1.
(3) But it fails, because Standby2's primary is Standby1, not
Primary.

Result:
* I don't want to implement this feature in a way where the
standby requests the primary server to execute it.
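
The scenario above can be modelled with a tiny sketch. The server names
come from the example; the dictionary and the function names are
hypothetical and only illustrate that a cascaded standby's upstream is
not a real primary.

```python
from typing import Dict, Optional

# Each server maps to the server it replicates from (None = real primary).
UPSTREAM: Dict[str, Optional[str]] = {
    "Primary": None,
    "Standby1": "Primary",
    "Standby2": "Standby1",
}


def is_primary(server: str) -> bool:
    """A server with no upstream is the real primary."""
    return UPSTREAM[server] is None


def can_delegate_backup(standby: str) -> bool:
    """True only if the server this standby connects to is the real primary."""
    upstream = UPSTREAM[standby]
    return upstream is not None and is_primary(upstream)
```

Here can_delegate_backup("Standby1") is True, but
can_delegate_backup("Standby2") is False, which is the failure in step (3).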
--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka(dot)jun(at)po(dot)ntts(dot)co(dot)jp
--------------------------------------------