Improve shutdown during online backup

From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: <pgsql-patches(at)postgresql(dot)org>
Subject: Improve shutdown during online backup
Date: 2008-04-01 13:34:17
Message-ID: D960CB61B694CF459DCFB4B0128514C201ED284B@exadv11.host.magwien.gv.at
Lists: pgsql-hackers pgsql-patches

This follows up on the discussion in
http://archives.postgresql.org/pgsql-hackers/2008-03/msg01033.php

- pg_ctl will refuse a smart shutdown during online backup.
- The postmaster will also refuse to shut down in smart mode
  in that case and log a message to that effect.
- In fast shutdown mode, the server will rename "backup_label"
  after successfully shutting down and log the fact.

Yours,
Laurenz Albe

Attachment               Content-Type               Size
backup-shut.doc.patch    application/octet-stream   4.2 KB
backup-shut.patch        application/octet-stream   3.9 KB

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Improve shutdown during online backup
Date: 2008-04-01 16:42:21
Message-ID: 1207068141.4238.51.camel@ebony.site
Lists: pgsql-hackers pgsql-patches

On Tue, 2008-04-01 at 15:34 +0200, Albe Laurenz wrote:
> This follows up on the discussion in
> http://archives.postgresql.org/pgsql-hackers/2008-03/msg01033.php
>
> - pg_ctl will refuse a smart shutdown during online backup.
> - The postmaster will also refuse to shutdown in smart mode
> in that case and log a message to that effect.
> - In fast shutdown mode, the server will rename "backup_label"
> after successfully shutting down and log the fact.

Looks good.

A few comments:

* Smart shutdown waits for sessions to complete, yet this just ignores
  smart shutdowns, which is a little different. I think we
  should wait for the backup to complete and then shut down.

* When we say "online backup cancelled" I think we should say something
  more like "online backup mode cancelled". All we are doing is removing
  the backup label file; we're not actually cancelling the physical backup,
  since it is external to the database anyway.

* The #defines at the top of postmaster.c are duplicated from xlog.c.
  If we can't agree on a common header file then we should at least add a
  comment to mention they are duplicated (in both locations).

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Improve shutdown during online backup
Date: 2008-04-01 19:03:18
Message-ID: 1207076598.4238.60.camel@ebony.site
Lists: pgsql-hackers pgsql-patches

On Tue, 2008-04-01 at 17:42 +0100, Simon Riggs wrote:

> Few comments:
>
> * smart shutdown waits for sessions to complete, yet this just ignores
> smart shutdowns which is something a little different. I think we
> should wait for the backup to complete and then shutdown.

> * The #defines at top of postmaster.c are duplicated from xlog.c
> If we can't agree on a common header file then we should at least add a
> comment to mention they are duplicated (in both locations).

If we add a function called something like BackupInProgress() to xlog.c,
exported via miscadmin.h, then we can use it within the
PostmasterStateMachine() function like this:

    if (pmState == PM_WAIT_BACKENDS)
    {
        if (CountChildren() == 0 &&
            StartupPID == 0 &&
            (BgWriterPID == 0 || !FatalError) &&
            WalWriterPID == 0 &&
            AutoVacPID == 0 &&
            !BackupInProgress())    <---- new line

so that the postmaster doesn't need to know about how we do backups.

That way you don't need any of the special cases in your patch, nor is
there any need to duplicate the #defines.
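
A minimal sketch of what such a function could look like, assuming that
online backup mode is signalled purely by the existence of the backup label
file. The name backup_in_progress and the label_path parameter are
illustrative and are not taken from the patch:

```c
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical stand-in for the proposed BackupInProgress().
 * Reports whether a backup label file exists at label_path.
 * Passing the path as a parameter is an assumption made for illustration.
 */
bool
backup_in_progress(const char *label_path)
{
    struct stat stat_buf;

    /* stat() returns 0 only if the file exists and can be examined */
    return stat(label_path, &stat_buf) == 0;
}
```

With a check like this, the caller only asks "is a backup in progress?" and
never needs to know the file name or how backup mode is recorded.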

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com



From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Simon Riggs *EXTERN*" <simon(at)2ndquadrant(dot)com>
Cc: <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Improve shutdown during online backup
Date: 2008-04-02 07:11:59
Message-ID: D960CB61B694CF459DCFB4B0128514C201F3E8CC@exadv11.host.magwien.gv.at
Lists: pgsql-hackers pgsql-patches

Simon Riggs wrote:
>> Few comments:
>>
>> * smart shutdown waits for sessions to complete, yet this just ignores
>> smart shutdowns which is something a little different. I think we
>> should wait for the backup to complete and then shutdown.

That would be more consistent, I agree.

I'll undo my changes to pg_ctl as well, since they will no longer make sense then.

>> * The #defines at top of postmaster.c are duplicated from xlog.c
>> If we can't agree on a common header file then we should at least add a
>> comment to mention they are duplicated (in both locations).
>
> If we add a function called something like BackupInProgress() to xlog.c,
> exported via miscadmin.h then we can use it within the
> PostmasterStateMachine() function like this
>
>     if (pmState == PM_WAIT_BACKENDS)
>     {
>         if (CountChildren() == 0 &&
>             StartupPID == 0 &&
>             (BgWriterPID == 0 || !FatalError) &&
>             WalWriterPID == 0 &&
>             AutoVacPID == 0 &&
>             !BackupInProgress())    <---- new line
>
> so that the postmaster doesn't need to know about how we do backups.
>
> That way you don't need any of the special cases in your patch, nor is
> there any need to duplicate the #defines.

I realized that duplicating the #defines was ugly, and will do it
like that.

Thanks for the hints.

Yours,
Laurenz Albe


From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Simon Riggs *EXTERN*" <simon(at)2ndquadrant(dot)com>
Cc: <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Improve shutdown during online backup
Date: 2008-04-07 10:32:12
Message-ID: D960CB61B694CF459DCFB4B0128514C201F3F693@exadv11.host.magwien.gv.at
Lists: pgsql-hackers pgsql-patches

Simon Riggs wrote:
>> Few comments:
>>
>> * smart shutdown waits for sessions to complete, yet this just ignores
>> smart shutdowns which is something a little different. I think we
>> should wait for the backup to complete and then shutdown.
>
> If we add a function called something like BackupInProgress() to xlog.c,
> exported via miscadmin.h then we can use it within the
> PostmasterStateMachine() function like this
>
>     if (pmState == PM_WAIT_BACKENDS)
>     {
>         if (CountChildren() == 0 &&
>             StartupPID == 0 &&
>             (BgWriterPID == 0 || !FatalError) &&
>             WalWriterPID == 0 &&
>             AutoVacPID == 0 &&
>             !BackupInProgress())    <---- new line
>
> so that the postmaster doesn't need to know about how we do backups.
>
> That way you don't need any of the special cases in your patch, nor is
> there any need to duplicate the #defines.

I looked at that, and it won't work, for these reasons:

PostmasterStateMachine() is called once after a smart shutdown.
If there are children or a backup is in progress, pmState will remain
PM_WAIT_BACKENDS.

Now whenever a child exits, the reaper() will be called, which in turn
calls PostmasterStateMachine() again and advances pmState if appropriate.
This won't work for backups though, because removal of backup_label will
not send a SIGCHLD to the postmaster.

Moreover, if Shutdown == SmartShutdown, new connections won't be accepted,
and nobody can connect and call pg_stop_backup().
So even if I added a check for
(pmState == PM_WAIT_BACKENDS) && !BackupInProgress() somewhere in
ServerLoop(), it wouldn't do much good, because the only way for somebody
to cancel online backup mode would be to manually remove the file.
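
The first problem described above can be modelled in a few lines. This is a
simplified sketch, not postmaster.c: the struct, the function names, and the
boolean that stands in for the existence of backup_label are all assumptions.
It illustrates that a state machine which is re-evaluated only when a child
exits stays in PM_WAIT_BACKENDS after the backup ends, until something calls
it again:

```c
#include <stdbool.h>

/* Simplified model; the state names mirror the thread, nothing else does. */
typedef enum { PM_RUN, PM_WAIT_BACKENDS, PM_SHUTDOWN } PMState;

typedef struct
{
    PMState state;
    int     children;       /* live backend processes */
    bool    backup_active;  /* stands in for "backup_label exists" */
} Postmaster;

/* Re-evaluate the state. It can only advance when somebody calls this. */
void
state_machine(Postmaster *pm)
{
    if (pm->state == PM_WAIT_BACKENDS &&
        pm->children == 0 &&
        !pm->backup_active)
        pm->state = PM_SHUTDOWN;
}

/* Smart shutdown request: enter the waiting state and evaluate once. */
void
smart_shutdown(Postmaster *pm)
{
    if (pm->state == PM_RUN)
        pm->state = PM_WAIT_BACKENDS;
    state_machine(pm);
}

/* A child exit is an event (SIGCHLD in the real server) and re-evaluates. */
void
child_exited(Postmaster *pm)
{
    if (pm->children > 0)
        pm->children--;
    state_machine(pm);
}

/* Ending the backup changes the condition but raises no event. */
void
backup_ended(Postmaster *pm)
{
    pm->backup_active = false;
}
```

In this model, calling backup_ended() after the last child_exited() leaves
the state unchanged; only an additional state_machine() call, for example
from a periodic check, moves it to PM_SHUTDOWN.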

So the only reasonable thing to do on smart shutdown during an online
backup is to have the shutdown request fail, right? The only alternative
would be for a smart shutdown request to interrupt online backup mode.

So - unless you point out a flaw in my reasoning - I'll implement it
that way, but will put all code that handles backup_label files into
xlog.c.

Yours,
Laurenz Albe


From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Improve shutdown during online backup
Date: 2008-04-07 14:30:29
Message-ID: 47FA3005.5070108@enterprisedb.com
Lists: pgsql-hackers pgsql-patches

Albe Laurenz wrote:
> Moreover, if Shutdown == SmartShutdown, new connections won't be accepted,
> and nobody can connect and call pg_stop_backup().
> So even if I'd add a check for
> (pmState == PM_WAIT_BACKENDS) && !BackupInProgress() somewhere in the
> ServerLoop(), it wouldn't do much good, because the only way for somebody
> to cancel online backup mode would be to manually remove the file.

Good point.

> So the only reasonable thing to do on smart shutdown during an online
> backup is to have the shutdown request fail, right? The only alternative being
> that a smart shutdown request should interrupt online backup mode.

Or we can add another state, PM_WAIT_BACKUP, before PM_WAIT_BACKENDS,
that allows new connections, and waits until the backup ends.
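
A rough sketch of how such an extra state could sit in the shutdown
sequence. The enum, the function names, and the idea of calling next_phase()
periodically are assumptions made for illustration; PHASE_WAIT_BACKUP models
the proposed PM_WAIT_BACKUP and PHASE_WAIT_BACKENDS models PM_WAIT_BACKENDS:

```c
#include <stdbool.h>

typedef enum
{
    PHASE_RUN,
    PHASE_WAIT_BACKUP,      /* models the proposed PM_WAIT_BACKUP */
    PHASE_WAIT_BACKENDS,    /* models PM_WAIT_BACKENDS */
    PHASE_SHUTDOWN
} ShutdownPhase;

/* New connections are allowed while running or waiting for the backup. */
bool
phase_accepts_connections(ShutdownPhase phase)
{
    return phase == PHASE_RUN || phase == PHASE_WAIT_BACKUP;
}

/*
 * Compute the next phase from the current conditions. Meant to be called
 * repeatedly, since the end of a backup does not raise an event by itself.
 */
ShutdownPhase
next_phase(ShutdownPhase phase, bool shutdown_requested,
           bool backup_active, int children)
{
    switch (phase)
    {
        case PHASE_RUN:
            if (!shutdown_requested)
                return PHASE_RUN;
            return backup_active ? PHASE_WAIT_BACKUP : PHASE_WAIT_BACKENDS;
        case PHASE_WAIT_BACKUP:
            return backup_active ? PHASE_WAIT_BACKUP : PHASE_WAIT_BACKENDS;
        case PHASE_WAIT_BACKENDS:
            return children == 0 ? PHASE_SHUTDOWN : PHASE_WAIT_BACKENDS;
        case PHASE_SHUTDOWN:
            break;
    }
    return PHASE_SHUTDOWN;
}
```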

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Heikki Linnakangas *EXTERN*" <heikki(at)enterprisedb(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, <pgsql-patches(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve shutdown during online backup
Date: 2008-04-08 07:16:33
Message-ID: D960CB61B694CF459DCFB4B0128514C201FA55C5@exadv11.host.magwien.gv.at
Lists: pgsql-hackers pgsql-patches

[What should happen if a smart shutdown request is received during online backup mode?
I'll cc: the hackers list; maybe others have something to say about this.]

Heikki Linnakangas wrote:
> Albe Laurenz wrote:
>> Moreover, if Shutdown == SmartShutdown, new connections won't be accepted,
>> and nobody can connect and call pg_stop_backup().
>> So even if I'd add a check for
>> (pmState == PM_WAIT_BACKENDS) && !BackupInProgress() somewhere in the
>> ServerLoop(), it wouldn't do much good, because the only way for somebody
>> to cancel online backup mode would be to manually remove the file.
>
> Good point.
>
>> So the only reasonable thing to do on smart shutdown during an online
>> backup is to have the shutdown request fail, right? The only alternative being
>> that a smart shutdown request should interrupt online backup mode.
>
> Or we can add another state, PM_WAIT_BACKUP, before PM_WAIT_BACKENDS,
> that allows new connections, and waits until the backup ends.

That's an option. Maybe it is possible to restrict connections to superusers
(who are the only ones who can call pg_stop_backup() anyway).

Or, we could allow superuser connections in state PM_WAIT_BACKENDS...
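
The second variant could be sketched as an admission rule like the one
below. Everything here is hypothetical: the enum, the function name, and
above all the assumption that the caller already knows whether the
connecting user is a superuser at the point where the decision is made.

```c
#include <stdbool.h>

typedef enum
{
    ADM_RUN,            /* normal operation */
    ADM_WAIT_BACKENDS   /* models PM_WAIT_BACKENDS after a smart shutdown */
} AdmState;

/* Decide whether a new connection is admitted in the given state. */
bool
admit_connection(AdmState state, bool is_superuser)
{
    switch (state)
    {
        case ADM_RUN:
            return true;            /* everybody may connect */
        case ADM_WAIT_BACKENDS:
            return is_superuser;    /* only superusers, e.g. to call
                                     * pg_stop_backup() */
    }
    return false;
}
```

The first variant would apply the same superuser test in a separate
PM_WAIT_BACKUP state and reject all new connections in PM_WAIT_BACKENDS.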

Opinions?

Yours,
Laurenz Albe


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: Heikki Linnakangas *EXTERN* <heikki(at)enterprisedb(dot)com>, pgsql-patches(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improve shutdown during online backup
Date: 2008-04-16 10:07:09
Message-ID: 1208340429.4259.65.camel@ebony.site
Lists: pgsql-hackers pgsql-patches

On Tue, 2008-04-08 at 09:16 +0200, Albe Laurenz wrote:

> Heikki Linnakangas wrote:
> > Albe Laurenz wrote:
> >> Moreover, if Shutdown == SmartShutdown, new connections won't be accepted,
> >> and nobody can connect and call pg_stop_backup().
> >> So even if I'd add a check for
> >> (pmState == PM_WAIT_BACKENDS) && !BackupInProgress() somewhere in the
> >> ServerLoop(), it wouldn't do much good, because the only way for somebody
> >> to cancel online backup mode would be to manually remove the file.
> >
> > Good point.
> >
> >> So the only reasonable thing to do on smart shutdown during an online
> >> backup is to have the shutdown request fail, right? The only alternative being
> >> that a smart shutdown request should interrupt online backup mode.
> >
> > Or we can add another state, PM_WAIT_BACKUP, before PM_WAIT_BACKENDS,
> > that allows new connections, and waits until the backup ends.
>
> That's an option. Maybe it is possible to restrict connections to superusers
> (who are the only ones who can call pg_stop_backup() anyway).
>
> Or, we could allow superuser connections in state PM_WAIT_BACKENDS...

That sounds right.

Completely unrelated to backups, if you issue a smart shutdown and the
server doesn't shut down, you would probably like to connect and see what
is happening and why. The reason may not be a backup-in-progress.

Personally, I think "smart" shutdown could be even smarter. It should
kick off unwanted sessions, such as an idle pgAdmin session - maybe a
rule like "anything that has been idle for >30 seconds".

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>, "Heikki Linnakangas *EXTERN*" <heikki(at)enterprisedb(dot)com>, <pgsql-patches(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve shutdown during online backup
Date: 2008-04-16 13:09:38
Message-ID: 87mynuvtd9.fsf@oxford.xeocode.com
Lists: pgsql-hackers pgsql-patches

"Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:

> Personally, I think "smart" shutdown could be even smarter. It should
> kick off unwanted sessions, such as an idle pgAdmin session - maybe a
> rule like "anything that has been idle for >30 seconds".

That's not a bad idea in itself but I don't think it's something the server
should be in the business of doing. One big reason is that the server
shouldn't be imposing arbitrary policy. That should be something the person
running the shutdown is in control of.

What you could do is have a separate program (I would write a client but a
server-side function would work too) to kick off users based on various
criteria you can specify.

Then you can put two commands in your backup scripts: one to kick off idle
users and one to do a smart shutdown.
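
The selection step of such a tool might look like the sketch below. The
Session struct and the function name are made up for illustration; how the
session list is obtained and how the selected sessions are actually
disconnected is left open here.

```c
#include <stddef.h>

/* Hypothetical view of one client session. */
typedef struct
{
    int  pid;            /* backend process id */
    long idle_seconds;   /* time since the session last did any work */
} Session;

/*
 * Copy the pids of all sessions idle for longer than threshold_seconds
 * into out (at most out_cap entries) and return how many were written.
 * The threshold is supplied by whoever runs the tool, not by the server.
 */
size_t
select_idle_sessions(const Session *sessions, size_t n,
                     long threshold_seconds,
                     int *out, size_t out_cap)
{
    size_t count = 0;

    for (size_t i = 0; i < n && count < out_cap; i++)
    {
        if (sessions[i].idle_seconds > threshold_seconds)
            out[count++] = sessions[i].pid;
    }
    return count;
}
```

Keeping the threshold as a parameter leaves the policy with the person
running the shutdown, which is the point made above.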

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!