pgsql: Start background writer during archive recovery.

Lists: pgsql-committerspgsql-hackers
From: heikki(at)postgresql(dot)org (Heikki Linnakangas)
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Start background writer during archive recovery.
Date: 2009-02-18 15:58:41
Message-ID: 20090218155841.7EFD37559ED@cvs.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Log Message:
-----------
Start background writer during archive recovery. Background writer now performs
its usual buffer cleaning duties during archive recovery, and it's responsible
for performing restartpoints.

This requires some changes in postmaster. When the startup process has done
all the initialization and is ready to start WAL redo, it signals the
postmaster to launch the background writer. The postmaster is signaled again
when the point in recovery is reached where we know that the database is in
consistent state. Postmaster isn't interested in that at the moment, but
that's the point where we could let other backends in to perform read-only
queries. The postmaster is signaled third time when the recovery has ended,
so that postmaster knows that it's safe to start accepting connections.

The startup process now traps SIGTERM, and performs a "clean" shutdown. If
you do a fast shutdown during recovery, a shutdown restartpoint is performed,
like a shutdown checkpoint, and postmaster kills the processes cleanly. You
still have to continue the recovery at next startup, though.

Currently, the background writer is only launched during archive recovery.
We could launch it during crash recovery as well, but it seems better to keep
that codepath as simple as possible, for the sake of robustness. And it
couldn't do any restartpoints during crash recovery anyway, so it wouldn't be
that useful.

log_restartpoints is gone. Use log_checkpoints instead. This is yet to be
documented.

This whole operation is a pre-requisite for Hot Standby, but has some value of
its own whether the hot standby patch makes 8.4 or not.

Simon Riggs, with lots of modifications by me.

Modified Files:
--------------
pgsql/src/backend/access/transam:
xlog.c (r1.330 -> r1.331)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/transam/xlog.c?r1=1.330&r2=1.331)
pgsql/src/backend/bootstrap:
bootstrap.c (r1.249 -> r1.250)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/bootstrap/bootstrap.c?r1=1.249&r2=1.250)
pgsql/src/backend/postmaster:
bgwriter.c (r1.55 -> r1.56)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/postmaster/bgwriter.c?r1=1.55&r2=1.56)
postmaster.c (r1.570 -> r1.571)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/postmaster/postmaster.c?r1=1.570&r2=1.571)
pgsql/src/backend/storage/buffer:
README (r1.15 -> r1.16)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/storage/buffer/README?r1=1.15&r2=1.16)
pgsql/src/backend/utils/init:
postinit.c (r1.187 -> r1.188)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/init/postinit.c?r1=1.187&r2=1.188)
pgsql/src/include/access:
xlog.h (r1.90 -> r1.91)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/access/xlog.h?r1=1.90&r2=1.91)
pgsql/src/include/storage:
pmsignal.h (r1.21 -> r1.22)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/storage/pmsignal.h?r1=1.21&r2=1.22)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-18 20:43:18
Message-ID: 25524.1234989798@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

heikki(at)postgresql(dot)org (Heikki Linnakangas) writes:
> Log Message:
> -----------
> Start background writer during archive recovery.

Might that have anything to do with this?

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=dungbeetle&dt=2009-02-18%2019:44:01

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-18 21:28:55
Message-ID: 1234992535.3973.147.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers


On Wed, 2009-02-18 at 15:43 -0500, Tom Lane wrote:
> heikki(at)postgresql(dot)org (Heikki Linnakangas) writes:
> > Log Message:
> > -----------
> > Start background writer during archive recovery.
>
> Might that have anything to do with this?
>
> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=dungbeetle&dt=2009-02-18%2019:44:01

Hmmm, looks very probable. But not anything that jumps out quickly at
me. Will continue to check.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-18 22:16:26
Message-ID: 1234995386.3973.156.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers


On Wed, 2009-02-18 at 21:28 +0000, Simon Riggs wrote:
> On Wed, 2009-02-18 at 15:43 -0500, Tom Lane wrote:
> > heikki(at)postgresql(dot)org (Heikki Linnakangas) writes:
> > > Log Message:
> > > -----------
> > > Start background writer during archive recovery.
> >
> > Might that have anything to do with this?
> >
> > http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=dungbeetle&dt=2009-02-18%2019:44:01
>
> Hmmm, looks very probable. But not anything that jumps out quickly at
> me. Will continue to check.

Finger points that way, but still can't see any specific reason for
that.

More likely to be an uncommon race condition, rather than a error
specific to dungbeetle. If startup process death is slow, this could
happen, though hasn't occurred in other tests.

Given the shape of the patch, the likely fix is to bump
NUM_AUXILIARY_PROCS by one.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-18 23:33:12
Message-ID: 27582.1234999992@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>>> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=dungbeetle&dt=2009-02-18%2019:44:01

> More likely to be an uncommon race condition, rather than a error
> specific to dungbeetle.

I agree, that's what it looks like, especially since I couldn't
duplicate it on Fedora 9 x86_64 which is presumably fairly close
to what dungbeetle is running.

I tried to duplicate it by putting the box under extra load, but
no luck.

regards, tom lane


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-19 07:35:14
Message-ID: 499D0BB2.50604@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Simon Riggs wrote:
> More likely to be an uncommon race condition, rather than a error
> specific to dungbeetle. If startup process death is slow, this could
> happen, though hasn't occurred in other tests.

True, the startup process can live for a short while concurrently with
bgwriter, walwriter and autovacuum launcher, before it exits.

> Given the shape of the patch, the likely fix is to bump
> NUM_AUXILIARY_PROCS by one.

Not sure what you mean by the shape of the patch, but agreed.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-19 15:00:30
Message-ID: 7682.1235055630@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Simon Riggs wrote:
>> More likely to be an uncommon race condition, rather than a error
>> specific to dungbeetle. If startup process death is slow, this could
>> happen, though hasn't occurred in other tests.

> True, the startup process can live for a short while concurrently with
> bgwriter, walwriter and autovacuum launcher, before it exits.

Maybe the postmaster should wait for startup process exit before
deciding to open for business, instead of just a signal.

regards, tom lane


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-19 16:20:22
Message-ID: 499D86C6.4010905@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Tom Lane wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> Simon Riggs wrote:
>>> More likely to be an uncommon race condition, rather than a error
>>> specific to dungbeetle. If startup process death is slow, this could
>>> happen, though hasn't occurred in other tests.
>
>> True, the startup process can live for a short while concurrently with
>> bgwriter, walwriter and autovacuum launcher, before it exits.
>
> Maybe the postmaster should wait for startup process exit before
> deciding to open for business, instead of just a signal.

Not a bad idea. Although, there's nothing wrong with the current way
either. The startup process does a proc_exit(0) right after sending the
signal ATM, so there's no real work left at that point.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-19 16:22:53
Message-ID: 8929.1235060573@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Tom Lane wrote:
>> Maybe the postmaster should wait for startup process exit before
>> deciding to open for business, instead of just a signal.

> Not a bad idea. Although, there's nothing wrong with the current way
> either. The startup process does a proc_exit(0) right after sending the
> signal ATM, so there's no real work left at that point.

The thing wrong with it is assuming that nothing interesting will happen
during proc_exit(). We hang enough stuff on on_proc_exit hooks that
that seems like a pretty shaky assumption.

regards, tom lane


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-19 20:28:31
Message-ID: 499DC0EF.9040603@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Tom Lane wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> Tom Lane wrote:
>>> Maybe the postmaster should wait for startup process exit before
>>> deciding to open for business, instead of just a signal.
>
>> Not a bad idea. Although, there's nothing wrong with the current way
>> either. The startup process does a proc_exit(0) right after sending the
>> signal ATM, so there's no real work left at that point.
>
> The thing wrong with it is assuming that nothing interesting will happen
> during proc_exit(). We hang enough stuff on on_proc_exit hooks that
> that seems like a pretty shaky assumption.

I can't get too worried, given that proc_exit() is a very well-beaten
code path. Admittedly not so much for an auxiliary process, but that's
just a dumbed down version of what happens with a full-blown backend.

However I started looking into that idea anyway, and figured that it
does simplify the logic in postmaster.c quite a bit, so I think it's
worth doing on those grounds alone. Attached is a patch against CVS HEAD
and also against a snapshot before the recovery infra patch, for easier
reading. I'll give that some more testing and commit if I find no issues.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
wait-for-startup-to-die-1.patch text/x-diff 15.4 KB
wait-for-startup-to-die-1-against-old-code.patch text/x-diff 13.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-19 20:39:56
Message-ID: 25555.1235075996@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Tom Lane wrote:
>> The thing wrong with it is assuming that nothing interesting will happen
>> during proc_exit(). We hang enough stuff on on_proc_exit hooks that
>> that seems like a pretty shaky assumption.

> I can't get too worried, given that proc_exit() is a very well-beaten
> code path. Admittedly not so much for an auxiliary process, but that's
> just a dumbed down version of what happens with a full-blown backend.

Well, you're assuming that no future patch or add-on module will put
anything into an on_proc_exit hook that might interact with other
processes. It might be fine now but I don't think it's very robust.

> However I started looking into that idea anyway, and figured that it
> does simplify the logic in postmaster.c quite a bit, so I think it's
> worth doing on those grounds alone.

Couldn't you get rid of PMSIGNAL_RECOVERY_COMPLETED altogether? If the
startup process exits with code 0, recovery is complete, else there
was trouble. I find this SetPostmasterSignal bit quite ugly anyway.

regards, tom lane


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-20 10:55:26
Message-ID: 499E8C1E.8050602@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Tom Lane wrote:
> Couldn't you get rid of PMSIGNAL_RECOVERY_COMPLETED altogether? If the
> startup process exits with code 0, recovery is complete, else there
> was trouble. I find this SetPostmasterSignal bit quite ugly anyway.

Right now, the startup process exits with code 0 also when it's told to
exit with SIGTERM, ie. fast shutdown request, and the recovery-completed
signal is used to differentiate those cases. But yeah, we can use
another exit code for that. I'll look into that approach.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-23 09:31:52
Message-ID: 49A26D08.1010405@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Heikki Linnakangas wrote:
> Tom Lane wrote:
>> Couldn't you get rid of PMSIGNAL_RECOVERY_COMPLETED altogether? If the
>> startup process exits with code 0, recovery is complete, else there
>> was trouble. I find this SetPostmasterSignal bit quite ugly anyway.
>
> Right now, the startup process exits with code 0 also when it's told to
> exit with SIGTERM, ie. fast shutdown request, and the recovery-completed
> signal is used to differentiate those cases. But yeah, we can use
> another exit code for that. I'll look into that approach.

Committed that.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Start background writer during archive recovery.
Date: 2009-02-25 22:52:48
Message-ID: 3f0b79eb0902251452t38d8d4c0scd3d5092c391bcfa@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hi,

I have two questions about the stats collector during recovery.

1)
Why does the stats collector need to wait for consistent recovery
mode? The activity statistics which bgwriter may send before
reaching the mode should be ignored?

2)
Why doesn't ServerLoop() try to restart the stats collector
when it's not in progress?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center