Hot Standby: Startup at shutdown checkpoint

Lists: pgsql-hackers
From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Hot Standby: Startup at shutdown checkpoint
Date: 2010-04-06 09:22:03
Message-ID: 1270545724.24910.5691.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Initial patch. I will be testing over next day. No commit before at
least midday on Wed 7 Apr.

The existing call to PrescanPreparedTransactions() looks correct to me
but the comment is wrong. I will change that also, if we agree.

--
Simon Riggs www.2ndQuadrant.com

Attachment Content-Type Size
startup_at_shutdown_checkpoint.patch text/x-patch 5.7 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Subject: Re: Hot Standby: Startup at shutdown checkpoint
Date: 2010-04-08 10:16:12
Message-ID: 1270721772.24910.6941.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2010-04-06 at 10:22 +0100, Simon Riggs wrote:

> Initial patch. I will be testing over next day. No commit before at
> least midday on Wed 7 Apr.

Various previous discussions sidelined a very important point: what
exactly does it mean to "start recovery from a shutdown checkpoint"?

If standby_mode is enabled and there is no source of WAL, then we get a
stream of messages saying

LOG: record with zero length at 0/C000088
...

but most importantly we never get to the main recovery loop, so Hot
Standby never gets to start at all. We can't keep retrying the request
for WAL and at the same time enter the retry loop, executing lots of
things that expect non-NULL pointers using a NULL xlog pointer.

What we are asking for here is a completely new state: the database is
not "in recovery" - by definition there is nothing at all to recover.

The following patch adds "Snapshot Mode", a very simple variation on the
existing code - emphasis on the "simple":

LOG: entering snapshot mode
LOG: record with zero length at 0/C000088
LOG: consistent recovery state reached at 0/C000088
LOG: database system is ready to accept read only connections

this mode does *not* continually check to see if new WAL files have been
added. Startup just sits and waits, backends allowed. If a trigger file
is specified, then we can leave recovery. Otherwise Startup process just
sits doing nothing.

There's possibly an argument for inventing some more special modes where
we do allow read only connections but don't start the bgwriter. I don't
personally wish to do this at this stage of the release cycle. The
attached patch is non-invasive and safe and I want to leave it at that.

I will be committing later today, unless major objections, but I ask you
to read the patch before you sharpen your pen. It's simple.

--
Simon Riggs www.2ndQuadrant.com

Attachment Content-Type Size
snapshot_mode.patch text/x-patch 8.2 KB

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Subject: Re: Hot Standby: Startup at shutdown checkpoint
Date: 2010-04-08 10:33:31
Message-ID: 4BBDB0FB.7030607@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> On Tue, 2010-04-06 at 10:22 +0100, Simon Riggs wrote:
>
>> Initial patch. I will be testing over next day. No commit before at
>> least midday on Wed 7 Apr.
>
> Various previous discussions sidelined a very important point: what
> exactly does it mean to "start recovery from a shutdown checkpoint"?

Hot standby should be possible as soon we know that the database is
consistent. That is, as soon as we've replayed WAL past the
minRecoveryPoint/backupStartPoint point indicated in pg_control.

> If standby_mode is enabled and there is no source of WAL, then we get a
> stream of messages saying
>
> LOG: record with zero length at 0/C000088
> ...
>
> but most importantly we never get to the main recovery loop, so Hot
> Standby never gets to start at all. We can't keep retrying the request
> for WAL and at the same time enter the retry loop, executing lots of
> things that expect non-NULL pointers using a NULL xlog pointer.

You mean it can't find even the checkpoint record to start replaying? I
think the behavior in that scenario is fine as it is. The database isn't
consistent (or at least we can't know if it is, because we don't know
the redo pointer) until you read and replay the first checkpoint record.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Hot Standby: Startup at shutdown checkpoint
Date: 2010-04-08 11:32:14
Message-ID: 1270726334.8305.26.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 2010-04-08 at 13:33 +0300, Heikki Linnakangas wrote:

> > If standby_mode is enabled and there is no source of WAL, then we get a
> > stream of messages saying
> >
> > LOG: record with zero length at 0/C000088
> > ...
> >
> > but most importantly we never get to the main recovery loop, so Hot
> > Standby never gets to start at all. We can't keep retrying the request
> > for WAL and at the same time enter the retry loop, executing lots of
> > things that expect non-NULL pointers using a NULL xlog pointer.
>
> You mean it can't find even the checkpoint record to start replaying?

Clearly I don't mean that. Otherwise it wouldn't be "start from a
shutdown checkpoint". I think you are misunderstanding me.

Let me explain in more detail though please also read the patch before
replying, if you do.

The patch I submitted at top of this thread works for allowing Hot
Standby during recovery. Yes, of course that occurs when the database is
consistent. The trick is to get recovery to the point where it can be
enabled. The second patch on this thread presents a way to get the
database to that point; it touches some of the other recovery code that
you and Masao have worked on. We *must* touch that code if we are to
enable Hot Standby in the way you desire.

In StartupXlog() when we get to the point where we "Find the first
record that logically follows the checkpoint", in the current code
ReadRecord() loops forever, spitting out
LOG: record with zero length at 0/C000088
...

That prevents us from going further down StartupXLog() to the point
where we start the InRedo loop and hence start hot standby. As long as
we retry we cannot progress further: this is the main problem.

So in the patch, I have modified the retry test in ReadRecord() so it no
longer retries iff there is no WAL source defined. Now, when
ReadRecord() exits, record == NULL at that point and so we do not (and
cannot) enter the redo loop.

So I have introduced the new mode ("snapshot mode") to enter hot standby
anyway. That avoids us having to screw around with the loop logic for
redo. I don't see any need to support the case of where we have no WAL
source defined, yet we want Hot Standby but we also want to allow
somebody to drop a WAL file into pg_xlog at some future point. That has
no use case of value AFAICS and is too complex to add at this stage of
the release cycle.

--
Simon Riggs www.2ndQuadrant.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Subject: Re: Hot Standby: Startup at shutdown checkpoint
Date: 2010-04-08 13:49:19
Message-ID: p2t603c8f071004080649m972635bbr86c08be2441741f6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Apr 8, 2010 at 6:16 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> If standby_mode is enabled and there is no source of WAL, then we get a
> stream of messages saying
>
> LOG:  record with zero length at 0/C000088
> ...
>
> but most importantly we never get to the main recovery loop, so Hot
> Standby never gets to start at all. We can't keep retrying the request
> for WAL and at the same time enter the retry loop, executing lots of
> things that expect non-NULL pointers using a NULL xlog pointer.

This is pretty much a corner case, so I don't think it's a good idea
to add a new mode to handle it. It also seems like it would be pretty
inconsistent if we allow WAL to be dropped in pg_xlog, but only if we
are also doing archive recovery or streaming replication. If we can't
support this case with the same code path we use otherwise, I think we
should revert to disallowing it.

Having said that, I guess I don't understand how having a source of
WAL solves the problem described above. Do we always have to read at
least 1 byte of WAL from either SR or the archive before starting up?
If not, why do we need to do so here?

...Robert


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Hot Standby: Startup at shutdown checkpoint
Date: 2010-04-08 15:35:23
Message-ID: 4BBDF7BB.10000@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> In StartupXlog() when we get to the point where we "Find the first
> record that logically follows the checkpoint", in the current code
> ReadRecord() loops forever, spitting out
> LOG: record with zero length at 0/C000088
> ...
>
> That prevents us from going further down StartupXLog() to the point
> where we start the InRedo loop and hence start hot standby. As long as
> we retry we cannot progress further: this is the main problem.
>
> So in the patch, I have modified the retry test in ReadRecord() so it no
> longer retries iff there is no WAL source defined. Now, when
> ReadRecord() exits, record == NULL at that point and so we do not (and
> cannot) enter the redo loop.

Oh, I see.

> So I have introduced the new mode ("snapshot mode") to enter hot standby
> anyway. That avoids us having to screw around with the loop logic for
> redo. I don't see any need to support the case of where we have no WAL
> source defined, yet we want Hot Standby but we also want to allow
> somebody to drop a WAL file into pg_xlog at some future point. That has
> no use case of value AFAICS and is too complex to add at this stage of
> the release cycle.

You don't need a new mode for that. Just do the same "are we consistent
now?" check you do in the loop once before calling ReadRecord to fetch
the record that follows the checkpoint pointer. Attached is a patch to
show what I mean. We just need to let postmaster know that recovery has
started a bit earlier, right after processing the checkpoint record, not
delaying it until we've read the first record after it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
hotstandby-from-shutdown-checkpoint-1.patch text/x-diff 12.8 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Hot Standby: Startup at shutdown checkpoint
Date: 2010-04-08 15:58:20
Message-ID: 1270742300.8305.384.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 2010-04-08 at 18:35 +0300, Heikki Linnakangas wrote:
>
> > So I have introduced the new mode ("snapshot mode") to enter hot
> standby
> > anyway. That avoids us having to screw around with the loop logic
> for
> > redo. I don't see any need to support the case of where we have no
> WAL
> > source defined, yet we want Hot Standby but we also want to allow
> > somebody to drop a WAL file into pg_xlog at some future point. That
> has
> > no use case of value AFAICS and is too complex to add at this stage
> of
> > the release cycle.
>
> You don't need a new mode for that. Just do the same "are we
> consistent now?" check you do in the loop once before calling
> ReadRecord to fetch the record that follows the checkpoint pointer.
> Attached is a patch to show what I mean. We just need to let
> postmaster know that recovery has started a bit earlier, right after
> processing the checkpoint record, not delaying it until we've read the
> first record after it.

OK, that seems better. I'm happy with that instead.

Have you tested this? Is it ready to commit?

--
Simon Riggs www.2ndQuadrant.com


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Hot Standby: Startup at shutdown checkpoint
Date: 2010-04-08 16:02:04
Message-ID: 4BBDFDFC.7070307@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> OK, that seems better. I'm happy with that instead.
>
> Have you tested this? Is it ready to commit?

Only very briefly. I think the code is ready, but please review and test
to see I didn't miss anything.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Hot Standby: Startup at shutdown checkpoint
Date: 2010-04-08 16:20:49
Message-ID: 1270743649.8305.465.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 2010-04-08 at 19:02 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > OK, that seems better. I'm happy with that instead.
> >
> > Have you tested this? Is it ready to commit?
>
> Only very briefly. I think the code is ready, but please review and test
> to see I didn't miss anything.

I'm going to need you to commit this. I'm on holiday now until 14 April,
so its not going to get a retest before then otherwise; its not smart to
commit and then go on holiday, IIRC.

I've reviewed your changes and they look correct to me; the main chunk
of code is mine and that was tested by me.

--
Simon Riggs www.2ndQuadrant.com


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Hot Standby: Startup at shutdown checkpoint
Date: 2010-04-13 14:18:54
Message-ID: 4BC47D4E.1020704@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> On Thu, 2010-04-08 at 19:02 +0300, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> OK, that seems better. I'm happy with that instead.
>>>
>>> Have you tested this? Is it ready to commit?
>> Only very briefly. I think the code is ready, but please review and test
>> to see I didn't miss anything.
>
> I'm going to need you to commit this. I'm on holiday now until 14 April,
> so its not going to get a retest before then otherwise; its not smart to
> commit and then go on holiday, IIRC.

:-)

> I've reviewed your changes and they look correct to me; the main chunk
> of code is mine and that was tested by me.

Ok, committed after fixing an obsoleted comment & other small
editorialization.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Hot Standby: Startup at shutdown checkpoint
Date: 2010-04-14 10:29:27
Message-ID: 1271240967.8305.1171.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2010-04-13 at 17:18 +0300, Heikki Linnakangas wrote:
> I've reviewed your changes and they look correct to me; the main chunk
> > of code is mine and that was tested by me.
>
> Ok, committed after fixing an obsoleted comment & other small
> editorialization.

Looks good, thanks.

--
Simon Riggs www.2ndQuadrant.com