Quick Links

Re: pending patch: Re: HS/SR and smart shutdown

Lists:	pgsql-hackers

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	pending patch: Re: HS/SR and smart shutdown
Date:	2010-03-30 09:09:42
Message-ID:	3f0b79eb1003300209j7e10e233qee942e879b721200@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, Feb 1, 2010 at 11:49 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Sat, Jan 30, 2010 at 12:54 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> HOWEVER, I do believe this is an issue we could live with for 9.0 if
>>> it's going to lead to a whole lot of additional debugging of SR. But if
>>> it's an easy fix, it'll avoid a lot of complaints on pgsql-general.
>>
>> I think that the latter statement is right.
>
> Though we've not reached consensus on smart shutdown during
> recovery yet, I wrote the patch that changes its behavior:
> shut down the server (including the startup process and the
> walreceiver) as soon as all read-only connections have died.
> The code is also available in the 'replication' branch in
> my git repository.
>
> And, let's discuss whether something like the attached patch
> is required for v9.0 or not.

I rebased the patch to HEAD. Is the patch still required for 9.0?
If not, I'd remove the open item of the smart shutdown during
recovery.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment	Content-Type	Size
new_smart_shutdown_20100330.patch	application/octet-stream	2.1 KB

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-03-31 00:47:28
Message-ID:	603c8f071003301747m30dd8af3l1d3e21b82ae267c8@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Mar 30, 2010 at 5:09 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> I rebased the patch to HEAD. Is the patch still required for 9.0?
> If not, I'd remove the open item of the smart shutdown during
> recovery.

I am by no means an expert on this area of the code, but in the
interest of moving things along I reviewed this patch tonight.

1. I wonder if there is a problem if we receive SIGINT while in the
PM_WAIT_READONLY state? Seems to me that might need to be added to
the if statement beginning at line 2212, in pmdie().

2. It appears to me that HandleChildCrash() needs to switch to
PM_WAIT_BACKENDS state if it's in PM_WAIT_READONLY when the child
crash occurs - i.e. the if statement beginning at line 2772 needs
updating.

Thoughts?

...Robert

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-03-31 01:48:10
Message-ID:	3f0b79eb1003301848x11856977ie331dba8351b553d@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Mar 31, 2010 at 9:47 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Mar 30, 2010 at 5:09 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> I rebased the patch to HEAD. Is the patch still required for 9.0?
>> If not, I'd remove the open item of the smart shutdown during
>> recovery.
>
> I am by no means an expert on this area of the code, but in the
> interest of moving things along I reviewed this patch tonight.

Thanks a lot!

> 1. I wonder if there is a problem if we receive SIGINT while in the
> PM_WAIT_READONLY state? Seems to me that might need to be added to
> the if statement beginning at line 2212, in pmdie().
>
> 2. It appears to me that HandleChildCrash() needs to switch to
> PM_WAIT_BACKENDS state if it's in PM_WAIT_READONLY when the child
> crash occurs - i.e. the if statement beginning at line 2772 needs
> updating.
>
> Thoughts?

Oh, those are my oversights. You are right!
I modified the patch as you pointed out.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment	Content-Type	Size
new_smart_shutdown_20100331.patch	application/octet-stream	2.6 KB

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-03-31 03:10:18
Message-ID:	p2q603c8f071003302010hb8e72d53rd8eb763ac9735912@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Mar 30, 2010 at 9:48 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Wed, Mar 31, 2010 at 9:47 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Tue, Mar 30, 2010 at 5:09 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> I rebased the patch to HEAD. Is the patch still required for 9.0?
>>> If not, I'd remove the open item of the smart shutdown during
>>> recovery.
>>
>> I am by no means an expert on this area of the code, but in the
>> interest of moving things along I reviewed this patch tonight.
>
> Thanks a lot!
>
>> 1. I wonder if there is a problem if we receive SIGINT while in the
>> PM_WAIT_READONLY state? Seems to me that might need to be added to
>> the if statement beginning at line 2212, in pmdie().
>>
>> 2. It appears to me that HandleChildCrash() needs to switch to
>> PM_WAIT_BACKENDS state if it's in PM_WAIT_READONLY when the child
>> crash occurs - i.e. the if statement beginning at line 2772 needs
>> updating.
>>
>> Thoughts?
>
> Oh, those are my oversights. You are right!
> I modified the patch as you pointed out.

Cool. This looks good to me now and I also tested it. I will commit
it unless (a) someone objects or (b) someone else does it first.

...Robert

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-03-31 08:00:38
Message-ID:	1270022438.3684.7552.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-03-31 at 10:48 +0900, Fujii Masao wrote:
> On Wed, Mar 31, 2010 at 9:47 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > On Tue, Mar 30, 2010 at 5:09 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> >> I rebased the patch to HEAD. Is the patch still required for 9.0?
> >> If not, I'd remove the open item of the smart shutdown during
> >> recovery.
> >
> > I am by no means an expert on this area of the code, but in the
> > interest of moving things along I reviewed this patch tonight.
>
> Thanks a lot!
>
> > 1. I wonder if there is a problem if we receive SIGINT while in the
> > PM_WAIT_READONLY state? Seems to me that might need to be added to
> > the if statement beginning at line 2212, in pmdie().
> >
> > 2. It appears to me that HandleChildCrash() needs to switch to
> > PM_WAIT_BACKENDS state if it's in PM_WAIT_READONLY when the child
> > crash occurs - i.e. the if statement beginning at line 2772 needs
> > updating.
> >
> > Thoughts?
>
> Oh, those are my oversights. You are right!
> I modified the patch as you pointed out.

Please add some docs that a) explains what the patch does and b) notes
any changes from behaviour in previous releases. ISTM this is a major
change in behaviour.

The reason for the lack of consensus on this issue is simply people
aren't following what you're talking about and don't want to have to
re-read a whole thread to do so. It might be that many would agree.

>From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
"backends might be waiting for the WAL record that conflicts with their
queries to be replayed". Recovery sometimes waits for backends, but
backends never wait for recovery.

--
Simon Riggs www.2ndQuadrant.com

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-03-31 08:48:45
Message-ID:	s2g3f0b79eb1003310148g41eefdf9g2fe173d065029c78@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Mar 31, 2010 at 5:00 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Please add some docs that a) explains what the patch does and b) notes
> any changes from behaviour in previous releases. ISTM this is a major
> change in behaviour.

How about adding the following description into "17.5. Shutting Down
the Server"?

If the server is in recovery, it waits for all of the read only
connections to be closed.

And, where should the note be put in? AFAIK, the previous behavior
has not been documented anywhere.

> >From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
> "backends might be waiting for the WAL record that conflicts with their
> queries to be replayed". Recovery sometimes waits for backends, but
> backends never wait for recovery.

Really? As Heikki explained before, backends might wait for the lock
taken by the startup process.
http://archives.postgresql.org/pgsql-hackers/2010-01/msg02984.php

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-03-31 09:02:47
Message-ID:	1270026167.3684.7671.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-03-31 at 17:48 +0900, Fujii Masao wrote:
> On Wed, Mar 31, 2010 at 5:00 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > Please add some docs that a) explains what the patch does and b) notes
> > any changes from behaviour in previous releases. ISTM this is a major
> > change in behaviour.
>
> How about adding the following description into "17.5. Shutting Down
> the Server"?
>
> If the server is in recovery, it waits for all of the read only
> connections to be closed.

You need to explain which mode you're talking about. Adding minimal
comments isn't my objective. We need good, useful documentation on every
aspect that you add or change.

> And, where should the note be put in? AFAIK, the previous behavior
> has not been documented anywhere.

> > >From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
> > "backends might be waiting for the WAL record that conflicts with their
> > queries to be replayed". Recovery sometimes waits for backends, but
> > backends never wait for recovery.
>
> Really? As Heikki explained before, backends might wait for the lock
> taken by the startup process.
> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02984.php

Backends wait for locks, yes, but they could be waiting for user locks
also. That is not "waiting for the WAL record", that concept does not
exist.

--
Simon Riggs www.2ndQuadrant.com

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-03-31 09:58:57
Message-ID:	s2v3f0b79eb1003310258waa1bdf87n7de2f903a2a4bd37@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Mar 31, 2010 at 6:02 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Wed, 2010-03-31 at 17:48 +0900, Fujii Masao wrote:
>> On Wed, Mar 31, 2010 at 5:00 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> > Please add some docs that a) explains what the patch does and b) notes
>> > any changes from behaviour in previous releases. ISTM this is a major
>> > change in behaviour.
>>
>> How about adding the following description into "17.5. Shutting Down
>> the Server"?
>>
>> If the server is in recovery, it waits for all of the read only
>> connections to be closed.
>
> You need to explain which mode you're talking about.

Smart Shutdown mode

> Adding minimal
> comments isn't my objective. We need good, useful documentation on every
> aspect that you add or change.

But the patch doesn't provide anything beyond:

>> If the server is in recovery, it waits for all of the read only
>> connections to be closed.

What additional explanation do you think is required?

>> And, where should the note be put in? AFAIK, the previous behavior
>> has not been documented anywhere.
>
>> > >From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
>> > "backends might be waiting for the WAL record that conflicts with their
>> > queries to be replayed". Recovery sometimes waits for backends, but
>> > backends never wait for recovery.
>>
>> Really? As Heikki explained before, backends might wait for the lock
>> taken by the startup process.
>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02984.php
>
> Backends wait for locks, yes, but they could be waiting for user locks
> also. That is not "waiting for the WAL record", that concept does not
> exist.

How about the following change?

- * waiting for the WAL record that conflicts with their queries to be
- * replayed, recovery and replication need to remain until all read
+ * waiting until the startup process has released the lock that
+ * their queries are waiting for by replaying the WAL record,
+ * recovery and replication need to remain until all read

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-03-31 13:29:31
Message-ID:	y2n603c8f071003310629x6b66731fv1a86fcb39d694cd2@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Mar 31, 2010 at 4:00 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Please add some docs that a) explains what the patch does and b) notes
> any changes from behaviour in previous releases. ISTM this is a major
> change in behaviour.

I guess I see this a little bit differently. If you do a smart
shutdown on 8.4, the autovacuum launcher won't prevent s smart
shutdown from completing successfully. Neither will the background
writer. If they did, we'd consider that a bug and fix it.
walreceiver is just one more system process that needs to get properly
shut down when a smart shutdown is requested. So I don't think this
is a major behavior change - I think it's preserving the behavior
we've had all along.

The current documentation reads:

In stop mode, the server that is running in the specified data
directory is shut down. Three different shutdown methods can be
selected with the -m option: "Smart" mode waits for online backup mode
to finish and all the clients to disconnect. This is the default.
"Fast" mode does not wait for clients to disconnect and will terminate
an online backup in progress. All active transactions are rolled back
and clients are forcibly disconnected, then the server is shut down.
"Immediate" mode will abort all server processes without a clean
shutdown. This will lead to a recovery run on restart.

That all still seems accurate after this patch. I'm not even sure
what to add. I suppose we could add a sentence like

If a smart shutdown is requested while the server is in recovery,
recovery will stop and the server will shut down.

...but if we add that then why don't we have a similar sentence that says:

If a smart shutdown is requested while the autovacuum launcher is
running, the autovacuum launcher will be stopped and the server will
shut down.

I just don't see that we're adding any additional clarity here. I
think what would require documentation is if we DIDN'T apply this
patch. Then we'd need something like:

Smart shutdown mode should not be used if streaming replication is in
use. The server will begin to shut down but, because the streaming
replication process is not automatically shut down, it will never
actually finish shutting down unless the streaming replication process
crashes. If a server using streaming replication is accidentally shut
down using smart mode, the problem can be corrected by shutting down
again using fast or immediate mode.

...Robert

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-03-31 15:16:21
Message-ID:	n2n603c8f071003310816zd23e9515nbefe9cc8dc36967d@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Mar 31, 2010 at 5:02 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> > >From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
>> > "backends might be waiting for the WAL record that conflicts with their
>> > queries to be replayed". Recovery sometimes waits for backends, but
>> > backends never wait for recovery.
>>
>> Really? As Heikki explained before, backends might wait for the lock
>> taken by the startup process.
>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02984.php
>
> Backends wait for locks, yes, but they could be waiting for user locks
> also. That is not "waiting for the WAL record", that concept does not
> exist.

Hmm... this is a good point, on two levels. First, the comment is not
as well-phrased as it could be. Second, I wonder why we can't kill
the startup process and WAL receiver right away, and then wait for the
backends to die off afterwards.

...Robert

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-04-01 08:42:13
Message-ID:	u2q3f0b79eb1004010142q66792e64w201f5bc73ea54195@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Apr 1, 2010 at 12:16 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Mar 31, 2010 at 5:02 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> > >From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
>>> > "backends might be waiting for the WAL record that conflicts with their
>>> > queries to be replayed". Recovery sometimes waits for backends, but
>>> > backends never wait for recovery.
>>>
>>> Really? As Heikki explained before, backends might wait for the lock
>>> taken by the startup process.
>>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02984.php
>>
>> Backends wait for locks, yes, but they could be waiting for user locks
>> also. That is not "waiting for the WAL record", that concept does not
>> exist.
>
> Hmm... this is a good point, on two levels. First, the comment is not
> as well-phrased as it could be. Second, I wonder why we can't kill
> the startup process and WAL receiver right away, and then wait for the
> backends to die off afterwards.

I tested whether killing the startup process and walreceiver releases
the lock which the backends are waiting for. Unfortunately it doesn't,
and the backends have gotten stuck in my box. The behavior which the
startup process shuts down without releasing the lock is a bug?

BTW, I tested that by compiling postgres with the attached patch and
doing the following operations.

1. Make the SR environment
2. Issue some SQLs to the primary

psql -h <primary server>
=# CREATE TABLE t(i int);
=# BEGIN;
=# DROP TABLE t;
=# SELECT pg_switch_xlog();
(keep this session alive)

3. Issue some SQLs to the standby

psql -h <standby server>
=# BEGIN;
=# SELECT * FROM t; --> waiting

4. Perform smart shutdown on the standby
Then the startup process and walreceiver shut down, but the
session created in #3 is still waiting.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment	Content-Type	Size
shutdown_test.patch	application/octet-stream	528 bytes

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-04-01 10:48:12
Message-ID:	l2m603c8f071004010348mf259f3e5yc7510de075f0a929@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Apr 1, 2010 at 4:42 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Thu, Apr 1, 2010 at 12:16 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Wed, Mar 31, 2010 at 5:02 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>>> > >From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
>>>> > "backends might be waiting for the WAL record that conflicts with their
>>>> > queries to be replayed". Recovery sometimes waits for backends, but
>>>> > backends never wait for recovery.
>>>>
>>>> Really? As Heikki explained before, backends might wait for the lock
>>>> taken by the startup process.
>>>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02984.php
>>>
>>> Backends wait for locks, yes, but they could be waiting for user locks
>>> also. That is not "waiting for the WAL record", that concept does not
>>> exist.
>>
>> Hmm... this is a good point, on two levels. First, the comment is not
>> as well-phrased as it could be. Second, I wonder why we can't kill
>> the startup process and WAL receiver right away, and then wait for the
>> backends to die off afterwards.
>
> I tested whether killing the startup process and walreceiver releases
> the lock which the backends are waiting for. Unfortunately it doesn't,
> and the backends have gotten stuck in my box. The behavior which the
> startup process shuts down without releasing the lock is a bug?

I think that what this shows is that the original design of Hot
Standby didn't contemplate ever having Hot Standby up without the
startup process running. In retrospect, maybe we want to allow that,
because a smart shutdown would be more likely to complete in a timely
fashion if we stopped replication first and then waited for the
backends to die rather than waiting for the backends to die first and
then stopping replication. That's because, for so long as replication
continues, it may take new locks as well as releasing old ones, to say
nothing of using other system resources like CPU and I/O bandwidth.
But, for 9.0, I'm not sure we have any real choice, unless making the
startup process release locks when it goes away is a very simple
change. Assuming that's not the case, I think we should apply this
patch with some updates to the comments, document how it works and
that it may change in a future release, and add a TODO for 9.1.

Thoughts?

...Robert

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-04-01 11:18:12
Message-ID:	1270120692.5640.1237.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, 2010-04-01 at 06:48 -0400, Robert Haas wrote:
> On Thu, Apr 1, 2010 at 4:42 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> > On Thu, Apr 1, 2010 at 12:16 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> On Wed, Mar 31, 2010 at 5:02 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >>>> > >From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
> >>>> > "backends might be waiting for the WAL record that conflicts with their
> >>>> > queries to be replayed". Recovery sometimes waits for backends, but
> >>>> > backends never wait for recovery.
> >>>>
> >>>> Really? As Heikki explained before, backends might wait for the lock
> >>>> taken by the startup process.
> >>>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02984.php
> >>>
> >>> Backends wait for locks, yes, but they could be waiting for user locks
> >>> also. That is not "waiting for the WAL record", that concept does not
> >>> exist.
> >>
> >> Hmm... this is a good point, on two levels. First, the comment is not
> >> as well-phrased as it could be. Second, I wonder why we can't kill
> >> the startup process and WAL receiver right away, and then wait for the
> >> backends to die off afterwards.
> >
> > I tested whether killing the startup process and walreceiver releases
> > the lock which the backends are waiting for. Unfortunately it doesn't,
> > and the backends have gotten stuck in my box. The behavior which the
> > startup process shuts down without releasing the lock is a bug?
>
> I think that what this shows is that the original design of Hot
> Standby didn't contemplate ever having Hot Standby up without the
> startup process running. In retrospect, maybe we want to allow that,
> because a smart shutdown would be more likely to complete in a timely
> fashion if we stopped replication first and then waited for the
> backends to die rather than waiting for the backends to die first and
> then stopping replication. That's because, for so long as replication
> continues, it may take new locks as well as releasing old ones, to say
> nothing of using other system resources like CPU and I/O bandwidth.
> But, for 9.0, I'm not sure we have any real choice, unless making the
> startup process release locks when it goes away is a very simple
> change. Assuming that's not the case, I think we should apply this
> patch with some updates to the comments, document how it works and
> that it may change in a future release, and add a TODO for 9.1.

I'm not willing to investigate this further myself at this stage. This
looks like risk for little benefit.

--
Simon Riggs www.2ndQuadrant.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-04-01 11:24:55
Message-ID:	x2s603c8f071004010424xb3787751ub16b95e7588a29af@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Apr 1, 2010 at 7:18 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> I'm not willing to investigate this further myself at this stage. This
> looks like risk for little benefit.

That's kind of what I figured. I'll see about fixing up Fujii-san's
patch and documenting the behavior; but it won't happen before the
weekend because I'm going to be out of town.

...Robert

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-04-13 13:18:25
Message-ID:	m2g3f0b79eb1004130618n230e0b63j9fa20d939a205a1e@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Apr 1, 2010 at 8:24 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Apr 1, 2010 at 7:18 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> I'm not willing to investigate this further myself at this stage. This
>> looks like risk for little benefit.
>
> That's kind of what I figured. I'll see about fixing up Fujii-san's
> patch and documenting the behavior; but it won't happen before the
> weekend because I'm going to be out of town.

I found the bug which makes smart shutdown get stuck for a while:

If there is no WAL file available in the standby, walreceiver might
be invoked before we have reached the PM_RECOVERY state. We switch
to the PM_RECOVERY state after reading the checkpoint record pointed
out in the pg_control file. If it's not found, we are in the PM_INIT
or PM_START state and start walreceiver to read it from the primary.

If smart shutdown is requested at that point, we cannot switch to
the PM_WAIT_READONLY state because pmdie() doesn't allow that. So
the SIGTERM is never sent to walreceiver, and smart shutdown would
get stuck.

If the current state is either PM_INIT or PM_START, it's guaranteed
that there is no regular backend, so we should kill walreceiver as
soon as smart shutdown is requested, I think. The attached patch
does that.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment	Content-Type	Size
smart_shutdown_bugfix_v1.patch	application/octet-stream	893 bytes

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-04-13 13:27:57
Message-ID:	y2y603c8f071004130627o56b9c836x8254ff0c0a59f579@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Apr 13, 2010 at 9:18 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Thu, Apr 1, 2010 at 8:24 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Thu, Apr 1, 2010 at 7:18 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> I'm not willing to investigate this further myself at this stage. This
>>> looks like risk for little benefit.
>>
>> That's kind of what I figured. I'll see about fixing up Fujii-san's
>> patch and documenting the behavior; but it won't happen before the
>> weekend because I'm going to be out of town.
>
> I found the bug which makes smart shutdown get stuck for a while:
>
> If there is no WAL file available in the standby, walreceiver might
> be invoked before we have reached the PM_RECOVERY state. We switch
> to the PM_RECOVERY state after reading the checkpoint record pointed
> out in the pg_control file. If it's not found, we are in the PM_INIT
> or PM_START state and start walreceiver to read it from the primary.
>
> If smart shutdown is requested at that point, we cannot switch to
> the PM_WAIT_READONLY state because pmdie() doesn't allow that. So
> the SIGTERM is never sent to walreceiver, and smart shutdown would
> get stuck.
>
> If the current state is either PM_INIT or PM_START, it's guaranteed
> that there is no regular backend, so we should kill walreceiver as
> soon as smart shutdown is requested, I think. The attached patch
> does that.

Can you explain how to recreate the problem that this patch fixes?

...Robert

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: pending patch: Re: HS/SR and smart shutdown
Date:	2010-04-13 14:16:27
Message-ID:	x2i3f0b79eb1004130716rc1c6719aj7c19366d605e1733@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Apr 13, 2010 at 10:27 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Can you explain how to recreate the problem that this patch fixes?

1. Configure and start the primary server.
2. Configure the standby server.
3. Remove all of the WAL files in pg_xlog of the standby.
4. Start the standby.
5. Request smart shutdown against the standby before walreceiver receives any
WAL records. You would need to emulate the time-consuming authentication
which usually requires the setting of authentication_timeout. I used the
attached patch for the emulation. New GUC parameter "wal_sender_sleep" which
the patch provides for the test specifies the sleep time during walsender's
handshake processing. If you set it to 10s, walsender sleeps 10 secs before
it sends the WAL records.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment	Content-Type	Size
test_smart_shutdown_bug_v1.patch	application/octet-stream	1.7 KB