Re: comment for "fast promote"

Lists: pgsql-hackers
From: Tomonari Katsumata <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: comment for "fast promote"
Date: 2013-07-25 08:33:38
Message-ID: 51F0E2E2.5090204@po.ntts.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Now I'm seeing xlog.c in 93_stable for studying "fast promote",
and I have a question.

I think it has an extra unlink command for "promote" file.
(on 9937 line)
-------
9934 if (stat(FAST_PROMOTE_SIGNAL_FILE, &stat_buf) == 0)
9935 {
9936 unlink(FAST_PROMOTE_SIGNAL_FILE);
9937 unlink(PROMOTE_SIGNAL_FILE);
9938 fast_promote = true;
9939 }
-------

Is this command necesary ?

regards,
------------------
NTT Software Corporation
Tomonari Katsumata


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Tomonari Katsumata <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: comment for "fast promote"
Date: 2013-07-25 12:15:22
Message-ID: CAHGQGwEBUvgcx8X+Z0Hh+VdwYcJ8KCuRuLt1jSsxeLxPcX=0_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jul 25, 2013 at 5:33 PM, Tomonari Katsumata
<katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp> wrote:
> Hi,
>
> Now I'm seeing xlog.c in 93_stable for studying "fast promote",
> and I have a question.
>
> I think it has an extra unlink command for "promote" file.
> (on 9937 line)
> -------
> 9934 if (stat(FAST_PROMOTE_SIGNAL_FILE, &stat_buf) == 0)
> 9935 {
> 9936 unlink(FAST_PROMOTE_SIGNAL_FILE);
> 9937 unlink(PROMOTE_SIGNAL_FILE);
> 9938 fast_promote = true;
> 9939 }
> -------
>
> Is this command necesary ?

Yes, it prevents PROMOTE_SIGNAL_FILE from remaining even if
both promote files exist.

One question is that: we really still need to support normal promote?
pg_ctl promote provides only way to do fast promotion. If we want to
do normal promotion, we need to create PROMOTE_SIGNAL_FILE
and send the SIGUSR1 signal to postmaster by hand. This seems messy.

I think that we should remove normal promotion at all, or change
pg_ctl promote so that provides also the way to do normal promotion.

Regards,

--
Fujii Masao


From: Tomonari Katsumata <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: comment for "fast promote"
Date: 2013-07-26 02:19:50
Message-ID: 51F1DCC6.6000101@po.ntts.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Fujii-san,

Thank you for response.

(2013/07/25 21:15), Fujii Masao wrote:
> On Thu, Jul 25, 2013 at 5:33 PM, Tomonari Katsumata
> <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp> wrote:
>> Hi,
>>
>> Now I'm seeing xlog.c in 93_stable for studying "fast promote",
>> and I have a question.
>>
>> I think it has an extra unlink command for "promote" file.
>> (on 9937 line)
>> -------
>> 9934 if (stat(FAST_PROMOTE_SIGNAL_FILE, &stat_buf) == 0)
>> 9935 {
>> 9936 unlink(FAST_PROMOTE_SIGNAL_FILE);
>> 9937 unlink(PROMOTE_SIGNAL_FILE);
>> 9938 fast_promote = true;
>> 9939 }
>> -------
>>
>> Is this command necesary ?
>
> Yes, it prevents PROMOTE_SIGNAL_FILE from remaining even if
> both promote files exist.
>
The command("unlink(PROMOTE_SIGNAL_FILE)") here is for
unusualy case.
Because the case is when done both procedures below.
- user create "promote" file on PGDATA
- user issue "pg_ctl promote"

I understand the reason.
But I think it's better to unlink(PROMOTE_SIGNAL_FILE) before
unlink(FAST_PROMOTE_SIGNAL_FILE).
Because FAST_PROMOTE_SIGNAL_FILE is definetly there but
PROMOTE_SIGNAL_FILE is sometimes there or not there.

And I have another question linking this behavior.
I think TriggerFile should be removed too.
This is corner-case but it will happen.
How do you think of it ?

> One question is that: we really still need to support normal promote?
> pg_ctl promote provides only way to do fast promotion. If we want to
> do normal promotion, we need to create PROMOTE_SIGNAL_FILE
> and send the SIGUSR1 signal to postmaster by hand. This seems messy.
>
> I think that we should remove normal promotion at all, or change
> pg_ctl promote so that provides also the way to do normal promotion.
>
I think he merit of "fast promote" is
- allowing quick connection by skipping checkpoint
and its demerit is
- taking little bit longer when crash-recovery

If it is seldom to happen its crash soon after promoting
and "fast promte" never breaks consistency of database cluster,
I think we don't need normal promotion.
(of course we need to put detail about promotion on document.)

regards,
--------
NTT Software Corporation
Tomonari Katsumata


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Tomonari Katsumata <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: comment for "fast promote"
Date: 2013-07-26 07:26:08
Message-ID: CAHGQGwFG2Z+B4Cr8=OtsDAti3sf_zK3Vr=sNzJEBOoBhGNMmpQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jul 26, 2013 at 11:19 AM, Tomonari Katsumata
<katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp> wrote:
> Hi Fujii-san,
>
> Thank you for response.
>
>
> (2013/07/25 21:15), Fujii Masao wrote:
>> On Thu, Jul 25, 2013 at 5:33 PM, Tomonari Katsumata
>> <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp> wrote:
>>> Hi,
>>>
>>> Now I'm seeing xlog.c in 93_stable for studying "fast promote",
>>> and I have a question.
>>>
>>> I think it has an extra unlink command for "promote" file.
>>> (on 9937 line)
>>> -------
>>> 9934 if (stat(FAST_PROMOTE_SIGNAL_FILE, &stat_buf) == 0)
>>> 9935 {
>>> 9936 unlink(FAST_PROMOTE_SIGNAL_FILE);
>>> 9937 unlink(PROMOTE_SIGNAL_FILE);
>>> 9938 fast_promote = true;
>>> 9939 }
>>> -------
>>>
>>> Is this command necesary ?
>>
>> Yes, it prevents PROMOTE_SIGNAL_FILE from remaining even if
>> both promote files exist.
>>
> The command("unlink(PROMOTE_SIGNAL_FILE)") here is for
> unusualy case.
> Because the case is when done both procedures below.
> - user create "promote" file on PGDATA
> - user issue "pg_ctl promote"
>
> I understand the reason.
> But I think it's better to unlink(PROMOTE_SIGNAL_FILE) before
> unlink(FAST_PROMOTE_SIGNAL_FILE).
> Because FAST_PROMOTE_SIGNAL_FILE is definetly there but
> PROMOTE_SIGNAL_FILE is sometimes there or not there.

I could not understand why that's better. Could you elaborate that?

> And I have another question linking this behavior.
> I think TriggerFile should be removed too.
> This is corner-case but it will happen.
> How do you think of it ?

I don't have strong opinion about that. I've never heard the complaint
about that current behavior so far.

>> One question is that: we really still need to support normal promote?
>> pg_ctl promote provides only way to do fast promotion. If we want to
>> do normal promotion, we need to create PROMOTE_SIGNAL_FILE
>> and send the SIGUSR1 signal to postmaster by hand. This seems messy.
>>
>> I think that we should remove normal promotion at all, or change
>> pg_ctl promote so that provides also the way to do normal promotion.
>>
> I think he merit of "fast promote" is
> - allowing quick connection by skipping checkpoint
> and its demerit is
> - taking little bit longer when crash-recovery
>
> If it is seldom to happen its crash soon after promoting
> and "fast promte" never breaks consistency of database cluster,
> I think we don't need normal promotion.

You can execute checkpoint after fast promotion for that.

Regards,

--
Fujii Masao


From: Tomonari Katsumata <t(dot)katsumata1122(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Tomonari Katsumata <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: comment for "fast promote"
Date: 2013-07-27 09:57:03
Message-ID: CAC55fYfRKMX_Yw3UnmUy5G5Pd=w9-Qs3CcnTEwQ_dj0Sa8WTxQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

>>> Yes, it prevents PROMOTE_SIGNAL_FILE from remaining even if
>>> both promote files exist.
>>>
>> The command("unlink(PROMOTE_SIGNAL_FILE)") here is for
>> unusualy case.
>> Because the case is when done both procedures below.
>> - user create "promote" file on PGDATA
>> - user issue "pg_ctl promote"
>>
>> I understand the reason.
>> But I think it's better to unlink(PROMOTE_SIGNAL_FILE) before
>> unlink(FAST_PROMOTE_SIGNAL_FILE).
>> Because FAST_PROMOTE_SIGNAL_FILE is definetly there but
>> PROMOTE_SIGNAL_FILE is sometimes there or not there.
>
> I could not understand why that's better. Could you elaborate that?
>
I'm sorry for less explanation.

I've thought that errno would be set ENOENT and
this may lead something wrong.
I checked this and I know it's not problem.

sorry for confusing you.

>> And I have another question linking this behavior.
>> I think TriggerFile should be removed too.
>> This is corner-case but it will happen.
>> How do you think of it ?
>
> I don't have strong opinion about that. I've never heard the complaint
> about that current behavior so far.
>
For example, please imagine the cascading replication environment and
using old master as a standby without copying the timeline history file
to new standby.

-------
1. replicating 3 servers(A,B,C)
A->B->C
("trigger_file = /tmp/trig" is set in recovery_recovery.conf on B and C.)

2. stop server A and promoting server B with "touch /tmp/trig;pg_ctl
promote"
B->C
(/tmp/trig file remains on server B)

4. stop server B and promoting server C with "pg_ctl promote"
C

5. making server B connect for standby of server C
C->B
---------

In step5 server B will promote as soon as it starts,
because "/tmp/trig" is stil there.

>>> One question is that: we really still need to support normal promote?
>>> pg_ctl promote provides only way to do fast promotion. If we want to
>>> do normal promotion, we need to create PROMOTE_SIGNAL_FILE
>>> and send the SIGUSR1 signal to postmaster by hand. This seems messy.
>>>
>>> I think that we should remove normal promotion at all, or change
>>> pg_ctl promote so that provides also the way to do normal promotion.
>>>
>> I think he merit of "fast promote" is
>> - allowing quick connection by skipping checkpoint
>> and its demerit is
>> - taking little bit longer when crash-recovery
>>
>> If it is seldom to happen its crash soon after promoting
>> and "fast promte" never breaks consistency of database cluster,
>> I think we don't need normal promotion.
>
> You can execute checkpoint after fast promotion for that.
>
OK.
Then I think we should do below things.
- removing normal promotion at all from source
- adding the know-how you suggest on document

Are there any objection?

regards,
-------------------
Tomonari Katsumata


From: Euler Taveira <euler(at)timbira(dot)com(dot)br>
To: Tomonari Katsumata <t(dot)katsumata1122(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Tomonari Katsumata <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: comment for "fast promote"
Date: 2013-07-27 20:17:08
Message-ID: 51F42AC4.4000201@timbira.com.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27-07-2013 06:57, Tomonari Katsumata wrote:
> 1. replicating 3 servers(A,B,C)
> A->B->C
> ("trigger_file = /tmp/trig" is set in recovery_recovery.conf on B and C.)
>
> 2. stop server A and promoting server B with "touch /tmp/trig;pg_ctl
> promote"
> B->C
> (/tmp/trig file remains on server B)
>
Why don't you setup recovery_end_command parameter? The trigger_file is
important in some (legacy) environments and that is using an external
tool to handle the service initialization.

It seems to me it is an opportunity to improve trigger_file description
(informing a way to cleanup the file created) than to suggest it is not
useful.

--
Euler Taveira Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Tomonari Katsumata <t(dot)katsumata1122(at)gmail(dot)com>
Cc: Tomonari Katsumata <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: comment for "fast promote"
Date: 2013-07-29 18:14:48
Message-ID: CAHGQGwERLA0vpd=jP_cnZA8sTsrC2eUHxX9606DttCgFMifkfw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Jul 27, 2013 at 6:57 PM, Tomonari Katsumata
<t(dot)katsumata1122(at)gmail(dot)com> wrote:
> Hi,
>
>
>>>> Yes, it prevents PROMOTE_SIGNAL_FILE from remaining even if
>>>> both promote files exist.
>>>>
>>> The command("unlink(PROMOTE_SIGNAL_FILE)") here is for
>>> unusualy case.
>>> Because the case is when done both procedures below.
>>> - user create "promote" file on PGDATA
>>> - user issue "pg_ctl promote"
>>>
>>> I understand the reason.
>>> But I think it's better to unlink(PROMOTE_SIGNAL_FILE) before
>>> unlink(FAST_PROMOTE_SIGNAL_FILE).
>>> Because FAST_PROMOTE_SIGNAL_FILE is definetly there but
>>> PROMOTE_SIGNAL_FILE is sometimes there or not there.
>>
>> I could not understand why that's better. Could you elaborate that?
>>
> I'm sorry for less explanation.
>
> I've thought that errno would be set ENOENT and
> this may lead something wrong.
> I checked this and I know it's not problem.
>
> sorry for confusing you.
>
>
>
>>> And I have another question linking this behavior.
>>> I think TriggerFile should be removed too.
>>> This is corner-case but it will happen.
>>> How do you think of it ?
>>
>> I don't have strong opinion about that. I've never heard the complaint
>> about that current behavior so far.
>>
> For example, please imagine the cascading replication environment and
> using old master as a standby without copying the timeline history file
> to new standby.
>
> -------
> 1. replicating 3 servers(A,B,C)
> A->B->C
> ("trigger_file = /tmp/trig" is set in recovery_recovery.conf on B and C.)
>
> 2. stop server A and promoting server B with "touch /tmp/trig;pg_ctl
> promote"

Why do you need to both create the trigger file and run pg_ctl promote?

Anyway, if the patch is useful for fail-safe and it doesn't break the current
behavior, I'd be happy to apply it. You are suggesting that we should remove
the trigger file in CheckForStandbyTrigger() even if pg_ctl promote is executed.
But there can be some cases where we can get out of the WAL replay loop,
for example, reach the recovery_target_xxx. So ISTM we should try to remove
both the trigger file and "promote" file at the end of recovery
instead. Thought?

> B->C
> (/tmp/trig file remains on server B)
>
> 4. stop server B and promoting server C with "pg_ctl promote"
> C
>
> 5. making server B connect for standby of server C
> C->B
> ---------
>
> In step5 server B will promote as soon as it starts,
> because "/tmp/trig" is stil there.
>
>
>
>>>> One question is that: we really still need to support normal promote?
>>>> pg_ctl promote provides only way to do fast promotion. If we want to
>>>> do normal promotion, we need to create PROMOTE_SIGNAL_FILE
>>>> and send the SIGUSR1 signal to postmaster by hand. This seems messy.
>>>>
>>>> I think that we should remove normal promotion at all, or change
>>>> pg_ctl promote so that provides also the way to do normal promotion.
>>>>
>>> I think he merit of "fast promote" is
>>> - allowing quick connection by skipping checkpoint
>>> and its demerit is
>>> - taking little bit longer when crash-recovery
>>>
>>> If it is seldom to happen its crash soon after promoting
>>> and "fast promte" never breaks consistency of database cluster,
>>> I think we don't need normal promotion.
>>
>> You can execute checkpoint after fast promotion for that.
>>
> OK.
> Then I think we should do below things.
> - removing normal promotion at all from source
> - adding the know-how you suggest on document

IMO either is necessary.

Regards,

--
Fujii Masao


From: Tomonari Katsumata <t(dot)katsumata1122(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Tomonari Katsumata <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: comment for "fast promote"
Date: 2013-08-03 07:31:49
Message-ID: CAC55fYcVkxRZ5Jt-U3p-+Wj1+OGoSBmzuME6c0XAOsdpHRsbZg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I made a patch for REL9_3_STABLE which gets rid of
old promote processing. please check it.
This patch make PostgreSQL do fast promoting(*) always.
(*) which means skipping long checkpoint before increasing
timeline.

And after this, I'll do make another patch for unlinking files which are
created by user as a trigger_file or "pg_ctl promote" command.

---------------
Tomonari Katsumata
2013/7/30 Fujii Masao <masao(dot)fujii(at)gmail(dot)com>

> On Sat, Jul 27, 2013 at 6:57 PM, Tomonari Katsumata
> <t(dot)katsumata1122(at)gmail(dot)com> wrote:
> > Hi,
> >
> >
> >>>> Yes, it prevents PROMOTE_SIGNAL_FILE from remaining even if
> >>>> both promote files exist.
> >>>>
> >>> The command("unlink(PROMOTE_SIGNAL_FILE)") here is for
> >>> unusualy case.
> >>> Because the case is when done both procedures below.
> >>> - user create "promote" file on PGDATA
> >>> - user issue "pg_ctl promote"
> >>>
> >>> I understand the reason.
> >>> But I think it's better to unlink(PROMOTE_SIGNAL_FILE) before
> >>> unlink(FAST_PROMOTE_SIGNAL_FILE).
> >>> Because FAST_PROMOTE_SIGNAL_FILE is definetly there but
> >>> PROMOTE_SIGNAL_FILE is sometimes there or not there.
> >>
> >> I could not understand why that's better. Could you elaborate that?
> >>
> > I'm sorry for less explanation.
> >
> > I've thought that errno would be set ENOENT and
> > this may lead something wrong.
> > I checked this and I know it's not problem.
> >
> > sorry for confusing you.
> >
> >
> >
> >>> And I have another question linking this behavior.
> >>> I think TriggerFile should be removed too.
> >>> This is corner-case but it will happen.
> >>> How do you think of it ?
> >>
> >> I don't have strong opinion about that. I've never heard the complaint
> >> about that current behavior so far.
> >>
> > For example, please imagine the cascading replication environment and
> > using old master as a standby without copying the timeline history file
> > to new standby.
> >
> > -------
> > 1. replicating 3 servers(A,B,C)
> > A->B->C
> > ("trigger_file = /tmp/trig" is set in recovery_recovery.conf on B and C.)
> >
> > 2. stop server A and promoting server B with "touch /tmp/trig;pg_ctl
> > promote"
>
> Why do you need to both create the trigger file and run pg_ctl promote?
>
> Anyway, if the patch is useful for fail-safe and it doesn't break the
> current
> behavior, I'd be happy to apply it. You are suggesting that we should
> remove
> the trigger file in CheckForStandbyTrigger() even if pg_ctl promote is
> executed.
> But there can be some cases where we can get out of the WAL replay loop,
> for example, reach the recovery_target_xxx. So ISTM we should try to remove
> both the trigger file and "promote" file at the end of recovery
> instead. Thought?
>
> > B->C
> > (/tmp/trig file remains on server B)
> >
> > 4. stop server B and promoting server C with "pg_ctl promote"
> > C
> >
> > 5. making server B connect for standby of server C
> > C->B
> > ---------
> >
> > In step5 server B will promote as soon as it starts,
> > because "/tmp/trig" is stil there.
> >
> >
> >
> >>>> One question is that: we really still need to support normal promote?
> >>>> pg_ctl promote provides only way to do fast promotion. If we want to
> >>>> do normal promotion, we need to create PROMOTE_SIGNAL_FILE
> >>>> and send the SIGUSR1 signal to postmaster by hand. This seems messy.
> >>>>
> >>>> I think that we should remove normal promotion at all, or change
> >>>> pg_ctl promote so that provides also the way to do normal promotion.
> >>>>
> >>> I think he merit of "fast promote" is
> >>> - allowing quick connection by skipping checkpoint
> >>> and its demerit is
> >>> - taking little bit longer when crash-recovery
> >>>
> >>> If it is seldom to happen its crash soon after promoting
> >>> and "fast promte" never breaks consistency of database cluster,
> >>> I think we don't need normal promotion.
> >>
> >> You can execute checkpoint after fast promotion for that.
> >>
> > OK.
> > Then I think we should do below things.
> > - removing normal promotion at all from source
> > - adding the know-how you suggest on document
>
> IMO either is necessary.
>
> Regards,
>
> --
> Fujii Masao
>

Attachment Content-Type Size
getting_rid_of_old_promote.patch application/octet-stream 6.1 KB

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Tomonari Katsumata <t(dot)katsumata1122(at)gmail(dot)com>
Cc: Tomonari Katsumata <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: comment for "fast promote"
Date: 2013-08-05 18:32:27
Message-ID: CAHGQGwECArGmcHiVUyD4-_2wRjUKAjyWMmU3LXgZYqP6Z+C3kQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Aug 3, 2013 at 4:31 PM, Tomonari Katsumata
<t(dot)katsumata1122(at)gmail(dot)com> wrote:
> Hi,
>
> I made a patch for REL9_3_STABLE which gets rid of
> old promote processing. please check it.
> This patch make PostgreSQL do fast promoting(*) always.
> (*) which means skipping long checkpoint before increasing
> timeline.

Thanks for the patch!

I fixed the bug that your patch accidentally makes archive recovery
skip end-of-recovery checkpoint, fixed some typos, refactored the
source code and posted the updated version of the patch in
http://www.postgresql.org/message-id/CAHGQGwGYkF+CvpOMdxaO=+aNAzc1Oo9O4LqWo50MxpvFj+0VOw@mail.gmail.com

Regards,

--
Fujii Masao