Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.

Lists: pgsql-committers pgsql-hackers
From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-01-29 00:07:13
Message-ID: E1TzyjJ-0007rb-VB@gemulon.postgresql.org
Lists: pgsql-committers pgsql-hackers

Fast promote mode skips checkpoint at end of recovery.
pg_ctl promote -m fast will skip the checkpoint at end of recovery so that we
can achieve very fast failover when the apply delay is low. Write new WAL record
XLOG_END_OF_RECOVERY to allow us to switch timeline correctly for downstream log
readers. If we skip synchronous end of recovery checkpoint we request a normal
spread checkpoint so that the window of re-recovery is low.

Simon Riggs and Kyotaro Horiguchi, with input from Fujii Masao.
Review by Heikki Linnakangas

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/fd4ced5230162b50a5c9d33b4bf9cfb1231aa62e

Modified Files
--------------
src/backend/access/rmgrdesc/xlogdesc.c |   10 ++
src/backend/access/transam/xlog.c      |  192 +++++++++++++++++++++++++++-----
src/bin/pg_ctl/pg_ctl.c                |   18 +++-
src/include/access/xlog_internal.h     |    6 +
src/include/catalog/pg_control.h       |    1 +
5 files changed, 195 insertions(+), 32 deletions(-)


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-01-29 11:31:36
Message-ID: 5107B318.70300@vmware.com
Lists: pgsql-committers pgsql-hackers

On 29.01.2013 02:07, Simon Riggs wrote:
> +        /*
> +         * If we've been explicitly promoted with fast option,
> +         * end of recovery without a checkpoint if possible.
> +         */
> +        if (fast_promote)
> +        {
> +            checkPointLoc = ControlFile->prevCheckPoint;
> +            record = ReadCheckpointRecord(xlogreader, checkPointLoc, 2, false);
> +            if (record != NULL)
> +            {
> +                checkpoint_wait = false;
> +                CreateEndOfRecoveryRecord();
> +            }
> +        }

If we must have this ReadCheckpointRecord check, it needs more than zero
comments. Also, if it ever fails for some reason, I'd like to have a big
fat warning in the log to caution that something went badly wrong.

Why does it insist that we still have not only the latest checkpoint,
but the previous one too? At recovery, we fall back to the previous
checkpoint if we can't access the latest one, but that's just a
desperate measure to try to recover something if things have gone badly
wrong. It's OK to not have the WAL containing the previous checkpoint
still around. In particular, right after restoring from a base backup,
e.g. with pg_basebackup -x, or with good old pg_start/stop_backup, the
WAL included with the backup won't stretch back to previous checkpoint.

- Heikki
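
The behaviour requested above can be illustrated with a minimal sketch. This is not the committed xlog.c code: `PromoteDecision`, `decide_promotion` and the `latest_checkpoint_readable` flag are hypothetical names invented for illustration. The sketch assumes that only the latest checkpoint is consulted and that a failed check is reported loudly instead of silently falling back.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical outcome of the end-of-recovery decision (not a PostgreSQL type). */
typedef enum
{
    PROMOTE_WITH_CHECKPOINT,    /* classic path: synchronous checkpoint first */
    PROMOTE_FAST                /* write end-of-recovery record, checkpoint later */
} PromoteDecision;

/*
 * Decide how to end recovery.  Only the latest checkpoint is consulted;
 * the previous one is deliberately ignored, because it can legitimately
 * be absent right after restoring from a base backup.
 */
PromoteDecision
decide_promotion(bool fast_promote, bool latest_checkpoint_readable)
{
    if (!fast_promote)
        return PROMOTE_WITH_CHECKPOINT;

    if (!latest_checkpoint_readable)
    {
        /* The user asked for fast promotion and is not getting it: say so. */
        fprintf(stderr,
                "WARNING: latest checkpoint record not readable, "
                "falling back to end-of-recovery checkpoint\n");
        return PROMOTE_WITH_CHECKPOINT;
    }

    return PROMOTE_FAST;
}
```

A caller would pass the result of whatever check it performs on the latest checkpoint record. The point of the sketch is only that the fallback path produces a log line and that no second checkpoint is required.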


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-01-29 11:46:19
Message-ID: CA+U5nML+O6fhZSNbL7YZ9wb+OHALRpZS7rRZCchu9ZdVdZ--Fg@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 29 January 2013 11:31, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
> On 29.01.2013 02:07, Simon Riggs wrote:
>>
>> +        /*
>> +         * If we've been explicitly promoted with fast option,
>> +         * end of recovery without a checkpoint if possible.
>> +         */
>> +        if (fast_promote)
>> +        {
>> +            checkPointLoc = ControlFile->prevCheckPoint;
>> +            record = ReadCheckpointRecord(xlogreader, checkPointLoc, 2, false);
>> +            if (record != NULL)
>> +            {
>> +                checkpoint_wait = false;
>> +                CreateEndOfRecoveryRecord();
>> +            }
>> +        }
>
>
> If we must have this ReadCheckpointRecord check, it needs more than zero
> comments. Also, if it ever fails for some reason, I'd like to have a big fat
> warning in the log to caution that something went badly wrong.

> Why does it insist that we still have not only the latest checkpoint, but
> the previous one too? At recovery, we fall back to the previous checkpoint
> if we can't access the latest one, but that's just a desperate measure to
> try to recover something if things have gone badly wrong. It's OK to not
> have the WAL containing the previous checkpoint still around. In particular,
> right after restoring from a base backup, e.g. with pg_basebackup -x, or with
> good old pg_start/stop_backup, the WAL included with the backup won't
> stretch back to previous checkpoint.

As you say, there are cases where the lack of a secondary checkpoint
could be considered normal, hence no message to confuse the user.

We don't actually need a fast promotion when restoring from backup, so
we don't do it.

I want this to work for the cases we need it, and not break when we
don't need it.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-01-29 12:19:39
Message-ID: 5107BE5B.1080709@vmware.com
Lists: pgsql-committers pgsql-hackers

On 29.01.2013 13:46, Simon Riggs wrote:
> On 29 January 2013 11:31, Heikki Linnakangas<hlinnakangas(at)vmware(dot)com> wrote:
>> On 29.01.2013 02:07, Simon Riggs wrote:
>>>
>>> +        /*
>>> +         * If we've been explicitly promoted with fast option,
>>> +         * end of recovery without a checkpoint if possible.
>>> +         */
>>> +        if (fast_promote)
>>> +        {
>>> +            checkPointLoc = ControlFile->prevCheckPoint;
>>> +            record = ReadCheckpointRecord(xlogreader, checkPointLoc, 2, false);
>>> +            if (record != NULL)
>>> +            {
>>> +                checkpoint_wait = false;
>>> +                CreateEndOfRecoveryRecord();
>>> +            }
>>> +        }
>>
>>
>> If we must have this ReadCheckpointRecord check, it needs more than zero
>> comments. Also, if it ever fails for some reason, I'd like to have a big fat
>> warning in the log to caution that something went badly wrong.
>
>> Why does it insist that we still have not only the latest checkpoint, but
>> the previous one too? At recovery, we fall back to the previous checkpoint
>> if we can't access the latest one, but that's just a desperate measure to
>> try to recover something if things have gone badly wrong. It's OK to not
>> have the WAL containing the previous checkpoint still around. In particular,
>> right after restoring from a base backup, e.g. with pg_basebackup -x, or with
>> good old pg_start/stop_backup, the WAL included with the backup won't
>> stretch back to previous checkpoint.
>
> As you say, there are cases where the lack of a secondary checkpoint
> could be considered normal, hence no message to confuse the user.
>
> We don't actually need a fast promotion when restoring from backup, so
> we don't do it.

You might want to bring the database up ASAP after restoring. If the
user requests that, the system shouldn't second-guess that.

PS. I think the implicit judgment you made that "pg_ctl promote" is now
the preferred method of promoting the server, over the trigger file
method, needs more discussion. I'm not sure I agree with that, but if we
do that, the docs should emphasize the pg_ctl promote more than the
trigger file. Also, I don't like conflating the shutdown mode argument
with promotion mode either, in pg_ctl. Perhaps it would be best to
revert this and take some more time to discuss the right behavior and
user interface for this (if it needs one).

PPS. doc changes are missing...

- Heikki


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-01-29 13:23:29
Message-ID: CA+U5nMJaTFK3Ykc+M8BYCeE52dwzdkvJtA5r-OufKYykBKcYAw@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 29 January 2013 12:19, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:

> You might want to bring the database up ASAP after restoring. If the user
> requests that, the system shouldn't second-guess that.

In later releases we can relax further, if that is justified. I call
this acting conservatively in the interests of robustness.

> PS. I think the implicit judgment you made that "pg_ctl promote" is now the
> preferred method of promoting the server, over the trigger file method,
> needs more discussion. I'm not sure I agree with that, but if we do that,
> the docs should emphasize the pg_ctl promote more than the trigger file.

So why did you commit a second method?

> Also, I don't like conflating the shutdown mode argument with promotion mode
> either, in pg_ctl. Perhaps it would be best to revert this and take some
> more time to discuss the right behavior and user interface for this (if it
> needs one).

I don't think so. This was discussed on list. You are asking for
additional features, which I've explained why they aren't added by me.
We have time left to add them, but these minor points aren't more
important than other patches in the queue.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-01-29 16:27:08
Message-ID: CAHGQGwH6dek8Fth+pkLYpOitwEc+GY1T0MG5k7VQMwxs99TzRg@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Tue, Jan 29, 2013 at 9:07 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Fast promote mode skips checkpoint at end of recovery.
> pg_ctl promote -m fast will skip the checkpoint at end of recovery so that we
> can achieve very fast failover when the apply delay is low. Write new WAL record
> XLOG_END_OF_RECOVERY to allow us to switch timeline correctly for downstream log
> readers. If we skip synchronous end of recovery checkpoint we request a normal
> spread checkpoint so that the window of re-recovery is low.

When I tested this feature, I encountered the following FATAL message.

FATAL: highest timeline 1 of the primary is behind recovery timeline 2

Is this an intentional behavior or bug? What I did in my test is:

1. Set up one master (A), one standby (B), one cascade standby (C)
2. After running pgbench -i -s 10, I promoted the standby (B) with fast mode
3. Then, I shut down the server (B) with immediate mode after it had been
brought up as the master but before the end-of-recovery checkpoint had
completed.
4. Restart the server (B).
5. After the standby (C) established the replication connection with (B),
I got the above FATAL messages repeatedly.

Promoting (B) increments the timeline ID to 2 and generates the timeline
history file. But after restarting (B), its timeline ID is reset to 1
unexpectedly.
This seems to be the cause of the problem.

To address this problem, should we switch to the new timeline ID whenever
we read the XLOG_END_OF_RECOVERY record, even if it's a crash recovery?

Regards,

--
Fujii Masao
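
The hypothesis above can be modelled with a toy replay loop. This is only a sketch of the suspected symptom, not the actual recovery code and not a confirmed diagnosis: `WalRecord`, `replay_timeline` and the `switch_in_crash_recovery` flag are hypothetical names, and the assumption that the timeline switch was skipped during crash recovery comes from the report, not from xlog.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy record types; REC_END_OF_RECOVERY stands in for XLOG_END_OF_RECOVERY. */
typedef enum { REC_DATA, REC_END_OF_RECOVERY } RecType;

typedef struct
{
    RecType  type;
    unsigned new_tli;   /* only meaningful for REC_END_OF_RECOVERY */
} WalRecord;

/*
 * Replay n records starting on start_tli and return the timeline on
 * which replay ends.  When switch_in_crash_recovery is false, the
 * end-of-recovery record is ignored during crash recovery, which
 * models the reported behaviour (timeline reset to the old value).
 */
unsigned
replay_timeline(const WalRecord *recs, size_t n, unsigned start_tli,
                bool crash_recovery, bool switch_in_crash_recovery)
{
    unsigned tli = start_tli;

    for (size_t i = 0; i < n; i++)
    {
        if (recs[i].type != REC_END_OF_RECOVERY)
            continue;
        if (crash_recovery && !switch_in_crash_recovery)
            continue;           /* hypothesised bug: stay on the old timeline */
        tli = recs[i].new_tli;
    }
    return tli;
}
```

In this model, a restart after an immediate shutdown ends on timeline 1 when the switch is skipped and on timeline 2 when the end-of-recovery record is always honoured, which is the change proposed above.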


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-01-29 16:38:28
Message-ID: CAHGQGwHcgrkO54M2VvzZFTmkQpJMb=aqB_1UhVnGVq_Uzn_Rkg@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Wed, Jan 30, 2013 at 1:27 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Tue, Jan 29, 2013 at 9:07 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> Fast promote mode skips checkpoint at end of recovery.
>> pg_ctl promote -m fast will skip the checkpoint at end of recovery so that we
>> can achieve very fast failover when the apply delay is low. Write new WAL record
>> XLOG_END_OF_RECOVERY to allow us to switch timeline correctly for downstream log
>> readers. If we skip synchronous end of recovery checkpoint we request a normal
>> spread checkpoint so that the window of re-recovery is low.
>
> When I tested this feature, I encountered the following FATAL message.
>
> FATAL: highest timeline 1 of the primary is behind recovery timeline 2
>
> Is this an intentional behavior or bug? What I did in my test is:
>
> 1. Set up one master (A), one standby (B), one cascade standby (C)
> 2. After running pgbench -i -s 10, I promoted the standby (B) with fast mode
> 3. Then, I shut down the server (B) with immediate mode after it had been
> brought up as the master but before the end-of-recovery checkpoint had
> completed.
> 4. Restart the server (B).
> 5. After the standby (C) established the replication connection with (B),
> I got the above FATAL messages repeatedly.
>
> Promoting (B) increments the timeline ID to 2 and generates the timeline
> history file. But after restarting (B), its timeline ID is reset to 1
> unexpectedly.
> This seems to be the cause of the problem.
>
> To address this problem, should we switch to the new timeline ID whenever
> we read the XLOG_END_OF_RECOVERY record, even if it's a crash recovery?

On second thought, we don't need such a complicated test case to produce
a problem which derives from the same cause as the reported problem. The
procedure to reproduce it is:

1. Set up one master (A) and one standby (B)
2. Promote (B) with fast mode after running pgbench -i -s 10
3. Execute the write transaction on new master (B)
4. Shut down (B) with immediate mode before end-of-recovery checkpoint
has been completed
5. Restart (B)

Then you can confirm that the write transaction that you executed in #3 has
been lost.

Regards,

--
Fujii Masao


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-01-29 16:49:51
Message-ID: CA+U5nML25TB8-kH6kAPbHjNRap-c702zNPz8Nycdvvv3pHuESw@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 29 January 2013 16:27, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Tue, Jan 29, 2013 at 9:07 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> Fast promote mode skips checkpoint at end of recovery.
>> pg_ctl promote -m fast will skip the checkpoint at end of recovery so that we
>> can achieve very fast failover when the apply delay is low. Write new WAL record
>> XLOG_END_OF_RECOVERY to allow us to switch timeline correctly for downstream log
>> readers. If we skip synchronous end of recovery checkpoint we request a normal
>> spread checkpoint so that the window of re-recovery is low.
>
> When I tested this feature, I encountered the following FATAL message.
>
> FATAL: highest timeline 1 of the primary is behind recovery timeline 2
>
> Is this an intentional behavior or bug?

Tough one that.

> What I did in my test is:
>
> 1. Set up one master (A), one standby (B), one cascade standby (C)
> 2. After running pgbench -i -s 10, I promoted the standby (B) with fast mode
> 3. Then, I shut down the server (B) with immediate mode after it had been
> brought up as the master but before the end-of-recovery checkpoint had
> completed.
> 4. Restart the server (B).
> 5. After the standby (C) established the replication connection with (B),
> I got the above FATAL messages repeatedly.

Where do you get the errors, which server? The above doesn't contain a
promote command, so how does this make it fail?

Please show me the test case in more detail.

> Promoting (B) increments the timeline ID to 2 and generates the timeline
> history file. But after restarting (B), its timeline ID is reset to 1
> unexpectedly.
> This seems to be the cause of the problem.
>
> To address this problem, should we switch to the new timeline ID whenever
> we read the XLOG_END_OF_RECOVERY record, even if it's a crash recovery?

We do. Do you see a problem with that code? There is no conditional recovery.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-01-29 16:51:34
Message-ID: CA+U5nM+hnF-mEkFo_LH7AyGy-nh1Xa7WWVt6vNTzNmrqALjX=Q@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 29 January 2013 16:38, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:

> On second thought, we don't need such a complicated test case to produce
> a problem which derives from the same cause as the reported problem. The
> procedure to reproduce it is:
>
> 1. Set up one master (A) and one standby (B)
> 2. Promote (B) with fast mode after running pgbench -i -s 10
> 3. Execute the write transaction on new master (B)
> 4. Shut down (B) with immediate mode before end-of-recovery checkpoint
> has been completed
> 5. Restart (B)
>
> Then you can confirm that the write transaction that you executed in #3 has
> been lost.

Thanks for the test case, that was quick!

It looks like my caution was justified about this.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Devrim Gündüz <devrim(at)gunduz(dot)org>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>,Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-01-29 16:56:46
Message-ID: a6aae210-ffe5-442b-88ed-01e35ca03528@email.android.com
Lists: pgsql-committers pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:

>On 29 January 2013 16:27, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> On Tue, Jan 29, 2013 at 9:07 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> Fast promote mode skips checkpoint at end of recovery.
>>> pg_ctl promote -m fast will skip the checkpoint at end of recovery so that we
>>> can achieve very fast failover when the apply delay is low. Write new WAL record
>>> XLOG_END_OF_RECOVERY to allow us to switch timeline correctly for downstream log
>>> readers. If we skip synchronous end of recovery checkpoint we request a normal
>>> spread checkpoint so that the window of re-recovery is low.
>>
>> When I tested this feature, I encountered the following FATAL message.
>>
>> FATAL: highest timeline 1 of the primary is behind recovery timeline 2
>>
>> Is this an intentional behavior or bug?
>
>Tough one that.
>
>> What I did in my test is:
>>
>> 1. Set up one master (A), one standby (B), one cascade standby (C)
>> 2. After running pgbench -i -s 10, I promoted the standby (B) with fast mode
>> 3. Then, I shut down the server (B) with immediate mode after it had been
>> brought up as the master but before the end-of-recovery checkpoint had
>> completed.
>> 4. Restart the server (B).
>> 5. After the standby (C) established the replication connection with (B),
>> I got the above FATAL messages repeatedly.
>
>Where do you get the errors, which server? The above doesn't contain a
>promote command, so how does this make it fail?
>
>Please show me the test case in more detail.
>
>> Promoting (B) increments the timeline ID to 2 and generates the timeline
>> history file. But after restarting (B), its timeline ID is reset to 1
>> unexpectedly.
>> This seems to be the cause of the problem.
>>
>> To address this problem, should we switch to the new timeline ID whenever
>> we read the XLOG_END_OF_RECOVERY record, even if it's a crash recovery?
>
>We do. Do you see a problem with that code? There is no conditional recovery.

Hi,

Could you please move this to -hackers, for archives' sake?

Regards, Devrim
--
Devrim Gündüz


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-01-30 17:26:26
Message-ID: CA+U5nMJGJkiLH-+GH+NHvrZm=czxQjeRvZn4umODH2sL8kcMZQ@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 29 January 2013 16:51, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 29 January 2013 16:38, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>
>> On second thought, we don't need such a complicated test case to produce
>> a problem which derives from the same cause as the reported problem. The
>> procedure to reproduce it is:
>>
>> 1. Set up one master (A) and one standby (B)
>> 2. Promote (B) with fast mode after running pgbench -i -s 10
>> 3. Execute the write transaction on new master (B)
>> 4. Shut down (B) with immediate mode before end-of-recovery checkpoint
>> has been completed
>> 5. Restart (B)
>>
>> Then you can confirm that the write transaction that you executed in #3 has
>> been lost.
>
> Thanks for the test case, that was quick!

OK, I can confirm this bug.

This needs more work as is, so I'll revert and re-post.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-01-31 19:33:53
Message-ID: CA+U5nMKNNgha-MRGYHZgEX3QrK-e7nFaSCObfNhnOKYS2WM1uQ@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 30 January 2013 17:26, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

>> Thanks for the test case, that was quick!
>
> OK, I can confirm this bug.
>
> This needs more work as is, so I'll revert and re-post.

The fix was pretty simple in the end, so I've not reverted, just
applied the fix.

If anyone really wants me to revert, pls start new hackers thread to
discuss, or comment on changes.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-02-06 16:36:38
Message-ID: 51128696.1060006@vmware.com
Lists: pgsql-committers pgsql-hackers

On 31.01.2013 21:33, Simon Riggs wrote:
> If anyone really wants me to revert, pls start new hackers thread to
> discuss, or comment on changes.

Yes, I still think this needs fixing or reverting. Let me reiterate my
complaints:

1. I don't like the check in ReadCheckpointRecord() that the WAL
containing last and previous checkpoint still exists. Several reasons
for that:

1.1. I don't think such a check is necessary to begin with. We replayed
that WAL record a while ago, so there's no reason to believe that it's
gone now. If there is a bug that causes that to happen, you're screwed
with or without this patch.

1.2. If we do that check, and it fails because the latest checkpoint is
not present, there should be a big fat warning in the log because
something's wrong. If you ask for fast promotion, and the system doesn't
do that, a log message is the least we can do.

1.3. Why check for the "prev" checkpoint? The latest checkpoint is
enough to recover, so why insist that also the previous one is present,
too? There are normal scenarios where it won't be, like just after
recovering from a base backup. I consider it a bug that fast promotion
doesn't work right after restoring from a base backup.

2. I don't like demoting the trigger file method to a second class
citizen. I think we should make all functionality available through both
methods. If there was a good reason for deprecating the trigger file
method, I could live with that, but this patch is not such a reason.

3. I don't like conflating the promotion modes and shutdown modes in the
pg_ctl option. Shutdown modes and promotion modes are separate concepts.
The "fast" option is pretty clear, but why does "smart" mean "create an
immediate checkpoint before promotion"? How is that smarter than the
fast mode?

The "pg_ctl --help" on that is a bit confusing too:

> Options for stop, restart or promote:
>   -m, --mode=MODE        MODE can be "smart", "fast", or "immediate"

The "immediate" mode is not actually valid for "pg_ctl promote". That is
clarified later in the output by listing out what the modes mean, but
that above line is misleading.

4. I think fast promotion should be the default. Why not? There are
cases where you want the promotion to happen ASAP, and there are cases
where you don't care. But there are no scenarios where you want
promotion to be slow.

5. Docs changes are missing.

Here's what I think should be done:

1. Remove the check that prev checkpoint record exists.

2. Always do fast promotion if in standby mode. Remove the pg_ctl option.

3. Improve docs.

- Heikki
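
Point 3 above, keeping promotion modes separate from shutdown modes, could look roughly like the following. This is a sketch and not the pg_ctl.c source: `PromoteMode` and `parse_promote_mode` are hypothetical names, and the only behaviour taken from the thread is that "smart" and "fast" are promotion modes while "immediate" is not valid for promote.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical promotion-only mode type, distinct from any shutdown mode. */
typedef enum
{
    PROMOTE_MODE_SMART,     /* checkpoint before promotion completes */
    PROMOTE_MODE_FAST       /* skip the end-of-recovery checkpoint */
} PromoteMode;

/*
 * Parse a promotion mode argument.  Returns true and sets *mode on
 * success.  Returns false for anything else, including "immediate",
 * which only makes sense as a shutdown mode.
 */
bool
parse_promote_mode(const char *arg, PromoteMode *mode)
{
    if (strcmp(arg, "smart") == 0)
    {
        *mode = PROMOTE_MODE_SMART;
        return true;
    }
    if (strcmp(arg, "fast") == 0)
    {
        *mode = PROMOTE_MODE_FAST;
        return true;
    }
    return false;
}
```

With a dedicated parser like this, a help line for promote would list only the modes it accepts, and an invalid mode is rejected at argument parsing time.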


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-02-06 17:43:40
Message-ID: CA+U5nMJmrs=W3WSP_Z7DzEju9vsJ6FurRgF7sjmu9i8cFqzogg@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 6 February 2013 16:36, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
> On 31.01.2013 21:33, Simon Riggs wrote:
>>
>> If anyone really wants me to revert, pls start new hackers thread to
>> discuss, or comment on changes.
>
>
> Yes, I still think this needs fixing or reverting. Let me reiterate my
> complaints:

I'm sorry that they are complaints rather than just feedback, and will
work to address them.

> 1.3. Why check for the "prev" checkpoint? The latest checkpoint is
> enough to recover, so why insist that also the previous one is present,
> too?

That was there from Kyotaro's patch and I left it as it was since it
had been reviewed prior to me. I thought it was OK too, but now I
think your arguments are good and I'm now happy to change to just the
last checkpoint. That does bring into question what the value of the
prev checkpoint is in any situation, not just this one...

> There are normal scenarios where it won't be, like just after
> recovering from a base backup. I consider it a bug that fast promotion
> doesn't work right after restoring from a base backup.

OK

> 2. I don't like demoting the trigger file method to a second class
> citizen. I think we should make all functionality available through both
> methods. If there was a good reason for deprecating the trigger file
> method, I could live with that, but this patch is not such a reason.

I don't understand why we introduced a second method if they both will
continue to be used. I see no reason for that, other than backwards
compatibility. Enhancing both mechanisms suggests both will be
supported into the future. Please explain why the second mode exists?

> 3. I don't like conflating the promotion modes and shutdown modes in the
> pg_ctl option. Shutdown modes and promotion modes are separate concepts.
> The "fast" option is pretty clear, but why does "smart" mean "create an
> immediate checkpoint before promotion"? How is that smarter than the
> fast mode?

> The "pg_ctl --help" on that is a bit confusing too:
>
>> Options for stop, restart or promote:
>>   -m, --mode=MODE        MODE can be "smart", "fast", or "immediate"
>
>
> The "immediate" mode is not actually valid for "pg_ctl promote". That is
> clarified later in the output by listing out what the modes mean, but
> that above line is misleading.

We can always rename them, as you wish.

> 4. I think fast promotion should be the default. Why not? There are
> cases where you want the promotion to happen ASAP, and there are cases
> where you don't care. But there are no scenarios where you want
> promotion to be slow.

Not true. Slow means safe and stable, and there are many scenarios
where we want safe and stable. (Of course, nobody specifically
requests slow). My feeling is that this is an area of exposure that we
have no need and therefore no business touching. I will of course go
with what others think here, but I don't find the argument that we
should go fast always personally convincing. I am willing to relax it
over time once we get zero field problems as a result.

> 5. Docs changes are missing.

OK

> Here's what I think should be done:
>
> 1. Remove the check that prev checkpoint record exists.

Agreed

> 2. Always do fast promotion if in standby mode. Remove the pg_ctl option.

Disagreed, other viewpoints welcome.

> 3. Improve docs.

Agreed

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-02-06 17:56:11
Message-ID: 5112993B.7060203@commandprompt.com
Lists: pgsql-committers pgsql-hackers


On 02/06/2013 09:43 AM, Simon Riggs wrote:

>> 4. I think fast promotion should be the default. Why not? There are
>> cases where you want the promotion to happen ASAP, and there are cases
>> where you don't care. But there are no scenarios where you want
>> promotion to be slow.
>
> Not true. Slow means safe and stable, and there are many scenarios
> where we want safe and stable. (Of course, nobody specifically
> requests slow). My feeling is that this is an area of exposure that we
> have no need and therefore no business touching. I will of course go
> with what others think here, but I don't find the argument that we
> should go fast always personally convincing. I am willing to relax it
> over time once we get zero field problems as a result.

Promotion should, by default, take the safest, most stable route and
only that route.

+1 On Simon's response.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-02-06 18:02:17
Message-ID: CA+TgmoZi9LyCtJF7UHw8fE+OcZU06FhDiL+iEPUvWgXpTBio_A@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Wed, Feb 6, 2013 at 12:43 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> 2. I don't like demoting the trigger file method to a second class
>> citizen. I think we should make all functionality available through both
>> methods. If there was a good reason for deprecating the trigger file
>> method, I could live with that, but this patch is not such a reason.
>
> I don't understand why we introduced a second method if they both will
> continue to be used. I see no reason for that, other than backwards
> compatibility. Enhancing both mechanisms suggests both will be
> supported into the future. Please explain why the second mode exists?

I agree that we should be pushing people towards pg_ctl promote. I
have no strong opinion about whether backward-compatibility for the
trigger file method is a good idea or not. It might be a little soon
to relegate that to second-class status, but I'm not sure.

>> 4. I think fast promotion should be the default. Why not? There are
>> cases where you want the promotion to happen ASAP, and there are cases
>> where you don't care. But there are no scenarios where you want
>> promotion to be slow,
>
> Not true. Slow means safe and stable, and there are many scenarios
> where we want safe and stable. (Of course, nobody specifically
> requests slow). My feeling is that this is an area of exposure that we
> have no need and therefore no business touching. I will of course go
> with what others think here, but I don't find the argument that we
> should go fast always personally convincing. I am willing to relax it
> over time once we get zero field problems as a result.

I'm skeptical of the idea that we shouldn't default to fast-promote
because the fast-promote code might be buggy. We do sometimes default
new features to off on the grounds that they might be buggy - Hot
Standby got an on/off switch partly for that reason - but usually we
only add a knob if there's some plausible reason for wanting to change
the setting independently of the possibility of bugs. For instance,
in the case of Hot Standby, another of the reasons for adding a knob
was that people wanted a way to make sure that they wouldn't
accidentally connect to the standby when they intended to connect to
the master. That may or may not have been a sufficiently *good*
reason, but it was accepted as justification at the time.

So I would ask this question: why would someone want to turn off
fast-promote mode, assuming for the sake of argument that it isn't
buggy? I think there might be good reasons to do that, but I'm not
sure what they are. I doubt it will be a common thing to want. I
think most people are going to want fast-promote, but many may not
know enough to request it, which means that if it isn't the default,
the code may not get much testing anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-02-06 18:47:39
Message-ID: 5112A54B.8090500@vmware.com
Lists: pgsql-committers pgsql-hackers

On 06.02.2013 20:02, Robert Haas wrote:
> On Wed, Feb 6, 2013 at 12:43 PM, Simon Riggs<simon(at)2ndquadrant(dot)com> wrote:
>>> 2. I don't like demoting the trigger file method to a second class
>>> citizen. I think we should make all functionality available through both
>>> methods. If there was a good reason for deprecating the trigger file
>>> method, I could live with that, but this patch is not such a reason.
>>
>> I don't understand why we introduced a second method if they both will
>> continue to be used. I see no reason for that, other than backwards
>> compatibility. Enhancing both mechanisms suggests both will be
>> supported into the future. Please explain why the second mode exists?
>
> I agree that we should be pushing people towards pg_ctl promote. I
> have no strong opinion about whether backward-compatibility for the
> trigger file method is a good idea or not. It might be a little soon
> to relegate that to second-class status, but I'm not sure.

Both the trigger file and pg_ctl promote methods are useful in different
setups. If you point the trigger file at an NFS mount or similar, that
allows triggering promotion from a different host without providing
shell access. You might want to put the trigger file on an NFS mount
that also contains the WAL archive, for example. A promotion script that
also controls the network routers to redirect traffic and STONITH the
dead node can then simply "touch /mnt/.../trigger" to promote. Sure, it
could also ssh to the server and run "pg_ctl promote", but that requires
more setup.
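
A minimal sketch of the two triggering paths described above, for
illustration only. The function names, paths and host name are
hypothetical; it assumes the standby is already configured to watch the
given trigger file, and it only builds the pg_ctl command line rather
than running it.

```python
from pathlib import Path


def promote_via_trigger_file(trigger_path):
    """Create the trigger file on the shared mount.

    Assumption: the standby polls this path (for example on an NFS
    mount) and starts promotion when the file appears.
    """
    Path(trigger_path).touch()


def pg_ctl_promote_command(data_dir, ssh_host=None):
    """Build the command line for the pg_ctl route without running it.

    With ssh_host set, the command is wrapped in ssh, which is the
    extra setup (shell access to the standby) mentioned above.
    """
    cmd = ["pg_ctl", "promote", "-D", str(data_dir)]
    if ssh_host is not None:
        cmd = ["ssh", ssh_host] + cmd
    return cmd
```

A failover script would call one or the other after fencing the old
master; the trigger file route needs only write access to the shared
directory.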

- Heikki


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-02-06 20:24:21
Message-ID: CA+TgmoYkxCLA2G2_P+MqNiYut-6jF+QhRZ4T_dxqJmmQzig1cw@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Wed, Feb 6, 2013 at 1:47 PM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> On 06.02.2013 20:02, Robert Haas wrote:
>>
>> On Wed, Feb 6, 2013 at 12:43 PM, Simon Riggs<simon(at)2ndquadrant(dot)com>
>> wrote:
>>>>
>>>> 2. I don't like demoting the trigger file method to a second class
>>>> citizen. I think we should make all functionality available through both
>>>> methods. If there was a good reason for deprecating the trigger file
>>>> method, I could live with that, but this patch is not such a reason.
>>>
>>>
>>> I don't understand why we introduced a second method if they both will
>>> continue to be used. I see no reason for that, other than backwards
>>> compatibility. Enhancing both mechanisms suggests both will be
>>> supported into the future. Please explain why the second mode exists?
>>
>>
>> I agree that we should be pushing people towards pg_ctl promote. I
>> have no strong opinion about whether backward-compatibility for the
>> trigger file method is a good idea or not. It might be a little soon
>> to relegate that to second-class status, but I'm not sure.
>
>
> Both the trigger file and pg_ctl promote methods are useful in different
> setups. If you point the trigger file on an NFS mount or similar, that
> allows triggering promotion from a different host without providing shell
> access. You might want to put the trigger file on an NFS mount that also
> contains the WAL archive, for example. A promotion script that also controls
> the network routers to redirect traffic and STONITH the dead node, can then
> simply "touch /mnt/.../trigger" to promote. Sure, it could also ssh to the
> server and run "pg_ctl promote", but that requires more setup.

Good point. I hadn't thought about that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-02-07 08:41:39
Message-ID: CA+U5nMLGMZBPTU+__rcpo1yHtGCw1yWquUedVF8GqH=MBtjx2Q@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 6 February 2013 18:02, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> So I would ask this question: why would someone want to turn off
> fast-promote mode, assuming for the sake of argument that it isn't
> buggy?

You can write a question many ways, and lead people towards a
conclusion as a result.

Why would someone want to turn off safe-promote mode, assuming it was
fast enough?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-02-07 09:04:17
Message-ID: 51136E11.8090805@vmware.com
Lists: pgsql-committers pgsql-hackers

On 07.02.2013 10:41, Simon Riggs wrote:
> On 6 February 2013 18:02, Robert Haas<robertmhaas(at)gmail(dot)com> wrote:
>
>> So I would ask this question: why would someone want to turn off
>> fast-promote mode, assuming for the sake of argument that it isn't
>> buggy?
>
> You can write a question many ways, and lead people towards a
> conclusion as a result.
>
> Why would someone want to turn off safe-promote mode, assuming it was
> fast enough?

Okay, I'll bite..

Because in some of your servers, the safe/slow promotion is not fast
enough, and you want to use the same promotion script in both scenarios,
to keep things simple.

Because you're not sure if it's fast enough, and want to play it safe.

Because faster is nicer, even if the slow mode would be "fast enough".

It makes me uncomfortable that we're adding switches to pg_ctl promote
just because we're worried there might be bugs in our code. If we don't
trust the code as it is, it needs more testing. We can analyze the code
more thoroughly, to make an educated guess on what's likely to happen if
it's broken, and consider adding some sanity checks etc. to make the
consequences less severe. We should not put the burden on our users to
decide if the code is trustworthy enough to use.

Note that we still wouldn't do fast promotion in crash recovery, so
there's that escape hatch if there is indeed a bug in our code and fast
promotion fails.

- Heikki


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-02-07 09:47:56
Message-ID: CA+U5nMLwGYcNDFduE1s6spVy-zP4HPXxPUqo6e8acKPLLUsx8g@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 7 February 2013 09:04, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:

> It makes me uncomfortable that we're adding switches to pg_ctl promote just
> because we're worried there might be bugs in our code. If we don't trust the
> code as it is, it needs more testing. We can analyze the code more
> thoroughly, to make an educated guess on what's likely to happen if it's
> broken, and consider adding some sanity checks etc. to make the consequences
> less severe. We should not put the burden on our users to decide if the code
> is trustworthy enough to use.

I don't think I said I was worried about bugs in code, did I? The
point is that this has been a proven mechanism for many years and
we're now discussing turning that off completely with no user option
to put it back, which carries considerable risk.

Acknowledging risks and taking risk mitigating actions is a normal
part of any IT project. If we start getting unexplained errors it
could take a long time to trace that back to the lack of a shutdown
checkpoint.

I don't mind saying openly this worries me and it's why I took months
to commit it. If there was no risk here and it's all so easy, why
didn't we commit this last year, or why didn't you override me and
commit this earlier in this cycle?

I have to say I care very little for the beauty or lack of command
switches, in such a case. The "cost" there is low.

Tell me you understand the risk I am discussing, tell me in your
opinion we're safe and I'm being unnecessarily cautious, maybe even
foolishly so, and I'll relent. I'll stand by that and take the flak.
But saying you don't like a switch is like telling me you don't like
the colour of my car safety belt.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-02-07 11:48:07
Message-ID: CA+TgmobN6KMhO6waAMVwWhmhr00GP0DWTBNsNs70h8EwFG=-LA@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Thu, Feb 7, 2013 at 4:04 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> It makes me uncomfortable that we're adding switches to pg_ctl promote just
> because we're worried there might be bugs in our code. If we don't trust the
> code as it is, it needs more testing. We can analyze the code more
> thoroughly, to make an educated guess on what's likely to happen if it's
> broken, and consider adding some sanity checks etc. to make the consequences
> less severe. We should not put the burden on our users to decide if the code
> is trustworthy enough to use.

+1

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-02-07 15:18:40
Message-ID: 1360250320.51279.YahooMailNeo@web162905.mail.bf1.yahoo.com
Lists: pgsql-committers pgsql-hackers

Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
> On 07.02.2013 10:41, Simon Riggs wrote:

>> Why would someone want to turn off safe-promote mode, assuming it was
>> fast enough?

> Because faster is nicer, even if the slow mode would be "fast enough".

http://www.youtube.com/watch?v=H3R-rtWPyJY

-Kevin


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Record previous TLI in end-of-recovery record (was Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.)
Date: 2013-02-07 16:07:44
Message-ID: 5113D150.3070507@vmware.com
Lists: pgsql-committers pgsql-hackers

(this is unrelated to the other discussion about this patch)

On 29.01.2013 02:07, Simon Riggs wrote:
> Fast promote mode skips checkpoint at end of recovery.
> pg_ctl promote -m fast will skip the checkpoint at end of recovery so that we
> can achieve very fast failover when the apply delay is low. Write new WAL record
> XLOG_END_OF_RECOVERY to allow us to switch timeline correctly for downstream log
> readers. If we skip synchronous end of recovery checkpoint we request a normal
> spread checkpoint so that the window of re-recovery is low.

It just occurred to me that it would be really nice if the
end-of-recovery record, and the timeline-switching shutdown checkpoint
record too for that matter, would include the previous timeline's ID
that we forked from, in addition to the new TLI. Although it's not
required for anything at the moment, it would be useful debugging
information. It would allow reconstructing timeline history files from
the WAL; that might come in handy.
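
As a rough sketch of that reconstruction, assuming each timeline switch
record has been decoded into a tuple carrying the previous TLI, the new
TLI and the switch location. The record type and field names below are
hypothetical and do not reflect the actual WAL record layout or the
history file format.

```python
from typing import List, NamedTuple, Tuple


class TimelineSwitch(NamedTuple):
    """Hypothetical decoded end-of-recovery (or shutdown checkpoint) record."""
    prev_tli: int  # timeline we forked from
    this_tli: int  # new timeline started by the promotion
    lsn: str       # WAL location where the switch happened


def timeline_ancestry(records: List[TimelineSwitch],
                      target_tli: int) -> List[Tuple[int, str]]:
    """Return (parent TLI, switch location) pairs, oldest first,
    for every ancestor of target_tli found in the records."""
    by_new_tli = {r.this_tli: r for r in records}
    chain = []
    tli = target_tli
    seen = set()
    # Walk backwards from the target timeline to the oldest known parent.
    while tli in by_new_tli and tli not in seen:
        seen.add(tli)
        rec = by_new_tli[tli]
        chain.append((rec.prev_tli, rec.lsn))
        tli = rec.prev_tli
    chain.reverse()
    return chain
```

Without the previous TLI in the record, the parent of each timeline
could not be determined from the WAL alone, which is why the extra field
helps.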

Barring objections, I'll add that.

- Heikki


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Record previous TLI in end-of-recovery record (was Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.)
Date: 2013-02-07 16:24:55
Message-ID: CA+U5nM+unoR960REbyMt57=WDdzk-k6pn_-K1Din0Q173kpEJg@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 7 February 2013 16:07, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:

> It just occurred to me that it would be really nice if the end-of-recovery
> record, and the timeline-switching shutdown checkpoint record too for that
> matter, would include the previous timeline's ID that we forked from, in
> addition to the new TLI. Although it's not required for anything at the
> moment, it would be useful debugging information. It would allow
> reconstructing timeline history files from the WAL; that might come handy.
>
> Barring objections, I'll add that.

Good idea, please do.

That means a shutdown checkpoint becomes its own record type... but
my understanding of our other conversations was that you want to never
use shutdown checkpoints for end of recovery ever again, so that seems
unnecessary. Sorry to mix things up.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fast promote mode skips checkpoint at end of recovery.
Date: 2013-02-07 18:58:02
Message-ID: CA+U5nMKmDD7hGCYzOo=iFM=eK5OPDXCEzmq79fgLWr0TJk=sXw@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 6 February 2013 17:43, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

>> Here's what I think should be done:
>>
>> 1. Remove the check that prev checkpoint record exists.
>
> Agreed

Done

>> 2. Always do fast promotion if in standby mode. Remove the pg_ctl option.
>
> Disagreed, other viewpoints welcome.

Waiting for further comments.

>> 3. Improve docs.
>
> Agreed

Pending.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services