Re: Streaming Replication: Checkpoint_segment and wal_keep_segments on standby

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Sander, Ingo (NSN - DE/Munich)" <ingo(dot)sander(at)nsn(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Streaming Replication: Checkpoint_segment and wal_keep_segments on standby
Date: 2010-06-10 10:19:26
Message-ID: 4C10BC2E.1020708@enterprisedb.com
Lists: pgsql-hackers

On 10/06/10 09:14, Fujii Masao wrote:
> On Thu, Jun 10, 2010 at 12:09 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> BTW, should there be doc changes for this? I didn't find anything explaining
>> how restartpoints are triggered, we should add a paragraph somewhere.
>
> +1
>
> What about the attached patch?

> (description of wal_keep_segments)
> *** 1902,1907 **** SET ENABLE_SEQSCAN TO OFF;
> --- 1902,1908 ----
> for standby purposes, and the number of old WAL segments available
> for standbys is determined based only on the location of the previous
> checkpoint and status of WAL archiving.
> + This parameter has no effect on a restartpoint.
> This parameter can only be set in the <filename>postgresql.conf</>
> file or on the server command line.
> </para>

Hmm, I wonder if wal_keep_segments should take effect during recovery
too? We don't support cascading slaves, but if you have two slaves
connected to one master (without an archive), and you perform failover
to one of them, without wal_keep_segments the 2nd slave might not find
all the files it needs on the new master. Then again, that won't work
without an archive anyway, because we error out at a timeline ID (TLI)
mismatch in replication. Seems like this is 9.1 material.

> *** a/doc/src/sgml/wal.sgml
> --- b/doc/src/sgml/wal.sgml
> ***************
> *** 424,429 ****
> --- 424,430 ----
> <para>
> There will always be at least one WAL segment file, and will normally
> not be more than (2 + <varname>checkpoint_completion_target</varname>) * <varname>checkpoint_segments</varname> + 1
> + or <varname>checkpoint_segments</> + <xref linkend="guc-wal-keep-segments"> + 1
> files. Each segment file is normally 16 MB (though this size can be
> altered when building the server). You can use this to estimate space
> requirements for <acronym>WAL</acronym>.

That's not true: wal_keep_segments is the minimum number of files
retained, independently of checkpoint_segments. The correct formula is
max((2 + checkpoint_completion_target) * checkpoint_segments, wal_keep_segments).
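
To make the estimate concrete, here is a rough sketch in Python, reading
the formula above as the larger of the two terms
(2 + checkpoint_completion_target) * checkpoint_segments and
wal_keep_segments. The function name and the rounding up to whole
segments are assumptions for illustration only; this is not server code,
and it follows the formula as written in this message rather than the
"+ 1" variants in the patch.

```python
import math

SEGMENT_SIZE_MB = 16  # default segment size mentioned in the quoted docs


def estimate_max_wal_segments(checkpoint_segments,
                              checkpoint_completion_target,
                              wal_keep_segments):
    """Hypothetical helper: rough upper estimate of WAL segment files.

    wal_keep_segments acts as a floor on retained files, independent of
    checkpoint_segments, so the estimate is the larger of the two terms.
    """
    checkpoint_bound = (2 + checkpoint_completion_target) * checkpoint_segments
    # Rounding up to whole segment files is an assumption of this sketch.
    return max(math.ceil(checkpoint_bound), wal_keep_segments)


if __name__ == "__main__":
    segments = estimate_max_wal_segments(3, 0.5, 32)
    print("segments:", segments)
    print("approx. space (MB):", segments * SEGMENT_SIZE_MB)
```

With checkpoint_segments = 3, checkpoint_completion_target = 0.5 and
wal_keep_segments = 32, the wal_keep_segments term dominates.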

> <para>
> + In archive recovery or standby mode, the server periodically performs
> + <firstterm>restartpoints</><indexterm><primary>restartpoint</></>
> + which are similar to checkpoints in normal operation: the server forces
> + all its state to disk, updates the <filename>pg_control</> file to
> + indicate that the already-processed WAL data need not be scanned again,
> + and then recycles old log segment files if they are in the
> + <filename>pg_xlog</> directory. Note that this recycling is not affected
> + by <varname>wal_keep_segments</> at all. A restartpoint is triggered,
> + if at least one checkpoint record has been replayed since the last
> + restartpoint, every <varname>checkpoint_timeout</> seconds, or every
> + <varname>checkpoint_segments</> log segments only in standby mode,
> + whichever comes first....

That last sentence is a bit unclear. How about:

A restartpoint is triggered if at least one checkpoint record has been
replayed and <varname>checkpoint_timeout</> seconds have passed since
the last restartpoint. In standby mode, a restartpoint is also triggered
if <varname>checkpoint_segments</> log segments have been replayed since
the last restartpoint and at least one checkpoint record has been
replayed since then.
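
The rule in the proposed wording above can be sketched as a small
predicate. This is only an illustration in Python of the conditions as
described in this message; the function and parameter names are made up
and do not correspond to anything in the server source.

```python
def restartpoint_due(checkpoint_replayed_since_last,
                     seconds_since_last,
                     segments_replayed_since_last,
                     checkpoint_timeout,
                     checkpoint_segments,
                     standby_mode):
    """Hypothetical predicate mirroring the proposed doc wording."""
    # No restartpoint is possible unless a checkpoint record has been
    # replayed since the last restartpoint.
    if not checkpoint_replayed_since_last:
        return False
    # Time-based trigger: applies in archive recovery and standby mode.
    if seconds_since_last >= checkpoint_timeout:
        return True
    # Segment-based trigger: applies in standby mode only.
    return standby_mode and segments_replayed_since_last >= checkpoint_segments
```

For example, with checkpoint_timeout = 300 and checkpoint_segments = 3,
replaying 5 segments in 60 seconds triggers a restartpoint in standby
mode but not in plain archive recovery, and nothing is triggered in
either mode until a checkpoint record has been replayed.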

> ... In log shipping case, the checkpoint interval
> + on the standby is normally smaller than that on the master.
> + </para>

What does that mean? Restartpoints can't be performed more frequently
than checkpoints in the master because restartpoints can only be
performed at checkpoint records.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com