Re: [HACKERS] Docs for archive_cleanup_command are poor

Lists: pgsql-docspgsql-hackers
From: Brendan Jurd <direvus(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject: Docs for archive_cleanup_command are poor
Date: 2010-10-09 01:04:21
Message-ID: AANLkTin7HkSNoXimB5XQq69Wnw4a04oBSM1EiRGEOTZ3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-docs pgsql-hackers

Hi folks,

I have just set up HS+SR for the first time, and for the most part,
the docs were excellent. The one exception for me was the discussion
of archive_cleanup_command. This is a pretty important part of
constructing a healthy standby server, and IMO the docs don't give it
the treatment it deserves.

Under "25.2.4. Setting Up a Standby Server", we have:

"You can use archive_cleanup_command to prune the archive of files no
longer needed by the standby."

... then a few paragraphs later ...

"If you're using a WAL archive, its size can be minimized using the
archive_cleanup_command option to remove files that are no longer
required by the standby server. Note however, that if you're using the
archive for backup purposes, you need to retain files needed to
recover from at least the latest base backup, even if they're no
longer needed by the standby."

So there are a couple of brief mentions of what
archive_cleanup_command is for, but nothing about how it works, no
exampes of how to use it, and no links at all. Contrast how we deal
with archive_command, restore_command and primary_conninfo.

I'd like to suggest a few ways we could improve on this:

1. Remove the former paragraph. It's stranded out there on its own in
the middle of some unrelated text, and doesn't say anything of
substance not also said in the latter paragraph.

2. Include an example archive_cleanup_command in the recovery.conf
example snippet.

3. Link to 26.1 which actually explains how a_c_c works.

4. Mention, and link to, pg_archivecleanup from both 25.2.4 and 26.1.
This is the utility that most newcomers to WAL archiving will want to
use, so it's rather weird of us not to advertise it.

I'm willing to write a patch for this, but I thought I'd raise the
suggestions on-list first, before getting too invested. So, please
comment if you have an opinion on this.

Cheers,
BJ


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject: Re: [HACKERS] Docs for archive_cleanup_command are poor
Date: 2010-10-12 12:28:58
Message-ID: AANLkTi=Lx4S-5udAkh6T=Fcan9xpvkHAJdZ2hXrq2OcG@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-docs pgsql-hackers

On Sat, Oct 9, 2010 at 10:04 AM, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
> Hi folks,
>
> I have just set up HS+SR for the first time, and for the most part,
> the docs were excellent.  The one exception for me was the discussion
> of archive_cleanup_command.  This is a pretty important part of
> constructing a healthy standby server, and IMO the docs don't give it
> the treatment it deserves.
>
> Under "25.2.4. Setting Up a Standby Server", we have:
>
> "You can use archive_cleanup_command to prune the archive of files no
> longer needed by the standby."
>
> ... then a few paragraphs later ...
>
> "If you're using a WAL archive, its size can be minimized using the
> archive_cleanup_command  option to remove files that are no longer
> required by the standby server. Note however, that if you're using the
> archive for backup purposes, you need to retain files needed to
> recover from at least the latest base backup, even if they're no
> longer needed by the standby."
>
> So there are a couple of brief mentions of what
> archive_cleanup_command is for, but nothing about how it works, no
> exampes of how to use it, and no links at all.  Contrast how we deal
> with archive_command, restore_command and primary_conninfo.
>
> I'd like to suggest a few ways we could improve on this:
>
> 1. Remove the former paragraph.  It's stranded out there on its own in
> the middle of some unrelated text, and doesn't say anything of
> substance not also said in the latter paragraph.
>
> 2. Include an example archive_cleanup_command in the recovery.conf
> example snippet.
>
> 3. Link to 26.1 which actually explains how a_c_c works.
>
> 4. Mention, and link to, pg_archivecleanup from both 25.2.4 and 26.1.
> This is the utility that most newcomers to WAL archiving will want to
> use, so it's rather weird of us not to advertise it.
>
> I'm willing to write a patch for this, but I thought I'd raise the
> suggestions on-list first, before getting too invested.  So, please
> comment if you have an opinion on this.

Agreed.

And, ISTM that we should mention that we must not just specify
pg_archivecleanup in archive_cleanup_command when there are multiple
standby servers. This is because, in that case, we must calculate
the oldest restart point in those standbys and delete the archived
WAL files according to that point.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Brendan Jurd <direvus(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject: Re: [HACKERS] Docs for archive_cleanup_command are poor
Date: 2010-10-13 21:45:03
Message-ID: AANLkTi=gnbvRrxbqrK0CFHbbtG-FBYoshzsXm7qwpSb5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-docs pgsql-hackers

On Tue, Oct 12, 2010 at 8:28 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Sat, Oct 9, 2010 at 10:04 AM, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
>> Hi folks,
>>
>> I have just set up HS+SR for the first time, and for the most part,
>> the docs were excellent.  The one exception for me was the discussion
>> of archive_cleanup_command.  This is a pretty important part of
>> constructing a healthy standby server, and IMO the docs don't give it
>> the treatment it deserves.
>>
>> Under "25.2.4. Setting Up a Standby Server", we have:
>>
>> "You can use archive_cleanup_command to prune the archive of files no
>> longer needed by the standby."
>>
>> ... then a few paragraphs later ...
>>
>> "If you're using a WAL archive, its size can be minimized using the
>> archive_cleanup_command  option to remove files that are no longer
>> required by the standby server. Note however, that if you're using the
>> archive for backup purposes, you need to retain files needed to
>> recover from at least the latest base backup, even if they're no
>> longer needed by the standby."
>>
>> So there are a couple of brief mentions of what
>> archive_cleanup_command is for, but nothing about how it works, no
>> exampes of how to use it, and no links at all.  Contrast how we deal
>> with archive_command, restore_command and primary_conninfo.
>>
>> I'd like to suggest a few ways we could improve on this:
>>
>> 1. Remove the former paragraph.  It's stranded out there on its own in
>> the middle of some unrelated text, and doesn't say anything of
>> substance not also said in the latter paragraph.
>>
>> 2. Include an example archive_cleanup_command in the recovery.conf
>> example snippet.
>>
>> 3. Link to 26.1 which actually explains how a_c_c works.
>>
>> 4. Mention, and link to, pg_archivecleanup from both 25.2.4 and 26.1.
>> This is the utility that most newcomers to WAL archiving will want to
>> use, so it's rather weird of us not to advertise it.
>>
>> I'm willing to write a patch for this, but I thought I'd raise the
>> suggestions on-list first, before getting too invested.  So, please
>> comment if you have an opinion on this.
>
> Agreed.

Is someone working on a patch?

> And, ISTM that we should mention that we must not just specify
> pg_archivecleanup in archive_cleanup_command when there are multiple
> standby servers. This is because, in that case, we must calculate
> the oldest restart point in those standbys and delete the archived
> WAL files according to that point.

How do we expect people to do that, by the way?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Brendan Jurd <direvus(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject: Re: [HACKERS] Docs for archive_cleanup_command are poor
Date: 2010-10-14 01:15:32
Message-ID: AANLkTik3LsxqarT6sm8gS-vbHOhzVW0_D9eGMEBBjNOp@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-docs pgsql-hackers

On 14 October 2010 08:45, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Is someone working on a patch?

Yes, I will prepare a patch to get us started.

Cheers,
BJ


From: Brendan Jurd <direvus(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject: Re: [HACKERS] Docs for archive_cleanup_command are poor
Date: 2010-10-14 15:24:17
Message-ID: AANLkTikdBWvvx=ZFvmqoVWeCF+Dyka_doPkAxkKw7J=b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-docs pgsql-hackers

On 12 October 2010 23:28, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Sat, Oct 9, 2010 at 10:04 AM, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
>> I have just set up HS+SR for the first time, and for the most part,
>> the docs were excellent.  The one exception for me was the discussion
>> of archive_cleanup_command.  This is a pretty important part of
>> constructing a healthy standby server, and IMO the docs don't give it
>> the treatment it deserves.
...

> Agreed.
>
> And, ISTM that we should mention that we must not just specify
> pg_archivecleanup in archive_cleanup_command when there are multiple
> standby servers. This is because, in that case, we must calculate
> the oldest restart point in those standbys and delete the archived
> WAL files according to that point.

As promised, here is a patch to try to address $SUBJECT.

Summary of changes:

In 25.2.4. "Setting Up a Standby Server":

* Get rid of the extraneous short paragraph,
* move the full-size paragraph up to where the now-extinct short para was,
* add an archive_cleanup_command to the example recovery.conf,
* flesh out the wording,
* add links to 26.1 and F.22.

In 26.1. "Archive recovery settings":

* Add detail to the description of how it works,
* add an example recovery.conf snippet,
* per Fujii-san's comment, indicate that multi-standby setups require
more finesse,
* link to F.22.

In F.22. "pg_archivecleanup":

* Edit and clarify wording,
* standardise label for the <archivelocation> argument,
* again indicate the multi-standby issue,
* link to 25.2.

I'll drop this onto the next open commitfest. If it passes muster, it
sure wouldn't hurt to backpatch it to 9.0.

Cheers,
BJ

Attachment Content-Type Size
acc-docs.diff text/plain 8.5 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject: Re: [HACKERS] Docs for archive_cleanup_command are poor
Date: 2010-10-14 18:33:24
Message-ID: 1287081204.1725.1094.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-docs pgsql-hackers

On Fri, 2010-10-15 at 02:24 +1100, Brendan Jurd wrote:

> I'll drop this onto the next open commitfest. If it passes muster, it
> sure wouldn't hurt to backpatch it to 9.0.

Committed. Not sure there's anything there worth backpatching? There
aren't any doc bugs there.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Brendan Jurd <direvus(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject: Re: [HACKERS] Docs for archive_cleanup_command are poor
Date: 2010-10-14 18:50:26
Message-ID: AANLkTik=2GAdhk6GjE84OgAJyC2rc31Mus2oZeyu0kHD@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-docs pgsql-hackers

On Thu, Oct 14, 2010 at 2:33 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Fri, 2010-10-15 at 02:24 +1100, Brendan Jurd wrote:
>
>> I'll drop this onto the next open commitfest.  If it passes muster, it
>> sure wouldn't hurt to backpatch it to 9.0.
>
> Committed. Not sure there's anything there worth backpatching? There
> aren't any doc bugs there.

It's a while until 9.1 comes out, so it might be helpful to back-patch to 9.0.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Brendan Jurd <direvus(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject: Re: [HACKERS] Docs for archive_cleanup_command are poor
Date: 2010-10-14 18:51:28
Message-ID: AANLkTimiQrsWrc3eyiP91QnxPDwEqx=X7wwRuFxKC8rF@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-docs pgsql-hackers

On 15 October 2010 05:33, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Fri, 2010-10-15 at 02:24 +1100, Brendan Jurd wrote:
>> I'll drop this onto the next open commitfest.  If it passes muster, it
>> sure wouldn't hurt to backpatch it to 9.0.
>
> Committed. Not sure there's anything there worth backpatching? There
> aren't any doc bugs there.
>

Thanks for the commit Simon.

Agreed that there are no doc bugs. The reason I suggested a backpatch
is that I'm concerned that a lot of people are going to be approaching
the whole Standby topic for the first time with 9.0, so it would be
nice to give those folks an accessible account of how
archive_cleanup_command is meant to be used.

I was also working from the assumption that the "we only packpatch bug
fixes" policy applied to the code, not so much to the documentation.
If I was in error about that, well fair enough then.

Cheers,
BJ


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject: Re: [HACKERS] Docs for archive_cleanup_command are poor
Date: 2010-10-14 22:05:48
Message-ID: 5681.1287093948@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-docs pgsql-hackers

Brendan Jurd <direvus(at)gmail(dot)com> writes:
> Agreed that there are no doc bugs. The reason I suggested a backpatch
> is that I'm concerned that a lot of people are going to be approaching
> the whole Standby topic for the first time with 9.0, so it would be
> nice to give those folks an accessible account of how
> archive_cleanup_command is meant to be used.

Yeah, if this is a material improvement in the usefulness of the HS
docs, I think including it into 9.0.x isn't a bad idea.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Brendan Jurd <direvus(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject: Re: [HACKERS] Docs for archive_cleanup_command are poor
Date: 2010-10-15 09:20:51
Message-ID: 1287134451.1725.1301.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-docs pgsql-hackers

On Thu, 2010-10-14 at 18:05 -0400, Tom Lane wrote:
> Brendan Jurd <direvus(at)gmail(dot)com> writes:
> > Agreed that there are no doc bugs. The reason I suggested a backpatch
> > is that I'm concerned that a lot of people are going to be approaching
> > the whole Standby topic for the first time with 9.0, so it would be
> > nice to give those folks an accessible account of how
> > archive_cleanup_command is meant to be used.
>
> Yeah, if this is a material improvement in the usefulness of the HS
> docs, I think including it into 9.0.x isn't a bad idea.

Done.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services