Re: Review of pg_archivecleanup -x option patch

Lists: pgsql-hackers
From: Alex Shulgin <ash(at)commandprompt(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Review of pg_archivecleanup -x option patch
Date: 2012-02-01 12:53:47
Message-ID: 87obtin290.fsf@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

This is my first patch review ever, so please bear with me.

The patch[1] is in the context diff format and applies cleanly to
current git HEAD. Documentation is updated by the patch.

The code does implement what's the patch is supposed to do.

Do we want that? According to Greg's original mail, yes.

One problem I've found while trying the example workflow is this:

~/local/postgresql/HEAD$ ./bin/pg_archivecleanup -d -x .gz data/archive/ 000000010000000000000002.00000020.backup.gz
pg_archivecleanup: invalid filename input
Try "pg_archivecleanup --help" for more information.

I think we could accept the suffixed WAL filename and strip the suffix;
another idea is to actually detect the suffix (if any) from the last
command line argument, thus rendering the -x option unnecessary.

By the way, I for one would use the word 'suffix' instead of 'extension'
which sounds like some MS-DOS thingy, but after briefly grepping the
docs I can see that both words are used in this context already (and the
ratio is not in favor of my choice of wording.)

Other than that the patch looks good.

--
Alex

[1] http://archives.postgresql.org/pgsql-hackers/2011-02/msg00547.php


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review of pg_archivecleanup -x option patch
Date: 2012-02-13 02:19:53
Message-ID: 4F387349.10905@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/01/2012 07:53 AM, Alex Shulgin wrote:
> One problem I've found while trying the example workflow is this:
>
> ~/local/postgresql/HEAD$ ./bin/pg_archivecleanup -d -x .gz data/archive/ 000000010000000000000002.00000020.backup.gz
> pg_archivecleanup: invalid filename input
> Try "pg_archivecleanup --help" for more information.
>
> I think we could accept the suffixed WAL filename and strip the suffix;
> another idea is to actually detect the suffix (if any) from the last
> command line argument, thus rendering the -x option unnecessary.

Going full-on automatic with detecting the suffix seems like it might do
the wrong thing when presented with a strange extension. I know a lot
of the time I use pg_archivecleanup is inside scripts that tend to get
setup and then ignored. If one of those has a bug, or unexpected data
appears in the archive, I'd rather see this sort of major, obvious
failure--not an attempt to sort things out that might make a bad decision.

The smaller step of automatically stripping the specified suffix from
the input file name, when it matches the one we've told the program to
expect, seems like a usability improvement unlikely to lead to bad
things though. I've certainly made the mistake you've shown when using
the patched version of the program myself, just didn't think about
catching and correcting it myself before. We can rev this to add that
feature and resubmit easily enough, will turn that around soon.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review of pg_archivecleanup -x option patch
Date: 2012-03-08 13:25:12
Message-ID: CA+Tgmob4NdTWAWy=bX5woT2jAYnGA8AxbE1cRggJmitX1_cNbQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Feb 12, 2012 at 9:19 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> The smaller step of automatically stripping the specified suffix from the
> input file name, when it matches the one we've told the program to expect,
> seems like a usability improvement unlikely to lead to bad things though.
>  I've certainly made the mistake you've shown when using the patched version
> of the program myself, just didn't think about catching and correcting it
> myself before.  We can rev this to add that feature and resubmit easily
> enough, will turn that around soon.

This change seems plenty small enough to slip in even at this late
date, but I guess we're lacking a new version of the patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review of pg_archivecleanup -x option patch
Date: 2012-03-09 05:46:42
Message-ID: CAJKUy5i8Otnb7QvT0HwjaWt5-aUbrZXqh+HD4z1caud706Do_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 8, 2012 at 8:25 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sun, Feb 12, 2012 at 9:19 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>> The smaller step of automatically stripping the specified suffix from the
>> input file name, when it matches the one we've told the program to expect,
>> seems like a usability improvement unlikely to lead to bad things though.
>>  I've certainly made the mistake you've shown when using the patched version
>> of the program myself, just didn't think about catching and correcting it
>> myself before.  We can rev this to add that feature and resubmit easily
>> enough, will turn that around soon.
>
> This change seems plenty small enough to slip in even at this late
> date, but I guess we're lacking a new version of the patch.
>

Sorry, here's the patch rebased and with the suggestion from Alex.
Which reminds me, I never thank him for the review (shame on me) :D

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review of pg_archivecleanup -x option patch
Date: 2012-03-09 05:47:14
Message-ID: CAJKUy5hjEz21+s-3qVteKWzLn-qLK2D-8pbjSp+z7Fe7BpNGaw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 9, 2012 at 12:46 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Thu, Mar 8, 2012 at 8:25 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Sun, Feb 12, 2012 at 9:19 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>>> The smaller step of automatically stripping the specified suffix from the
>>> input file name, when it matches the one we've told the program to expect,
>>> seems like a usability improvement unlikely to lead to bad things though.
>>>  I've certainly made the mistake you've shown when using the patched version
>>> of the program myself, just didn't think about catching and correcting it
>>> myself before.  We can rev this to add that feature and resubmit easily
>>> enough, will turn that around soon.
>>
>> This change seems plenty small enough to slip in even at this late
>> date, but I guess we're lacking a new version of the patch.
>>
>
> Sorry, here's the patch rebased and with the suggestion from Alex.
> Which reminds me, I never thank him for the review (shame on me) :D
>

with the patch this time

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación

Attachment Content-Type Size
archivecleanup_extensions-v3.patch text/x-patch 6.2 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review of pg_archivecleanup -x option patch
Date: 2012-03-09 14:07:50
Message-ID: CA+TgmoZDYD_W7K_S1ZuEnqVNOaRWYCX=EETX+R27VB7akrrcEQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 9, 2012 at 12:47 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>> Sorry, here's the patch rebased and with the suggestion from Alex.
>> Which reminds me, I never thank him for the review (shame on me) :D
>
> with the patch this time

This may be a stupid idea, but it seems to me that it might be better
to dispense with all of the logic in here to detect whether the file
name is still going to be long enough after chomping the extension. I
feel like that's just making things complicated. I assume the
extensions we're thinking people will want to strip here are things
like ".gz", in which case there should be no confusion; and if
someone's dumb enough to use an extension like "0" (with no dot or
anything) but only sometimes then I think they deserve what they get
(viz: errors). See attached (only lightly tested) patch for what I'm
thinking of.

Also, I'm wondering about this warning in the documentation:

+ extension added by the compression program. Note that the
+ <filename>.backup</> file name passed to the program should not
+ include the extension.

IIUC, the latest change you made makes that warning obsolete, no?

[rhaas pgsql]$ contrib/pg_archivecleanup/pg_archivecleanup -d -x .gz .
000000010000000000000010.00000020.backup.gz
pg_archivecleanup: keep WAL file "./000000010000000000000010" and later

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
archivecleanup-extension-rmh.patch application/octet-stream 4.4 KB

From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review of pg_archivecleanup -x option patch
Date: 2012-03-10 19:38:22
Message-ID: CAJKUy5gVGSJVUn89FagFDCVSqOMGOVvWCGj1ENVLgFqw0Uz6Ag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 9, 2012 at 9:07 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Mar 9, 2012 at 12:47 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>> Sorry, here's the patch rebased and with the suggestion from Alex.
>>> Which reminds me, I never thank him for the review (shame on me) :D
>>
>> with the patch this time
>
> This may be a stupid idea, but it seems to me that it might be better
> to dispense with all of the logic in here to detect whether the file
> name is still going to be long enough after chomping the extension.  I
> feel like that's just making things complicated.

while i like the idea of separating the logic, i don't like the results:

for example i tried this (notice that i forgot the point), and it just
says nothing (not even that the file with the extension but without
the point doesn't exist). that's why we were checking that the length
matches

$ ./pg_archivecleanup -x "bz2" /tmp 000000010000000100000058
$

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review of pg_archivecleanup -x option patch
Date: 2012-03-10 19:57:49
Message-ID: CAJKUy5gf1yj0OFzN9p7VbUAFcsDm_kow6uG-v1+_hpG9xqk8xw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Mar 10, 2012 at 2:38 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Fri, Mar 9, 2012 at 9:07 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Fri, Mar 9, 2012 at 12:47 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>>> Sorry, here's the patch rebased and with the suggestion from Alex.
>>>> Which reminds me, I never thank him for the review (shame on me) :D
>>>
>>> with the patch this time
>>
>> This may be a stupid idea, but it seems to me that it might be better
>> to dispense with all of the logic in here to detect whether the file
>> name is still going to be long enough after chomping the extension.  I
>> feel like that's just making things complicated.
>
> while i like the idea of separating the logic, i don't like the results:
>
> for example i tried this (notice that i forgot the point), and it just
> says nothing (not even that the file with the extension but without
> the point doesn't exist). that's why we were checking that the length
> matches
>
> $ ./pg_archivecleanup -x "bz2" /tmp 000000010000000100000058
> $
>

i would add this in the middle of the TrimExtension() function:

if ((flen - (elen+1) != XLOG_DATA_FNAME_LEN && flen - (elen+1) !=
XLOG_BACKUP_FNAME_LEN) &&
(flen != XLOG_DATA_FNAME_LEN && flen != XLOG_BACKUP_FNAME_LEN))
{
fprintf(stderr, "%s: invalid filename input\n", progname);
fprintf(stderr, "Try \"%s --help\" for more information.\n", progname);
exit(2);
}

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review of pg_archivecleanup -x option patch
Date: 2012-03-12 01:28:52
Message-ID: CA+TgmoZfVscg-Vd5ZmeQMTDNKzLEm5uQwCR8LLPupBWZ7pGKgA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Mar 10, 2012 at 2:38 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Fri, Mar 9, 2012 at 9:07 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Fri, Mar 9, 2012 at 12:47 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>>> Sorry, here's the patch rebased and with the suggestion from Alex.
>>>> Which reminds me, I never thank him for the review (shame on me) :D
>>>
>>> with the patch this time
>>
>> This may be a stupid idea, but it seems to me that it might be better
>> to dispense with all of the logic in here to detect whether the file
>> name is still going to be long enough after chomping the extension.  I
>> feel like that's just making things complicated.
>
> while i like the idea of separating the logic, i don't like the results:
>
> for example i tried this (notice that i forgot the point), and it just
> says nothing (not even that the file with the extension but without
> the point doesn't exist). that's why we were checking that the length
> matches
>
> $ ./pg_archivecleanup -x "bz2" /tmp 000000010000000100000058

Hmm, but I thought that the idea was that the extension was optional.
Perhaps I'm missing something but I don't think the previous patch
will complain about that either; or at least I don't see why the
behavior should be any different.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review of pg_archivecleanup -x option patch
Date: 2012-03-28 13:47:02
Message-ID: CA+TgmoaQfF6p8Ete096RbHH+yn2asX6WMj_C7+SgHjTYDBwM8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Mar 11, 2012 at 9:28 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sat, Mar 10, 2012 at 2:38 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>> On Fri, Mar 9, 2012 at 9:07 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Fri, Mar 9, 2012 at 12:47 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>>>> Sorry, here's the patch rebased and with the suggestion from Alex.
>>>>> Which reminds me, I never thank him for the review (shame on me) :D
>>>>
>>>> with the patch this time
>>>
>>> This may be a stupid idea, but it seems to me that it might be better
>>> to dispense with all of the logic in here to detect whether the file
>>> name is still going to be long enough after chomping the extension.  I
>>> feel like that's just making things complicated.
>>
>> while i like the idea of separating the logic, i don't like the results:
>>
>> for example i tried this (notice that i forgot the point), and it just
>> says nothing (not even that the file with the extension but without
>> the point doesn't exist). that's why we were checking that the length
>> matches
>>
>> $ ./pg_archivecleanup -x "bz2" /tmp 000000010000000100000058
>
> Hmm, but I thought that the idea was that the extension was optional.
> Perhaps I'm missing something but I don't think the previous patch
> will complain about that either; or at least I don't see why the
> behavior should be any different.

Can someone enlighten me on this point?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review of pg_archivecleanup -x option patch
Date: 2012-03-28 16:13:34
Message-ID: CAJKUy5ijJE_xbZq91YzZSgN73uULQ9ppjPvog4-7ca+6ck0Rjw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 28, 2012 at 8:47 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>>
>>> $ ./pg_archivecleanup -x "bz2" /tmp 000000010000000100000058
>>
>> Hmm, but I thought that the idea was that the extension was optional.
>> Perhaps I'm missing something but I don't think the previous patch
>> will complain about that either; or at least I don't see why the
>> behavior should be any different.
>
> Can someone enlighten me on this point?
>

mmm! you're right... it's not complaining either... i was sure it was...
and i'm not sure i want to contor things for that...

so, just forget my last mail about that... your refactor is just fine for me

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review of pg_archivecleanup -x option patch
Date: 2012-04-05 18:20:00
Message-ID: CA+TgmoZkAx4z4aJUKGxL3YM0QxSuO8pUmiaoYRt9zwMEoNU4Hg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 28, 2012 at 12:13 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Wed, Mar 28, 2012 at 8:47 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>>>
>>>> $ ./pg_archivecleanup -x "bz2" /tmp 000000010000000100000058
>>>
>>> Hmm, but I thought that the idea was that the extension was optional.
>>> Perhaps I'm missing something but I don't think the previous patch
>>> will complain about that either; or at least I don't see why the
>>> behavior should be any different.
>>
>> Can someone enlighten me on this point?
>>
>
> mmm! you're right... it's not complaining either... i was sure it was...
> and i'm not sure i want to contor things for that...
>
> so, just forget my last mail about that... your refactor is just fine for me

OK, committed that way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company