Re: odd output in restore mode

Lists: pgsql-hackers pgsql-patches
From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Subject: odd output in restore mode
Date: 2008-05-12 20:57:01
Message-ID: 4828AF1D.7060405@dunslane.net


I have just been working on setting up a continuous recovery failover
system, and noticed some odd log lines, shown below. (Using 8.3).

First note that our parsing of recovery.conf in xlog.c is pretty bad,
and at least we need to document the quirks if it's not going to be
fixed. log_restartpoints is said to be boolean, but when I set it to an
unquoted true I got a fatal error, while a quoted 'on' sets it to false,
as seen. Ick. What is more, I apparently managed to get the recovery
server to lose a WAL file and hang totally by having a bad
recovery.conf. Triple ick.

Second, what is all this about .history files? My understanding is that
they are not necessary, so surely we should try to stat them to see if
they are present before trying to copy them. These lines are going to
confuse a lot of people, I suspect (including me).

Lastly, not quite related to this output, but in the same general area,
should we have an option on pg_standby to allow removing the archive
file after it has been restored?

cheers

andrew

LOG: database system was interrupted; last known up at 2008-05-12
15:18:23 EDT
LOG: starting archive recovery
LOG: log_restartpoints = false
LOG: restore_command = '../bin/pg_standby -t ../common_archive/failover
../common_archive %f %p %r '
cp: cannot stat `../common_archive/00000001.history': No such file or
directory
cp: cannot stat `../common_archive/00000001.history': No such file or
directory
cp: cannot stat `../common_archive/00000001.history': No such file or
directory
LOG: restored log file "0000000100000000000000A5.00000068.backup" from
archive
LOG: restored log file "0000000100000000000000A5" from archive
LOG: automatic recovery in progress
LOG: redo starts at 0/A50000B0
LOG: restored log file "0000000100000000000000A6" from archive
LOG: restored log file "0000000100000000000000A7" from archive
LOG: restored log file "0000000100000000000000A8" from archive
LOG: restored log file "0000000100000000000000A9" from archive
trigger file found
LOG: could not open file "pg_xlog/0000000100000000000000AA" (log file
0, segment 170): No such file or directory
LOG: redo done at 0/A9000068
LOG: restored log file "0000000100000000000000A9" from archive
cp: cannot stat `../common_archive/00000002.history': No such file or
directory
cp: cannot stat `../common_archive/00000002.history': No such file or
directory
cp: cannot stat `../common_archive/00000002.history': No such file or
directory
LOG: selected new timeline ID: 2
cp: cannot stat `../common_archive/00000001.history': No such file or
directory
cp: cannot stat `../common_archive/00000001.history': No such file or
directory
cp: cannot stat `../common_archive/00000001.history': No such file or
directory
LOG: archive recovery complete
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
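
The boolean quirk described above (an unquoted true is fatal, a quoted 'on' is read as false) could be avoided by a more forgiving parser. What follows is only a minimal sketch in Python of one possible rule, not the code in xlog.c; the function name parse_recovery_bool and the exact set of accepted spellings are assumptions made for illustration.

```python
def parse_recovery_bool(raw):
    """Return True or False for a recovery.conf-style boolean value.

    Hypothetical helper: accepts the value with or without surrounding
    single quotes, and treats on/true/yes/1 and off/false/no/0 alike.
    """
    value = raw.strip()
    # Strip one pair of surrounding single quotes, if present.
    if len(value) >= 2 and value[0] == "'" and value[-1] == "'":
        value = value[1:-1]
    value = value.strip().lower()
    if value in ("on", "true", "yes", "1"):
        return True
    if value in ("off", "false", "no", "0"):
        return False
    # Anything else is rejected loudly rather than silently read as false.
    raise ValueError("invalid boolean value: %r" % (raw,))
```

With this rule both `parse_recovery_bool("true")` and `parse_recovery_bool("'on'")` return True, and an unrecognised value raises an error instead of quietly becoming false.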


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-12 22:14:00
Message-ID: 1210630440.29684.249.camel@ebony.site

On Mon, 2008-05-12 at 16:57 -0400, Andrew Dunstan wrote:

> I have just been working on setting up a continuous recovery failover
> system, and noticed some odd log lines, shown below. (Using 8.3).

Hmmm, well, the first time you use something complex, there are some
surprising features, I guess. Most especially the log lines are there to
allow production issues to be diagnosed, not to create a beautiful log.

Many of the things that look somewhat strange are there for a reason,
since a wide range of options and save-your-customers-ass scenarios are
covered by the recovery code.

Suggestions for improvement are always welcome and you are welcome to
suggest doc changes, as many people do.

> First note that our parsing of recovery.conf in xlog.c is pretty bad,
> and at least we need to document the quirks if it's not going to be
> fixed. log_restartpoints is said to be boolean, but when I set it to an
> unquoted true I got a fatal error, while a quoted 'on' sets it to false,
> as seen. Ick.

Yes, some improvements are definitely due there.

> What is more, I apparently managed to get the recovery
> server to lose a WAL file and hang totally by having a bad
> recovery.conf. Triple ick.

Sounds like a bug you should report in the normal way. Correctness is
paramount. Or are you confusing the message in the log for file AA with
an error?

> Second, what is all this about .history files? My understanding is that
> they are not necessary, so surely we should try to stat them to see if
> they are present before trying to copy them. These lines are going to
> confuse a lot of people, I suspect (including me).

I try to keep it as simple as possible, since much of this code only
gets run when you really need it to work. The request for the .history
file and the cp are examples of that.

> Lastly, not quite related to this output, but in the same general area,
> should we have an option on pg_standby to allow removing the archive
> file after it has been restored?

There already is one, but it's more complex than that. (%r)

> LOG: database system was interrupted; last known up at 2008-05-12
> 15:18:23 EDT
> LOG: starting archive recovery
> LOG: log_restartpoints = false
> LOG: restore_command = '../bin/pg_standby -t ../common_archive/failover
> ../common_archive %f %p %r '
> cp: cannot stat `../common_archive/00000001.history': No such file or
> directory
> cp: cannot stat `../common_archive/00000001.history': No such file or
> directory
> cp: cannot stat `../common_archive/00000001.history': No such file or
> directory
> LOG: restored log file "0000000100000000000000A5.00000068.backup" from
> archive
> LOG: restored log file "0000000100000000000000A5" from archive
> LOG: automatic recovery in progress
> LOG: redo starts at 0/A50000B0
> LOG: restored log file "0000000100000000000000A6" from archive
> LOG: restored log file "0000000100000000000000A7" from archive
> LOG: restored log file "0000000100000000000000A8" from archive
> LOG: restored log file "0000000100000000000000A9" from archive
> trigger file found
> LOG: could not open file "pg_xlog/0000000100000000000000AA" (log file
> 0, segment 170): No such file or directory
> LOG: redo done at 0/A9000068
> LOG: restored log file "0000000100000000000000A9" from archive
> cp: cannot stat `../common_archive/00000002.history': No such file or
> directory
> cp: cannot stat `../common_archive/00000002.history': No such file or
> directory
> cp: cannot stat `../common_archive/00000002.history': No such file or
> directory
> LOG: selected new timeline ID: 2
> cp: cannot stat `../common_archive/00000001.history': No such file or
> directory
> cp: cannot stat `../common_archive/00000001.history': No such file or
> directory
> cp: cannot stat `../common_archive/00000001.history': No such file or
> directory
> LOG: archive recovery complete
> LOG: database system is ready to accept connections
> LOG: autovacuum launcher started

There is an outstanding Windows issue with pg_standby that your help
would be appreciated with, shown on latest commitfest page. It's a
Windows issue and I don't maintain a Windows dev environment.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
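
For readers puzzling over the "could not open file" line for segment AA in the quoted log: the 24-character WAL file name encodes the timeline, log file and segment as three 8-digit hexadecimal fields. The sketch below is illustrative Python, and the function name parse_wal_segment_name is hypothetical; it only shows how the name in that message maps to "log file 0, segment 170".

```python
def parse_wal_segment_name(name):
    """Split a 24-character WAL segment file name into its three fields.

    Returns (timeline, log_file, segment) as integers. Hypothetical
    helper for illustration only.
    """
    if len(name) != 24 or any(c not in "0123456789ABCDEF" for c in name):
        raise ValueError("not a WAL segment file name: %r" % (name,))
    timeline = int(name[0:8], 16)    # first 8 hex digits
    log_file = int(name[8:16], 16)   # middle 8 hex digits
    segment = int(name[16:24], 16)   # last 8 hex digits
    return timeline, log_file, segment


# The file named in the log message above:
print(parse_wal_segment_name("0000000100000000000000AA"))  # (1, 0, 170)
```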


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-12 22:58:37
Message-ID: 4828CB9D.7070102@dunslane.net

Simon Riggs wrote:
>
>
>> What is more, I apparently managed to get the recovery
>> server to lose a WAL file and hang totally by having a bad
>> recovery.conf. Triple ick.
>>
>
> Sounds like a bug you should report in the normal way. Correctness is
> paramount. Or are you confusing the message in the log for file AA with
> an error?
>

No, it had to do with pg_standby waiting for a WAL file that had already
gone, somehow. I will try to reproduce it when I get a spare moment.
>
>> Second, what is all this about .history files? My understanding is that
>> they are not necessary, so surely we should try to stat them to see if
>> they are present before trying to copy them. These lines are going to
>> confuse a lot of people, I suspect (including me).
>>
>
> I try to keep it as simple as possible, since much of this code only
> gets run when you really need it to work. The request for the .history
> file and the cp are examples of that.
>

I don't follow. AFAICT no .history file was in fact archived. ISTM that
in this case we should only call RestoreWALFileForRecovery if the file
in fact exists.

>> Lastly, not quite related to this output, but in the same general area,
>> should we have an option on pg_standby to allow removing the archive
>> file after it has been restored?
>>
>
> There already is one, but it's more complex than that. (%r)
>

I was using %r. But the WAL files that have been restored (according to
the log) are still in the archive dir. So it looks like %r isn't working
properly.

> There is an outstanding Windows issue with pg_standby that your help
> would be appreciated with, shown on latest commitfest page. It's a
> Windows issue and I don't maintain a Windows dev environment.
>
>

The patch has been rejected for now, according to the Commitfest page.
Not sure what you want my help on.

BTW, none of what I reported was on Windows - it's on Linux.

cheers

andrew
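
The suggestion above, to check that a file exists before trying to copy it, can be sketched as follows. This is illustrative Python, not the server's or pg_standby's code; restore_from_archive is a hypothetical name, and the only assumption about the caller is that a failed restore of a .history file is treated as "file not present" rather than as an error.

```python
import os
import shutil


def restore_from_archive(archive_dir, file_name, dest_path):
    """Copy file_name out of archive_dir if it is there.

    Returns True when the file was copied and False when it is absent,
    without running a copy command that would print "cannot stat".
    """
    src = os.path.join(archive_dir, file_name)
    if not os.path.isfile(src):
        # A timeline history file that was never archived ends up here;
        # the caller is told quietly that nothing was restored.
        return False
    shutil.copyfile(src, dest_path)
    return True
```

The design choice is simply to stat first and skip the copy, so the "cp: cannot stat" noise never reaches the log.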


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-12 23:37:53
Message-ID: 1210635473.29684.268.camel@ebony.site

On Mon, 2008-05-12 at 18:58 -0400, Andrew Dunstan wrote:

> No, it had to do with pg_standby waiting for a WAL file that had already
> gone, somehow. I will try to reproduce it when I get a spare moment.

Sounds like the bug I just fixed.

> > There is an outstanding Windows issue with pg_standby that your help
> > would be appreciated with, shown on latest commitfest page. It's a
> > Windows issue and I don't maintain a Windows dev environment.

> The patch has been rejected for now, according to the Commitfest page.
> Not sure what you want my help on.

Well, the patch was rejected long ago, not sure why it's in this
commitfest. But it's an open issue on the Windows port.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: odd output in restore mode
Date: 2008-05-13 01:06:26
Message-ID: 200805122106.27351.xzilla@users.sourceforge.net

On Monday 12 May 2008 18:58:37 Andrew Dunstan wrote:
> Simon Riggs wrote:
> >> Lastly, not quite related to this output, but in the same general area,
> >> should we have an option on pg_standby to allow removing the archive
> >> file after it has been restored?
> >
> > There already is one, but it's more complex than that. (%r)
>
> I was using %r. But the WAL files that have been restored (according to
> the log) are still in the archive dir. So it looks like %r isn't working
> properly.
>

Are you sure you've moved past the latest restart point? Just because a WAL
file has been processed doesn't mean it can be deleted.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
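
The point above, that a processed WAL file is only safe to delete once recovery has moved past it at a restart point, can be sketched as follows. The sketch assumes %r expands to the name of the file containing the last valid restart point; the function name removable_wal_files is hypothetical and this is not a copy of pg_standby's cleanup code.

```python
def removable_wal_files(archive_files, restart_file):
    """Return the archived WAL segments that sort before restart_file.

    Only plain 24-character hexadecimal segment names are considered, so
    .backup and .history files are never selected. Equal-length upper-case
    hex names compare in the same order as the numbers they encode.
    """
    hex_digits = set("0123456789ABCDEF")
    return sorted(
        name
        for name in archive_files
        if len(name) == 24 and set(name) <= hex_digits and name < restart_file
    )


archive = [
    "0000000100000000000000A5.00000068.backup",
    "0000000100000000000000A5",
    "0000000100000000000000A6",
    "0000000100000000000000A7",
    "0000000100000000000000A8",
    "0000000100000000000000A9",
]
# If the last restart point lies in segment A8, only A5..A7 may go.
print(removable_wal_files(archive, "0000000100000000000000A8"))
```

Segments A8 and A9 stay in the archive in this example even though both have been restored, because recovery may still need to restart from A8.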


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-13 01:15:55
Message-ID: 4828EBCB.6010004@dunslane.net

Simon Riggs wrote:
> On Mon, 2008-05-12 at 18:58 -0400, Andrew Dunstan wrote:
>
>
>> No, it had to do with pg_standby waiting for a WAL file that had already
>> gone, somehow. I will try to reproduce it when I get a spare moment.
>>
>
> Sounds like the bug I just fixed.
>

Yes, so I see. I didn't have that fix, so I'll test again with the patch.
>
>
>>> There is an outstanding Windows issue with pg_standby that your help
>>> would be appreciated with, shown on latest commitfest page. It's a
>>> Windows issue and I don't maintain a Windows dev environment.
>>>
>
>
>> The patch has been rejected for now, according to the Commitfest page.
>> Not sure what you want my help on.
>>
>
> Well, the patch was rejected long ago, not sure why it's in this
> commitfest. But it's an open issue on the Windows port.
>
>

Surely the right fix is to use the recently implemented
pgwin32_safestat() (if we aren't already - I suspect we probably are)
and remove the kluge in pg_standby.c.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-13 01:38:58
Message-ID: 6489.1210642738@sss.pgh.pa.us

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Simon Riggs wrote:
>> Well, the patch was rejected long ago, not sure why it's in this
>> commitfest. But it's an open issue on the Windows port.

> Surely the right fix is to use the recently implemented
> pgwin32_safestat() (if we aren't already - I suspect we probably are)
> and remove the kluge in pg_standby.c.

I think the open issue is how to know whether pgwin32_safestat fixes the
problem that the kluge tried to work around.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: odd output in restore mode
Date: 2008-05-13 02:40:38
Message-ID: 4828FFA6.7000806@dunslane.net

Robert Treat wrote:
> On Monday 12 May 2008 18:58:37 Andrew Dunstan wrote:
>
>> Simon Riggs wrote:
>>
>>>> Lastly, not quite related to this output, but in the same general area,
>>>> should we have an option on pg_standby to allow removing the archive
>>>> file after it has been restored?
>>>>
>>> There already is one, but it's more complex than that. (%r)
>>>
>> I was using %r. But the WAL files that have been restored (according to
>> the log) are still in the archive dir. So it looks like %r isn't working
>> properly.
>>
>>
>
> Are you sure you've moved past the latest restart point? Just because a WAL
> file has been processed doesn't mean it can be deleted.
>
>

Thanks. It wasn't that, but when I ran with the very latest patches this
problem went away.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-13 03:03:25
Message-ID: 482904FD.4030006@dunslane.net

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
>> Simon Riggs wrote:
>>
>>> Well, the patch was rejected long ago, not sure why it's in this
>>> commitfest. But it's an open issue on the Windows port.
>>>
>
>
>> Surely the right fix is to use the recently implemented
>> pgwin32_safestat() (if we aren't already - I suspect we probably are)
>> and remove the kluge in pg_standby.c.
>>
>
> I think the open issue is how to know whether pgwin32_safestat fixes the
> problem that the kluge tried to work around.
>
>
>

Well, I think we need to consider quite a number of scenarios. The
archive directory could be local, on a remote Windows machine, or on a
remote Samba server. The archive file could be copied by Windows copy,
or Unix cp, or scp, or rsync, among others.

I'd like to know the setup that was found to produce the error, to start
with.

cheers

andrew


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: odd output in restore mode
Date: 2008-05-13 05:37:24
Message-ID: 1210657044.29684.271.camel@ebony.site

On Mon, 2008-05-12 at 22:40 -0400, Andrew Dunstan wrote:
>
> Robert Treat wrote:
> > On Monday 12 May 2008 18:58:37 Andrew Dunstan wrote:
> >
> >> Simon Riggs wrote:
> >>
> >>>> Lastly, not quite related to this output, but in the same general area,
> >>>> should we have an option on pg_standby to allow removing the archive
> >>>> file after it has been restored?
> >>>>
> >>> There already is one, but it's more complex than that. (%r)
> >>>
> >> I was using %r. But the WAL files that have been restored (according to
> >> the log) are still in the archive dir. So it looks like %r isn't working
> >> properly.
> >>

> > Are you sure you've moved past the latest restart point? Just because a WAL
> > file has been processed doesn't mean it can be deleted.
> >
> Thanks. It wasn't that, but when I ran with the very latest patches this
> problem went away.
>

Thanks for testing.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-13 05:44:35
Message-ID: 1210657475.29684.280.camel@ebony.site

On Mon, 2008-05-12 at 23:03 -0400, Andrew Dunstan wrote:
> Tom Lane wrote:
> > Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> >> Simon Riggs wrote:
> >>
> >>> Well, the patch was rejected long ago, not sure why it's in this
> >>> commitfest. But it's an open issue on the Windows port.
> >>>
> >
> >> Surely the right fix is to use the recently implemented
> >> pgwin32_safestat() (if we aren't already - I suspect we probably are)
> >> and remove the kluge in pg_standby.c.
> >>
> >
> > I think the open issue is how to know whether pgwin32_safestat fixes the
> > problem that the kluge tried to work around.
> >
> Well, I think we need to consider quite a number of scenarios. The
> archive directory could be local, on a remote Windows machine, or on a
> remote Samba server. The archive file could be copied by Windows copy,
> or Unix cp, or scp, or rsync, among others.
>
> I'd like to know the setup that was found to produce the error, to start
> with.

It's a race condition, not a deterministic bug with recreatable
conditions. My understanding is the current code was introduced to work
around the implementation of stat on Windows which says the filesize is
correct even while it is still copying it. The 1sec delay fixed that but
is clearly not a foolproof fix and introduces a delay also, which was
the original complaint.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
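
The size check and the delay described above can be sketched as follows. This is illustrative Python under stated assumptions, not the code in pg_standby.c: the names wal_file_ready and settle_seconds are hypothetical, and the 16 MB figure is the default WAL segment size.

```python
import os
import time

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size in bytes


def wal_file_ready(path, segment_size=WAL_SEGMENT_SIZE, settle_seconds=0.0):
    """Report whether the file at path looks like a complete WAL segment.

    The test is size-only: the file counts as ready once stat reports the
    full segment size. settle_seconds adds a pause after that point, which
    narrows the race with a copy that is still writing but cannot close it.
    """
    try:
        size = os.stat(path).st_size
    except FileNotFoundError:
        return False  # not archived yet; the caller would poll again
    if size != segment_size:
        return False  # still growing, or not a full segment
    if settle_seconds > 0:
        time.sleep(settle_seconds)
    return True
```

If the copying program reports the final size before it has written the data, the size comparison passes too early, and any fixed pause is a guess about how long the copy still needs.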


From: "Dave Page" <dpage(at)pgadmin(dot)org>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-13 07:42:26
Message-ID: 937d27e10805130042m35584e2fk79dc3490173d3bd4@mail.gmail.com

On Tue, May 13, 2008 at 2:38 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
> > Simon Riggs wrote:
> >> Well, the patch was rejected long ago, not sure why it's in this
> >> commitfest. But it's an open issue on the Windows port.
>
> > Surely the right fix is to use the recently implemented
> > pgwin32_safestat() (if we aren't already - I suspect we probably are)
> > and remove the kluge in pg_standby.c.
>
> I think the open issue is how to know whether pgwin32_safestat fixes the
> problem that the kluge tried to work around.

Per the comments on the commitfest page, I don't believe it is.
pgwin32_safestat fixes a bug in which stat() returns stale information
(if memory serves). The hack in pg_standby was added because copy in
Windows appears to preallocate the required space for the file it's
copying, thus checking the file size to verify that the copy has
completed is not a valid test.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-13 08:32:55
Message-ID: 1210667576.29684.283.camel@ebony.site

On Tue, 2008-05-13 at 08:42 +0100, Dave Page wrote:
> On Tue, May 13, 2008 at 2:38 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> >
> > > Simon Riggs wrote:
> > >> Well, the patch was rejected long ago, not sure why it's in this
> > >> commitfest. But it's an open issue on the Windows port.
> >
> > > Surely the right fix is to use the recently implemented
> > > pgwin32_safestat() (if we aren't already - I suspect we probably are)
> > > and remove the kluge in pg_standby.c.
> >
> > I think the open issue is how to know whether pgwin32_safestat fixes the
> > problem that the kluge tried to work around.
>
> Per the comments on the commitfest page, I don't believe it is.
> pgwin32_safestat fixes a bug in which stat() returns stale information
> (if memory serves). The hack in pg_standby was added because copy in
> Windows appears to preallocate the required space for the file it's
> copying, thus checking the file size to verify that the copy has
> completed is not a valid test.

Could somebody suggest and test an improvement to the Windows code, to
fix the kluge?

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Dave Page <dpage(at)pgadmin(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-13 12:00:29
Message-ID: 482982DD.6060901@dunslane.net

Simon Riggs wrote:
> On Tue, 2008-05-13 at 08:42 +0100, Dave Page wrote:
>
>> On Tue, May 13, 2008 at 2:38 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>>
>>>
>>>> Simon Riggs wrote:
>>>>
>>> >> Well, the patch was rejected long ago, not sure why it's in this
>>> >> commitfest. But it's an open issue on the Windows port.
>>>
>>> > Surely the right fix is to use the recently implemented
>>> > pgwin32_safestat() (if we aren't already - I suspect we probably are)
>>> > and remove the kluge in pg_standby.c.
>>>
>>> I think the open issue is how to know whether pgwin32_safestat fixes the
>>> problem that the kluge tried to work around.
>>>
>> Per the comments on the commitfest page, I don't believe it is.
>> pgwin32_safestat fixes a bug in which stat() returns stale information
>> (if memory serves). The hack in pg_standby was added because copy in
>> Windows appears to preallocate the required space for the file it's
>> copying, thus checking the file size to verify that the copy has
>> completed is not a valid test.
>>
>
> Could somebody suggest and test an improvement to the Windows code, to
> fix the kluge?
>
>

Given what Dave says, I'm not sure there is an easy one, at least
without a lot of testing. Greg Stark's suggestion might or might not work.

However, we should probably make the behaviour switchable. If the
archive_command populating the archive_directory were rsync, for
example, this problem should not occur, because it copies to a temp
file, and then renames it, so we should never see an incomplete file
even though rsync also apparently preallocates space.

We should also document it better in the code, along the lines of Dave's
comment above.

cheers

andrew
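
The copy-to-a-temp-file-then-rename pattern attributed to rsync above can be sketched as follows. This is illustrative Python, not an archive_command anyone in the thread proposed; archive_atomically and the ".tmp-" prefix are hypothetical, and the sketch assumes the temporary file and the final name live on the same filesystem so the rename is a single step.

```python
import os
import shutil
import tempfile


def archive_atomically(src_path, archive_dir):
    """Copy src_path into archive_dir so readers never see a partial file.

    The data goes to a temporary name first and is renamed to the final
    name only after it has been fully written and flushed to disk.
    """
    final_path = os.path.join(archive_dir, os.path.basename(src_path))
    fd, tmp_path = tempfile.mkstemp(dir=archive_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp, open(src_path, "rb") as src:
            shutil.copyfileobj(src, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Note: os.replace overwrites an existing file of the same name;
        # a real archive command would normally refuse to do that.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a half-written temporary file behind.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

A standby that only looks for the final WAL file name either finds the whole file or finds nothing, so no size check or delay is needed to detect completion.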


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Dave Page <dpage(at)pgadmin(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-13 12:26:35
Message-ID: 482988FB.5070307@dunslane.net


I wrote:
>
> However, we should probably make the behaviour switchable. If the
> archive_command populating the archive_directory were rsync, for
> example, this problem should not occur, because it copies to a temp
> file, and then renames it, so we should never see an incomplete file
> even though rsync also apparently preallocates space.
>
>

Another and probably simpler thing to try would be the GnuWin32 version
of cp. If we can verify that it behaves itself, we should probably
recommend it for use in archive_command instead of the native Windows copy.

I'm still not sure how to construct a test, though.

cheers

andrew


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Dave Page <dpage(at)pgadmin(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-13 15:08:52
Message-ID: 20080513150852.GD6966@alvh.no-ip.org

Andrew Dunstan wrote:

> Another and probably simpler thing to try would be the GnuWin32 version
> of cp. If we can verify that it behaves itself, we should probably
> recommend it for use in archive_command instead of the native Windows
> copy.

Perhaps use xcopy, which should be more ubiquitous?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Dave Page <dpage(at)pgadmin(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-13 16:05:58
Message-ID: 4829BC66.4020602@dunslane.net

Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>
>
>> Another and probably simpler thing to try would be the GnuWin32 version
>> of cp. If we can verify that it behaves itself, we should probably
>> recommend it for use in archive_command instead of the native Windows
>> copy.
>>
>
> Perhaps use xcopy, which should be more ubiquitous?
>
>

I would be very surprised if xcopy did not exhibit the same
preallocating behaviour as copy.

cheers

andrew


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Dave Page <dpage(at)pgadmin(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-13 16:11:30
Message-ID: 20080513161130.GJ6966@alvh.no-ip.org

Andrew Dunstan wrote:

> I would be very surprised if xcopy did not exhibit the same
> preallocating behaviour as copy.

I, on the other hand, would not say anything until someone tried it, and
then wouldn't be surprised if it behaved either way :-)

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: "Dave Page" <dpage(at)pgadmin(dot)org>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-13 16:15:12
Message-ID: 937d27e10805130915n7fa68737ka573d43b516f7bc0@mail.gmail.com

On Tue, May 13, 2008 at 5:11 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Andrew Dunstan wrote:
>
> > I would be very surprised if xcopy did not exhibit the same
> > preallocating behaviour as copy.
>
> I, on the other hand, would not say anything until someone tried it, and
> then wouldn't be surprised if it behaved either way :-)

It pre-allocates the space as copy does. And yes, I did test :-p

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-18 12:38:12
Message-ID: 48302334.2020906@dunslane.net

Dave Page wrote:
> On Tue, May 13, 2008 at 5:11 PM, Alvaro Herrera
> <alvherre(at)commandprompt(dot)com> wrote:
>
>> Andrew Dunstan wrote:
>>
>> > I would be very surprised if xcopy did not exhibit the same
>> > preallocating behaviour as copy.
>>
>> I, on the other hand, would not say anything until someone tried it, and
>> then wouldn't be surprised if it behaved either way :-)
>>
>
> It pre-allocates the space as copy does. And yes, I did test :-p
>
>
>

Dave,

I don't know how you tested, but could you please repeat the test with
GnuWin32's cp.exe? If it doesn't preallocate the space then I think our
way forward is reasonably clear:

. we recommend its use for Windows archive_command settings
. we provide the delay kluge as switchable behaviour on Windows instead
of having it always on.

cheers

andrew


From: "Dave Page" <dpage(at)pgadmin(dot)org>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-05-18 21:16:17
Message-ID: 937d27e10805181416u9a81f5cl86e66f6d86e01439@mail.gmail.com
Lists: pgsql-hackers pgsql-patches

On Sun, May 18, 2008 at 1:38 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
> I don't know how you tested,

Copy a large file across a relatively slow network, and check the size
on the destination drive before it finishes.

> but could you please repeat the test with
> GnuWin32's cp.exe? If it doesn't preallocate the space then I think our way
> forward is reasonably clear:

It does not pre-allocate.

> . we recommend its use for Windows archive_command settings
> . we provide the delay kluge as switchable behaviour on Windows instead of
> having it always on.

Sounds reasonable to me.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: odd output in restore mode
Date: 2008-06-30 22:11:02
Message-ID: 200806302211.m5UMB3B21422@momjian.us
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan wrote:
>
> I have just been working on setting up a continuous recovery failover
> system, and noticed some odd log lines, shown below. (Using 8.3).
>
> First note that our parsing of recovery.conf in xlog.c is pretty bad,
> and at least we need to document the quirks if it's not going to be
> fixed. log_restartpoints is said to be boolean, but when I set it to an
> unquoted true I got a fatal error, while a quoted 'on' sets it to false,
> as seen. Ick. What is more, I apparently managed to get the recovery

I have fixed the boolean problem with the attached, applied patch. It
exposes guc.c::parse_bool() for use in xlog.c.

I assume all the other problems you reported have been corrected.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachment Content-Type Size
/rtmp/diff text/x-diff 3.3 KB

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-06-30 22:13:12
Message-ID: 200806302213.m5UMDCW22128@momjian.us
Lists: pgsql-hackers pgsql-patches

Dave Page wrote:
> On Sun, May 18, 2008 at 1:38 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> >
> > I don't know how you tested,
>
> Copy a large file across a relatively slow network, and check the size
> on the destination drive before it finishes.
>
> > but could you please repeat the test with
> > GnuWin32's cp.exe? If it doesn't preallocate the space then I think our way
> > forward is reasonably clear:
>
> It does not pre-allocate.
>
> > . we recommend its use for Windows archive_command settings
> > . we provide the delay kluge as switchable behaviour on Windows instead of
> > having it always on.
>
> Sounds reasonable to me.

Are there any changes we need to make here?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Dave Page <dpage(at)pgadmin(dot)org>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in restore mode
Date: 2008-06-30 23:29:03
Message-ID: 48696C3F.4060901@dunslane.net
Lists: pgsql-hackers pgsql-patches

Bruce Momjian wrote:
> Dave Page wrote:
>
>> On Sun, May 18, 2008 at 1:38 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>
>>> I don't know how you tested,
>>>
>> Copy a large file across a relatively slow network, and check the size
>> on the destination drive before it finishes.
>>
>>
>>> but could you please repeat the test with
>>> GnuWin32's cp.exe? If it doesn't preallocate the space then I think our way
>>> forward is reasonably clear:
>>>
>> It does not pre-allocate.
>>
>>
>>> . we recommend its use for Windows archive_command settings
>>> . we provide the delay kluge as switchable behaviour on Windows instead of
>>> having it always on.
>>>
>> Sounds reasonable to me.
>>
>
> Are there any changes we need to make here?
>
>

Yes. Simon has promised a patch to do the above.

cheers

andrew


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] odd output in restore mode
Date: 2008-07-01 08:39:08
Message-ID: 1214901548.3845.544.camel@ebony.site
Lists: pgsql-hackers pgsql-patches


On Mon, 2008-06-30 at 19:29 -0400, Andrew Dunstan wrote:
>
> Bruce Momjian wrote:
> > Dave Page wrote:
> >
> >> On Sun, May 18, 2008 at 1:38 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> >>
> >>> I don't know how you tested,
> >>>
> >> Copy a large file across a relatively slow network, and check the size
> >> on the destination drive before it finishes.
> >>
> >>
> >>> but could you please repeat the test with
> >>> GnuWin32's cp.exe? If it doesn't preallocate the space then I think our way
> >>> forward is reasonably clear:
> >>>
> >> It does not pre-allocate.
> >>
> >>
> >>> . we recommend its use for Windows archive_command settings
> >>> . we provide the delay kluge as switchable behaviour on Windows instead of
> >>> having it always on.
> >>>
> >> Sounds reasonable to me.
> >>
> >
> > Are there any changes we need to make here?
> >
> >
>
> Yes. Simon has promised a patch to do the above.

Patch implements

* recommendation to use GnuWin32 cp on Windows
* provide "holdtime" delay, default 0 (on all platforms)
* default stays same on Windows="copy" to ensure people upgrading don't
get stung

Patch should be backpatched to 8.3, plus to CVS HEAD.

We should recommend in next 8.3 release notes that people use "-p" or
"-l" rather than just letting it default.

Will add permalink to Wiki when patch appears in archives.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

Attachment Content-Type Size
pg_standby_win32.v1.patch text/x-patch 9.7 KB

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Dave Page" <dpage(at)pgadmin(dot)org>, "List pgsql-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] odd output in restore mode
Date: 2008-07-01 10:44:22
Message-ID: 486A0A86.1080509@enterprisedb.com
Lists: pgsql-hackers pgsql-patches

Simon Riggs wrote:
> Patch implements
>
> * recommendation to use GnuWin32 cp on Windows
> * provide "holdtime" delay, default 0 (on all platforms)
> * default stays same on Windows="copy" to ensure people upgrading don't
> get stung

This seems pretty kludgey to me. I wouldn't want to install GnuWin32
utilities on a production system just for the "cp" command, and I don't
know how I would tune holdtime properly for using "copy". And it seems
risky to have defaults that are known to not work reliably.

How about implementing a replacement function for "cp" ourselves? It
seems pretty trivial to do. We could use that on Unixes as well, which
would keep the differences between Win32 and other platforms smaller,
and thus ensure the codepath gets more testing.

(Sorry for jumping into the discussion so late, I didn't follow this
thread earlier, and just read it now in the archives while looking at
the patch.)

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] odd output in restore mode
Date: 2008-07-01 11:49:47
Message-ID: 1214912987.3845.584.camel@ebony.site
Lists: pgsql-hackers pgsql-patches


On Tue, 2008-07-01 at 13:44 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > Patch implements
> >
> > * recommendation to use GnuWin32 cp on Windows
> > * provide "holdtime" delay, default 0 (on all platforms)
> > * default stays same on Windows="copy" to ensure people upgrading don't
> > get stung
>
> This seems pretty kludgey to me. I wouldn't want to install GnuWin32
> utilities on a production system just for the "cp" command, and I don't
> know how I would tune holdtime properly for using "copy". And it seems
> risky to have defaults that are known to not work reliably.
>
> How about implementing a replacement function for "cp" ourselves? It
> seems pretty trivial to do. We could use that on Unixes as well, which
> would keep the differences between Win32 and other platforms smaller,
> and thus ensure the codepath gets more testing.
>
> (Sorry for jumping into the discussion so late, I didn't follow this
> thread earlier, and just read it now in the archives while looking at
> the patch.)

If you've heard complaints about any of this from users, I haven't.
AFAIK we're doing this because it *might* cause a problem. Bear in mind
that link is the preferred performance option, not copy. So AFAICS we're
tuning a secondary option on one specific port, without it being a
raised issue and in an area of code that will be superseded in the next
release.

So further embellishments would be a long way down my own priority list,
putting it politely. Yet I have no objections to the suggestion overall;
we have done that already for alter tablespace.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] odd output in restore mode
Date: 2008-07-01 15:20:43
Message-ID: 200807011520.m61FKhZ20311@momjian.us
Lists: pgsql-hackers pgsql-patches

Simon Riggs wrote:
> > > * recommendation to use GnuWin32 cp on Windows
> > > * provide "holdtime" delay, default 0 (on all platforms)
> > > * default stays same on Windows="copy" to ensure people upgrading don't
> > > get stung
> >
> > This seems pretty kludgey to me. I wouldn't want to install GnuWin32
> > utilities on a production system just for the "cp" command, and I don't
> > know how I would tune holdtime properly for using "copy". And it seems
> > risky to have defaults that are known to not work reliably.
> >
> > How about implementing a replacement function for "cp" ourselves? It
> > seems pretty trivial to do. We could use that on Unixes as well, which
> > would keep the differences between Win32 and other platforms smaller,
> > and thus ensure the codepath gets more testing.
> >
> > (Sorry for jumping into the discussion so late, I didn't follow this
> > thread earlier, and just read it now in the archives while looking at
> > the patch.)
>
> If you've heard complaints about any of this from users, I haven't.
> AFAIK we're doing this because it *might* cause a problem. Bear in mind
> that link is the preferred performance option, not copy. So AFAICS we're
> tuning a secondary option on one specific port, without it being a
> raised issue and in an area of code that will be superseded in the next
> release.
>
> So further embellishments would be a long way down my own priority list,
> putting it politely. Yet I have no objections to the suggestion overall;
> we have done that already for alter tablespace.

OK, based on these observations I think we need to learn more about the
issues before making any changes to our code.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-23 00:19:16
Message-ID: 48867904.9090001@sun.com
Lists: pgsql-hackers pgsql-patches


Below are my comments on the CommitFest patch:
pg_standby minor changes for Windows

Simon, I'm sorry you got me, a Postgres newbie, signed up for
reviewing your patch ;)

To start with, I'm not quite sure of the status of this patch
since Bruce's last comment on the -patches alias:

Bruce Momjian wrote:
> OK, based on these observations I think we need to learn more about the
> issues before making any changes to our code.

From easy to difficult:

1. Issues with applying the patch to CVS HEAD:

The second file in the patch
Index: doc/src/sgml/standby.sgml
appears to be misnamed -- the existing file in HEAD is
Index: doc/src/sgml/pgstandby.sgml

However, still had issues after fixing the file name:

md(at)Garu:~/pg/pgsql$ patch -c -p0 < ../pg_standby.patch
patching file contrib/pg_standby/pg_standby.c
patching file doc/src/sgml/pgstandby.sgml
Hunk #1 FAILED at 136.
Hunk #2 FAILED at 168.
Hunk #3 FAILED at 245.
Hunk #4 FAILED at 255.
4 out of 4 hunks FAILED -- saving rejects to file doc/src/sgml/pgstandby.sgml.rej

2. Missing description for new command-line options in pgstandby.sgml

Simon Riggs wrote:
> Patch implements
> * recommendation to use GnuWin32 cp on Windows

Saw that in the changes to pgstandby.sgml, and looks ok to me, but:
- no description of the proposed new command-line options -h and -p?

3. No coding style issues seen

Just one comment: the logic that selects the actual restore command to
be used has moved from CustomizableInitialize() to main() -- a matter
of personal taste, perhaps. But in my view the:
+ the #ifdef WIN32/HAVE_WORKING_LINK logic has become easier to read

4. Issue: missing break in switch, silent override of '-l' argument?

This behaviour has been in there before and is not addressed by the
patch: The user-selected Win32 "mklink" command mode is never applied
due to a missing 'break' in CustomizableInitialize():

    switch (restoreCommandType)
    {
        case RESTORE_COMMAND_WIN32_MKLINK:
            SET_RESTORE_COMMAND("mklink", WALFilePath, xlogFilePath);
        case RESTORE_COMMAND_WIN32_COPY:
            SET_RESTORE_COMMAND("copy", WALFilePath, xlogFilePath);
            break;

A similar behaviour on Non-Win32 platforms where the user-selected
"ln" may be silently changed to "cp" in main():

#if HAVE_WORKING_LINK
    restoreCommandType = RESTORE_COMMAND_LN;
#else
    restoreCommandType = RESTORE_COMMAND_CP;
#endif

If both Win32/Non-Win32 cases reflect the intended behaviour:
- I'd prefer a code comment in the above case-fall-through,
- suggest a message to the user about the ignored "ln" / "mklink",
- observe that the logic to override the '-l' option is now in two
places: CustomizableInitialize() and main().
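
To make the fall-through in item 4 concrete, here is a minimal, self-contained sketch. It is not the pg_standby source: the enum and both function names are hypothetical stand-ins, and the SET_RESTORE_COMMAND macro is replaced by a plain string assignment so the effect can be observed in isolation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the restore command types quoted above. */
typedef enum
{
    RESTORE_COMMAND_WIN32_MKLINK,
    RESTORE_COMMAND_WIN32_COPY
} RestoreCommandType;

/* Mirrors the quoted switch: there is no break after the mklink case. */
const char *
command_with_fallthrough(RestoreCommandType type)
{
    const char *cmd = "";

    switch (type)
    {
        case RESTORE_COMMAND_WIN32_MKLINK:
            cmd = "mklink";
            /* no break here, so "mklink" is overwritten by the next case */
            /* FALLTHROUGH */
        case RESTORE_COMMAND_WIN32_COPY:
            cmd = "copy";
            break;
    }
    return cmd;
}

/* The same switch with the missing break added. */
const char *
command_with_break(RestoreCommandType type)
{
    const char *cmd = "";

    switch (type)
    {
        case RESTORE_COMMAND_WIN32_MKLINK:
            cmd = "mklink";
            break;
        case RESTORE_COMMAND_WIN32_COPY:
            cmd = "copy";
            break;
    }
    assert(cmd[0] != '\0');     /* every enum value must be handled */
    return cmd;
}
```

With this sketch, command_with_fallthrough(RESTORE_COMMAND_WIN32_MKLINK) returns "copy", which is the silent override described above, while command_with_break() returns "mklink" for the same input.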

5. Minor wording issue in usage message on new '-p' option

I was wondering if the "always" in the usage text
fprintf(stderr, " -p always uses GNU compatible 'cp' command on all platforms\n");
is too strong, since multiple restore command options overwrite each
other, e.g. "-p -c" applies Windows's "copy" instead of Gnu's "cp".

6. Minor code comment suggestion

Unrelated to this patch, I wonder if the code comments on all four
time-related vars better read "seconds" instead of "amount of time":
    int sleeptime = 5;      /* amount of time to sleep between file checks */
    int holdtime = 0;       /* amount of time to wait once file appears full */
    int waittime = -1;      /* how long we have been waiting, -1 no wait yet */
    int maxwaittime = 0;    /* how long are we prepared to wait for? */

7. Question: benefits of separate holdtime option from sleeptime?

Simon Riggs wrote:
> * provide "holdtime" delay, default 0 (on all platforms)

Going back on the hackers+patches emails and parsing the code
comments, I'm sorry if I missed that, but I'm not sure I've understood
the exact tuning benefits that introducing the new holdtime option
provides over using the existing sleeptime, as it's been the case
(just on Win32 only).

8. Unresolved question of implementing now/later a "cp" replacement

Simon Riggs wrote:
> On Tue, 2008-07-01 at 13:44 +0300, Heikki Linnakangas wrote:
>> This seems pretty kludgey to me. I wouldn't want to install GnuWin32
>> utilities on a production system just for the "cp" command, and I don't
>> know how I would tune holdtime properly for using "copy". And it seems
>> risky to have defaults that are known to not work reliably.
>>
>> How about implementing a replacement function for "cp" ourselves? It
>> seems pretty trivial to do. We could use that on Unixes as well, which
>> would keep the differences between Win32 and other platforms smaller,
>> and thus ensure the codepath gets more testing.
>
> If you've heard complaints about any of this from users, I haven't.
> AFAIK we're doing this because it *might* cause a problem. Bear in mind
> that link is the preferred performance option, not copy. So AFAICS we're
> tuning a secondary option on one specific port, without it being a
> raised issue and in an area of code that will be superseded in the next
> release.
>
> So further embellishments would be a long way down my own priority list,
> putting it politely. Yet I have no objections to the suggestion overall;
> we have done that already for alter tablespace.

Don't have much to add to the whether/now/later question of providing
a "cp" replacement, but I guess the existing command-line options and
documentation wouldn't have to change with our own "cp" replacement
while the newly proposed '-h' and '-p' would become moot then, right?

Regards,
Martin


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-23 07:09:28
Message-ID: 1216796968.3894.532.camel@ebony.2ndQuadrant
Lists: pgsql-hackers pgsql-patches


On Tue, 2008-07-22 at 17:19 -0700, Martin Zaun wrote:
> 1. Issues with applying the patch to CVS HEAD:

Sounds awful. Thanks for the review, will fix.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-23 10:59:43
Message-ID: 1216810783.3894.602.camel@ebony.2ndQuadrant
Lists: pgsql-hackers pgsql-patches


On Tue, 2008-07-22 at 17:19 -0700, Martin Zaun wrote:

> 1. Issues with applying the patch to CVS HEAD:

For me, the patch applies cleanly to CVS HEAD.

I do notice that there are two files "standby.sgml" and
"pgstandby.sgml". I can't see where "standby.sgml" comes from, but I
haven't created it; perhaps it is a relic of the SGML build process.
I've recreated my source tree since I wrote the patch also. Weird.

I'll redo the patch so it points at pgstandby.sgml, which is the one
that's listed as being in the main source tree.

> 2. Missing description for new command-line options in pgstandby.sgml
>
> - no description of the proposed new command-line options -h and -p?

These are done. The problems applying the patch mean those hunks were missed.

> 3. No coding style issues seen
>
> Just one comment: the logic that selects the actual restore command to
> be used has moved from CustomizableInitialize() to main() -- a matter
> of personal taste, perhaps. But in my view the:
> + the #ifdef WIN32/HAVE_WORKING_LINK logic has become easier to read

Thanks

> 4. Issue: missing break in switch, silent override of '-l' argument?
>
> This behaviour has been in there before

Well spotted. I don't claim to test this for Windows.

> 5. Minor wording issue in usage message on new '-p' option
>
> I was wondering if the "always" in the usage text
> fprintf(stderr, " -p always uses GNU compatible 'cp' command on all platforms\n");
> is too strong, since multiple restore command options overwrite each
> other, e.g. "-p -c" applies Windows's "copy" instead of Gnu's "cp".

I was assuming you don't turn the switch off again immediately
afterwards.

> 6. Minor code comment suggestion
>
> Unrelated to this patch, I wonder if the code comments on all four
> time-related vars better read "seconds" instead of "amount of time":
> int sleeptime = 5; /* amount of time to sleep between file checks */
> int holdtime = 0; /* amount of time to wait once file appears full */
> int waittime = -1; /* how long we have been waiting, -1 no wait
> * yet */
> int maxwaittime = 0; /* how long are we prepared to wait for? */

As you say, unrelated to the patch.

> 7. Question: benefits of separate holdtime option from sleeptime?
>
> Simon Riggs wrote:
> > * provide "holdtime" delay, default 0 (on all platforms)
>
> Going back on the hackers+patches emails and parsing the code
> comments, I'm sorry if I missed that, but I'm not sure I've understood
> the exact tuning benefits that introducing the new holdtime option
> provides over using the existing sleeptime, as it's been the case
> (just on Win32 only).

This is central to the patch, since the complaint was about the delay
introduced by doing that previously.

> 8. Unresolved question of implementing now/later a "cp" replacement

The patch implements what's been agreed.

I'm not rewriting "cp", for reasons already discussed.

Not a comment to you Martin, but it's fairly clear that I'm not
maintaining this correctly for Windows. I've never claimed to have
tested this on Windows, and only included Windows related items as
requested by others. I need to make it clear that I'm not going to
maintain it at all, for Windows. If others wish to report Windows issues
then they can suggest appropriate fixes and test them also.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Martin Zaun" <Martin(dot)Zaun(at)Sun(dot)COM>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Dave Page" <dpage(at)pgadmin(dot)org>, "List pgsql-patches" <pgsql-patches(at)postgresql(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-23 18:38:56
Message-ID: 48877AC0.7070603@enterprisedb.com
Lists: pgsql-hackers pgsql-patches

Simon Riggs wrote:
> On Tue, 2008-07-22 at 17:19 -0700, Martin Zaun wrote:
>> 8. Unresolved question of implementing now/later a "cp" replacement
>
> The patch implements what's been agreed.
>
> I'm not rewriting "cp", for reasons already discussed.
>
> Not a comment to you Martin, but it's fairly clear that I'm not
> maintaining this correctly for Windows. I've never claimed to have
> tested this on Windows, and only included Windows related items as
> requested by others. I need to make it clear that I'm not going to
> maintain it at all, for Windows. If others wish to report Windows issues
> then they can suggest appropriate fixes and test them also.

Hmm. I just realized that replacing the "cp" command within pg_standby
won't help at all. The problem is with the command that copies the files
*to* the archivelocation that pg_standby polls, not with the copy
pg_standby does from archivelocation to pg_xlog. And we don't have much
control over that.

We really need a more reliable way of detecting that a file has been
fully copied. One simple improvement would be to check the xlp_magic
field of the last page, though it still wouldn't be bullet-proof.

Do the commands that preallocate the space keep the file exclusively
locked during the copy? If they do, shouldn't we get an error in trying
to run the restore copy command, and retry after the 1s sleep in
RestoreWALFileForRecovery? Though if the archive location is a samba
mount or something, I guess we can't rely on Windows-style exclusive
locking.
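
The xlp_magic idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from pg_standby: it assumes xlp_magic is the first 16-bit field of each WAL page header, takes the segment and page sizes as parameters (16 MB and 8 kB would be the usual defaults), and compares the last page's magic with the first page's rather than hard-coding a version-specific constant. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Heuristic completeness check for a WAL segment file.
 *
 * Returns  1 if the 16-bit value at the start of the last page is non-zero
 *            and equal to the one at the start of the first page,
 *          0 if it differs (e.g. the tail is still zero-filled),
 *         -1 if the file cannot be opened or is too short to read.
 */
int
last_page_magic_matches(const char *path, long seg_size, long page_size)
{
    FILE       *fp;
    uint16_t    first = 0;
    uint16_t    last = 0;
    int         result = -1;

    assert(path != NULL && page_size > 0 && seg_size >= page_size);

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* magic of the first page, then seek to the start of the last page */
    if (fread(&first, sizeof(first), 1, fp) == 1 &&
        fseek(fp, seg_size - page_size, SEEK_SET) == 0 &&
        fread(&last, sizeof(last), 1, fp) == 1)
        result = (first != 0 && first == last);

    fclose(fp);
    return result;
}
```

As noted above, such a check is not bullet-proof. It can be wrong in either direction, for instance if the copy writes pages out of order or if the tail of a segment legitimately contains no page header.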

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Dave Page" <dpage(at)pgadmin(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>, "List pgsql-patches" <pgsql-patches(at)postgresql(dot)org>, "Martin Zaun" <Martin(dot)Zaun(at)Sun(dot)COM>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-23 19:11:06
Message-ID: 48873C7E.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers pgsql-patches

>>> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> wrote:

> We really need a more reliable way of detecting that a file has been
> fully copied.

In our scripts we handle this by copying to a temp directory on the
same mount point as the archive directory and doing a mv to the
archive location when the copy is successfully completed. I think
that this even works on Windows. Could that just be documented as a
strong recommendation for the archive script?
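
A rough sketch of the copy-then-rename approach, written in portable C rather than as a shell script. The function name and its three path arguments are hypothetical; the sketch assumes the temporary path and the final path are on the same file system, that the final file does not already exist, and it does not force the data to disk before renaming, which a production archive script would probably want to do.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Copy src to tmp, then rename tmp to dst, so that a reader polling for
 * dst never sees a partially written file.
 * Returns 0 on success, -1 on failure (tmp is removed on failure).
 */
int
archive_via_temp(const char *src, const char *tmp, const char *dst)
{
    FILE       *in;
    FILE       *out;
    char        buf[8192];
    size_t      n;
    int         ok = 1;

    assert(src != NULL && tmp != NULL && dst != NULL);

    in = fopen(src, "rb");
    if (in == NULL)
        return -1;
    out = fopen(tmp, "wb");
    if (out == NULL)
    {
        fclose(in);
        return -1;
    }

    /* plain block copy into the temporary file */
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
    {
        if (fwrite(buf, 1, n, out) != n)
        {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;
    fclose(in);
    if (fclose(out) != 0)
        ok = 0;

    /* only a complete copy is moved into the polled location */
    if (ok && rename(tmp, dst) != 0)
        ok = 0;

    if (!ok)
    {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

The point of the design is that the rename happens only after the copy has finished, so the file size seen under the final name never grows. Whether rename behaves the same way on Windows is exactly what would need testing.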

-Kevin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-23 21:01:18
Message-ID: 48879C1E.9090600@dunslane.net
Lists: pgsql-hackers pgsql-patches

Kevin Grittner wrote:
>>>> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> wrote:
>
>> We really need a more reliable way of detecting that a file has been
>> fully copied.
>
> In our scripts we handle this by copying to a temp directory on the
> same mount point as the archive directory and doing a mv to the
> archive location when the copy is successfully completed. I think
> that this even works on Windows. Could that just be documented as a
> strong recommendation for the archive script?
>

Needs testing at least. If it does in fact work then we can just adjust
the docs and be done - or maybe provide a .bat file or perl script that
would work as an archive_command on Windows.

cheers

andrew


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-23 23:05:56
Message-ID: 1216854356.3894.699.camel@ebony.2ndQuadrant
Lists: pgsql-hackers pgsql-patches


On Wed, 2008-07-23 at 21:38 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Tue, 2008-07-22 at 17:19 -0700, Martin Zaun wrote:
> >> 8. Unresolved question of implementing now/later a "cp" replacement
> >
> > The patch implements what's been agreed.
> >
> > I'm not rewriting "cp", for reasons already discussed.
> >
> > Not a comment to you Martin, but it's fairly clear that I'm not
> > maintaining this correctly for Windows. I've never claimed to have
> > tested this on Windows, and only included Windows related items as
> > requested by others. I need to make it clear that I'm not going to
> > maintain it at all, for Windows. If others wish to report Windows issues
> > then they can suggest appropriate fixes and test them also.
>
> Hmm. I just realized that replacing the "cp" command within pg_standby
> won't help at all. The problem is with the command that copies the files
> *to* the archivelocation that pg_standby polls, not with the copy
> pg_standby does from archivelocation to pg_xlog. And we don't have much
> control over that.
>
> We really need a more reliable way of detecting that a file has been
> fully copied. One simple improvement would be to check the xlp_magic
> field of the last page, though it still wouldn't be bullet-proof.
>
> Do the commands that preallocate the space keep the file exclusively
> locked during the copy? If they do, shouldn't we get an error in trying
> to run the restore copy command, and retry after the 1s sleep in
> RestoreWALFileForRecovery? Though if the archive location is a samba
> mount or something, I guess we can't rely on Windows-style exclusive
> locking.

With respect, I need to refer you back to my last paragraph above.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-25 19:46:18
Message-ID: 1217015178.3894.992.camel@ebony.2ndQuadrant
Lists: pgsql-hackers pgsql-patches


On Tue, 2008-07-22 at 17:19 -0700, Martin Zaun wrote:

> reviewing your patch

Current status is this:

* My understanding is that Dave and Andrew (and therefore Simon) think
the approach proposed here is an acceptable one. Heikki disagrees and
wants a different approach. Perhaps I misunderstand.

* Patch needs work to complete the proposed approach

* I'm willing to change the patch, but not able to test it on Windows.

Is there someone able to test the patch, if I make the changes? If not,
we should just kick this out of the CommitFest queue now and be done. If
nobody cares enough about this issue to test a fix, we shouldn't bother.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-25 20:31:26
Message-ID: 398.1217017886@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Tue, 2008-07-22 at 17:19 -0700, Martin Zaun wrote:
>> reviewing your patch

> Current status is this:
> * My understanding is that Dave and Andrew (and therefore Simon) think
> the approach proposed here is an acceptable one. Heikki disagrees and
> wants a different approach. Perhaps I misunderstand.
> * Patch needs work to complete the proposed approach
> * I'm willing to change the patch, but not able to test it on Windows.

I thought the latest conclusion was that changing the behavior of
pg_standby itself wouldn't address the problem anyway, and that what we
need is just a docs patch recommending that people use safe copying
methods in their scripts that copy to the archive area?

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-25 20:53:02
Message-ID: 1217019182.3894.1007.camel@ebony.2ndQuadrant
Lists: pgsql-hackers pgsql-patches


On Fri, 2008-07-25 at 16:31 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > On Tue, 2008-07-22 at 17:19 -0700, Martin Zaun wrote:
> >> reviewing your patch
>
> > Current status is this:
> > * My understanding is that Dave and Andrew (and therefore Simon) think
> > the approach proposed here is an acceptable one. Heikki disagrees and
> > wants a different approach. Perhaps I misunderstand.
> > * Patch needs work to complete the proposed approach
> > * I'm willing to change the patch, but not able to test it on Windows.
>
> I thought the latest conclusion was that changing the behavior of
> pg_standby itself wouldn't address the problem anyway, and that what we
> need is just a docs patch recommending that people use safe copying
> methods in their scripts that copy to the archive area?

Plus the rest of this patch, which is really very simple.

pg_standby currently waits (on Windows) for the sleep time. We agreed
that this sleep would be on by default, but optional.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-25 20:58:57
Message-ID: 792.1217019537@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Fri, 2008-07-25 at 16:31 -0400, Tom Lane wrote:
>> I thought the latest conclusion was that changing the behavior of
>> pg_standby itself wouldn't address the problem anyway, and that what we
>> need is just a docs patch recommending that people use safe copying
>> methods in their scripts that copy to the archive area?

> Plus the rest of this patch, which is really very simple.

Why? AFAICT the patch is just a kluge that adds user-visible complexity
without providing a solution that's actually sure to work.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-25 21:25:01
Message-ID: 1217021101.3894.1025.camel@ebony.2ndQuadrant
Lists: pgsql-hackers pgsql-patches


On Fri, 2008-07-25 at 16:58 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > On Fri, 2008-07-25 at 16:31 -0400, Tom Lane wrote:
> >> I thought the latest conclusion was that changing the behavior of
> >> pg_standby itself wouldn't address the problem anyway, and that what we
> >> need is just a docs patch recommending that people use safe copying
> >> methods in their scripts that copy to the archive area?
>
> > Plus the rest of this patch, which is really very simple.
>
> Why? AFAICT the patch is just a kluge that adds user-visible complexity
> without providing a solution that's actually sure to work.

First, I'm not the one objecting to the current behaviour.

Currently, there is a wait in there that can be removed if you use a
copy utility that sets size after it does a copy. So we agreed to make
it optional (at PGCon).

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Dave Page" <dpage(at)pgadmin(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>, "List pgsql-patches" <pgsql-patches(at)postgresql(dot)org>, "Martin Zaun" <Martin(dot)Zaun(at)Sun(dot)COM>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-28 07:28:06
Message-ID: 488D7506.9000903@enterprisedb.com
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan wrote:
> Kevin Grittner wrote:
>>>>> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> wrote:
>>> We really need a more reliable way of detecting that a file has been
>>> fully copied.
>>
>> In our scripts we handle this by copying to a temp directory on the
>> same mount point as the archive directory and doing a mv to the
>> archive location when the copy is successfully completed. I think
>> that this even works on Windows. Could that just be documented as a
>> strong recommendation for the archive script?
>
> Needs testing at least. If it does in fact work then we can just adjust
> the docs and be done

Yeah.

> - or maybe provide a .bat file or perl script that
> would work as an archive_command on Windows.

We're not talking about archive_command. We're talking about the thing
that copies files to the directory that pg_standby polls.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-28 13:09:40
Message-ID: 488DC514.3030800@dunslane.net
Lists: pgsql-hackers pgsql-patches

Heikki Linnakangas wrote:
> Andrew Dunstan wrote:
>
>
>> - or maybe provide a .bat file or perl script that would work as an
>> archive_command on Windows.
>
> We're not talking about archive_command. We're talking about the thing
> that copies files to the directory that pg_standby polls.

Er, that's what the archive_command is. Look at the pg_standby docs and
you'll see that that's where we're currently recommending use of windows
copy. Perhaps you're confusing this with the restore_command?

cheers

andrew


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Dave Page" <dpage(at)pgadmin(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>, "List pgsql-patches" <pgsql-patches(at)postgresql(dot)org>, "Martin Zaun" <Martin(dot)Zaun(at)Sun(dot)COM>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-28 13:46:14
Message-ID: 488DCDA6.2020000@enterprisedb.com
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan wrote:
>
>
> Heikki Linnakangas wrote:
>> Andrew Dunstan wrote:
>>
>>
>>> - or maybe provide a .bat file or perl script that would work as an
>>> archive_command on Windows.
>>
>> We're not talking about archive_command. We're talking about the thing
>> that copies files to the directory that pg_standby polls.
>
> Er, that's what the archive_command is. Look at the pg_standby docs and
> you'll see that that's where we're currently recommending use of windows
> copy. Perhaps you're confusing this with the restore_command?

Oh, right. I was thinking that archive_command copies the files to an
archive location, and there's yet another process copying files from
there to the directory pg_standby polls. But indeed in the simple
configuration, archive_command is the command that we're interested in.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-28 15:59:12
Message-ID: Pine.GSO.4.64.0807281136140.26479@westnet.com
Lists: pgsql-hackers pgsql-patches

On Wed, 23 Jul 2008, Kevin Grittner wrote:

> In our scripts we handle this by copying to a temp directory on the
> same mount point as the archive directory and doing a mv to the
> archive location when the copy is successfully completed. I think
> that this even works on Windows. Could that just be documented as a
> strong recommendation for the archive script?

This is exactly what I always do. I think the way cp is shown in the
examples promotes what's really a bad practice for lots of reasons, this
particular problem being just one of them.

I've been working on an improved archive_command shell script that I
expect to submit for comments and potential inclusion in the documentation
as a better base for other people to build on. This is one of the options
for how it can operate. It would be painful but not impossible to convert
a subset of that script to run under Windows as well, at least enough to
cover this particular issue.
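
The copy-then-rename approach can be sketched as below. This is only an illustration under stated assumptions: copy_then_rename and its three path arguments are hypothetical names, the temporary path must be on the same filesystem as the destination for the rename to be a single step, and the sketch does not fsync the file or check whether the destination already exists, both of which a real archive script would probably want.

```c
#include <stdio.h>

/*
 * Hypothetical helper: copy src to tmp, then rename tmp to dst, so that a
 * reader polling the archive directory never sees a partially written dst.
 * Returns 0 on success and -1 on any failure.
 */
int
copy_then_rename(const char *src, const char *tmp, const char *dst)
{
    FILE       *in;
    FILE       *out;
    char        buf[8192];
    size_t      n;
    int         ok = 1;

    in = fopen(src, "rb");
    if (in == NULL)
        return -1;

    out = fopen(tmp, "wb");
    if (out == NULL)
    {
        fclose(in);
        return -1;
    }

    /* Copy the whole file into the temporary name first. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
    {
        if (fwrite(buf, 1, n, out) != n)
        {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;

    fclose(in);
    if (fclose(out) != 0)
        ok = 0;

    /* Only a complete copy is moved into place; otherwise clean up. */
    if (!ok || rename(tmp, dst) != 0)
    {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

Whether rename() replaces an existing destination differs between platforms, so that case would need testing on Windows before relying on it.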

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-28 16:06:17
Message-ID: 488DEE79.4010004@dunslane.net
Lists: pgsql-hackers pgsql-patches

Greg Smith wrote:
> On Wed, 23 Jul 2008, Kevin Grittner wrote:
>
>> In our scripts we handle this by copying to a temp directory on the
>> same mount point as the archive directory and doing a mv to the
>> archive location when the copy is successfully completed. I think
>> that this even works on Windows. Could that just be documented as a
>> strong recommendation for the archive script?
>
> This is exactly what I always do. I think the way cp is shown in the
> examples promotes what's really a bad practice for lots of reasons,
> this particular problem being just one of them.
>
> I've been working on an improved archive_command shell script that I
> expect to submit for comments and potential inclusion in the
> documentation as a better base for other people to build on. This is
> one of the options for how it can operate. It would be painful but not
> impossible to convert a subset of that script to run under Windows as
> well, at least enough to cover this particular issue.
>
>

A Perl script using the (standard) File::Copy module along with the
builtin function rename() should be moderately portable. It would be
nice not to have to maintain two scripts.

cheers

andrew


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Dave Page" <dpage(at)pgadmin(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>, "List pgsql-patches" <pgsql-patches(at)postgresql(dot)org>, "Martin Zaun" <Martin(dot)Zaun(at)Sun(dot)COM>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-29 14:27:23
Message-ID: 488F28CB.8000201@enterprisedb.com
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan wrote:
>
>
> Greg Smith wrote:
>> On Wed, 23 Jul 2008, Kevin Grittner wrote:
>>
>>> In our scripts we handle this by copying to a temp directory on the
>>> same mount point as the archive directory and doing a mv to the
>>> archive location when the copy is successfully completed. I think
>>> that this even works on Windows. Could that just be documented as a
>>> strong recommendation for the archive script?
>>
>> This is exactly what I always do. I think the way cp is shown in the
>> examples promotes what's really a bad practice for lots of reasons,
>> this particular problem being just one of them.
>>
>> I've been working on an improved archive_command shell script that I
>> expect to submit for comments and potential inclusion in the
>> documentation as a better base for other people to build on. This is
>> one of the options for how it can operate. It would be painful but not
>> impossible to convert a subset of that script to run under Windows as
>> well, at least enough to cover this particular issue.
>
> A Perl script using the (standard) File::Copy module along with the
> builtin function rename() should be moderately portable. It would be
> nice not to have to maintain two scripts.

It's also not very nice to require a Perl installation on Windows, just
for a replacement of Copy. Would a simple .bat script work?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-30 16:51:34
Message-ID: 48909C16.6070401@sun.com
Lists: pgsql-hackers pgsql-patches

Heikki Linnakangas wrote:
> Andrew Dunstan wrote:
>> Greg Smith wrote:
>>> On Wed, 23 Jul 2008, Kevin Grittner wrote:
>>>
>>> I've been working on an improved archive_command shell script that I
>>> expect to submit for comments and potential inclusion in the
>>> documentation as a better base for other people to build on. This is
>>> one of the options for how it can operate. It would be painful but
>>> not impossible to convert a subset of that script to run under
>>> Windows as well, at least enough to cover this particular issue.
>>
>> A Perl script using the (standard) File::Copy module along with the
>> builtin function rename() should be moderately portable. It would
>> be nice not to have to maintain two scripts.
>
> It's also not very nice to require a Perl installation on Windows, just
> for a replacement of Copy. Would a simple .bat script work?

With these avenues to be explored, can the pg_standby patch on the
CommitFest wiki be moved to the "Returned with Feedback" section?

Regards,
Martin


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Martin Zaun" <Martin(dot)Zaun(at)Sun(dot)COM>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Dave Page" <dpage(at)pgadmin(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>, "List pgsql-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-31 16:05:16
Message-ID: 4891E2BC.203@enterprisedb.com
Lists: pgsql-hackers pgsql-patches

Martin Zaun wrote:
> Heikki Linnakangas wrote:
>> Andrew Dunstan wrote:
>>> Greg Smith wrote:
>>>> On Wed, 23 Jul 2008, Kevin Grittner wrote:
>>>>
>>>> I've been working on an improved archive_command shell script that I
>>>> expect to submit for comments and potential inclusion in the
>>>> documentation as a better base for other people to build on. This is
>>>> one of the options for how it can operate. It would be painful but
>>>> not impossible to convert a subset of that script to run under
>>>> Windows as well, at least enough to cover this particular issue.
>>>
>>> A Perl script using the (standard) File::Copy module along with the
>>> builtin function rename() should be moderately portable. It would
>>> be nice not to have to maintain two scripts.
>>
>> It's also not very nice to require a Perl installation on Windows,
>> just for a replacement of Copy. Would a simple .bat script work?
>
> With these avenues to be explored, can the pg_standby patch on the
> CommitFest wiki be moved to the "Returned with Feedback" section?

Yes, I think we can conclude that we don't want this patch as it is.
Instead, we want a documentation patch that describes the problem,
mentioning that GNU cp is safe, or you can use the copy+rename trick.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Martin Zaun" <Martin(dot)Zaun(at)Sun(dot)COM>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Dave Page" <dpage(at)pgadmin(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>, "List pgsql-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-07-31 16:32:38
Message-ID: 9981.1217521958@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> Martin Zaun wrote:
>> With these avenues to be explored, can the pg_standby patch on the
>> CommitFest wiki be moved to the "Returned with Feedback" section?

> Yes, I think we can conclude that we don't want this patch as it is.
> Instead, we want a documentation patch that describes the problem,
> mentioning that GNU cp is safe, or you can use the copy+rename trick.

Right, after which we remove the presently hacked-in delay.

I've updated the commitfest page accordingly.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Bruce Momjian <bruce(at)momjian(dot)us>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS]odd output in restore mode
Date: 2008-08-02 10:07:49
Message-ID: 1217671669.3934.119.camel@ebony.t-mobile.de.
Lists: pgsql-hackers pgsql-patches


On Thu, 2008-07-31 at 12:32 -0400, Tom Lane wrote:
> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> > Martin Zaun wrote:
> >> With these avenues to be explored, can the pg_standby patch on the
> >> CommitFest wiki be moved to the "Returned with Feedback" section?
>
> > Yes, I think we can conclude that we don't want this patch as it is.
> > Instead, we want a documentation patch that describes the problem,
> > mentioning that GNU cp is safe, or you can use the copy+rename trick.
>
> Right, after which we remove the presently hacked-in delay.
>
> I've updated the commitfest page accordingly.

Well, this is a strange conclusion, leaving me slightly bemused.

The discussion between Andrew and me at PGcon concluded that we would
* document which other tools to use
* remove the delay

Now we have rejected the patch which does that, but then re-requested
the exact same thing again.

The patch interprets "remove the delay" as "remove the delay in a way
which will not screw up existing users of pg_standby when they upgrade".
Doing that requires us to have a configurable delay, which defaults to
the current behaviour, but that can be set to zero (the recommended
way). Which is what the patch implements.
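
To make the "configurable delay" idea concrete, here is a rough sketch of how such a setting could be interpreted. It is not taken from the patch: the function name, the default of one second and the upper bound are all assumptions chosen for illustration.

```c
#include <stdlib.h>

/* Hypothetical default, standing in for "the current behaviour". */
#define DEFAULT_COPY_DELAY_SECS 1
/* Hypothetical sanity limit on the delay. */
#define MAX_COPY_DELAY_SECS 3600

/*
 * Hypothetical helper: turn an optional command-line value into a delay in
 * seconds. A missing value keeps the default, "0" switches the delay off,
 * and anything unparsable or out of range yields -1.
 */
int
parse_copy_delay(const char *arg)
{
    char       *end;
    long        value;

    if (arg == NULL)
        return DEFAULT_COPY_DELAY_SECS;

    value = strtol(arg, &end, 10);
    if (end == arg || *end != '\0')
        return -1;              /* not a plain decimal number */
    if (value < 0 || value > MAX_COPY_DELAY_SECS)
        return -1;

    return (int) value;
}
```

With this shape, existing setups that pass no value keep waiting as before, while setups using a safe copy method can pass 0.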

Andrew, Heikki: ISTM it's time to just make the changes yourselves. This
is just going round and round to no benefit. This doesn't warrant such a
long discussion and review process.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Greg Smith <gsmith(at)gregsmith(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Bruce Momjian <bruce(at)momjian(dot)us>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS]odd output in restore mode
Date: 2008-08-02 14:27:21
Message-ID: 48946EC9.1010000@dunslane.net
Lists: pgsql-hackers pgsql-patches

Simon Riggs wrote:
> Well, this is a strange conclusion, leaving me slightly bemused.
>
> The discussion between Andrew and me at PGcon concluded that we would
> * document which other tools to use
> * remove the delay
>
> Now we have rejected the patch which does that, but then re-requested
> the exact same thing again.
>
> The patch interprets "remove the delay" as "remove the delay in a way
> which will not screw up existing users of pg_standby when they upgrade".
> Doing that requires us to have a configurable delay, which defaults to
> the current behaviour, but that can be set to zero (the recommended
> way). Which is what the patch implements.
>
> Andrew, Heikki: ISTM it's time to just make the changes yourselves. This
> is just going round and round to no benefit. This doesn't warrant such a
> long discussion and review process.
>

You ought to know by now that the length and ferocity of the discussion
bears no relation at all to the importance of the subject ;-)

Personally, I think it's reasonable to provide the delay as long as it's
switchable, although I would have preferred zero to be the default. If
we remove it altogether then we force bigger changes on people who are
currently using Windows copy. But I can live with that since changing
their archive_command is the better path by far anyway, either to use
Gnu cp or the copy / rename trick.

cheers

andrew


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Greg Smith <gsmith(at)gregsmith(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-11-11 22:24:41
Message-ID: 200811112224.mABMOfj27949@momjian.us
Lists: pgsql-hackers pgsql-patches


Have we made any progress on this, namely better documentation and
removing the Win32 delay code?

---------------------------------------------------------------------------

Andrew Dunstan wrote:
>
>
> Simon Riggs wrote:
> > Well, this is a strange conclusion, leaving me slightly bemused.
> >
> > The discussion between Andrew and me at PGcon concluded that we would
> > * document which other tools to use
> > * remove the delay
> >
> > Now we have rejected the patch which does that, but then re-requested
> > the exact same thing again.
> >
> > The patch interprets "remove the delay" as "remove the delay in a way
> > which will not screw up existing users of pg_standby when they upgrade".
> > Doing that requires us to have a configurable delay, which defaults to
> > the current behaviour, but that can be set to zero (the recommended
> > way). Which is what the patch implements.
> >
> > Andrew, Heikki: ISTM it's time to just make the changes yourselves. This
> > is just going round and round to no benefit. This doesn't warrant such a
> > long discussion and review process.
> >
>
> You ought to know by now that the length and ferocity of the discussion
> bears no relation at all to the importance of the subject ;-)
>
> Personally, I think it's reasonable to provide the delay as long as it's
> switchable, although I would have preferred zero to be the default. If
> we remove it altogether then we force bigger changes on people who are
> currently using Windows copy. But I can live with that since changing
> their archive_command is the better path by far anyway, either to use
> Gnu cp or the copy / rename trick.
>
> cheers
>
> andrew
>

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Greg Smith <gsmith(at)gregsmith(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-11-11 23:09:09
Message-ID: 491A1095.1030706@dunslane.net
Lists: pgsql-hackers pgsql-patches


I have a fairly large TODO list, and Simon has thrown in the towel (and
I imagine he also has a large TODO list).

Anyone else want to step in?

cheers

andrew

Bruce Momjian wrote:
> Have we made any progress on this, namely better documentation and
> removing the Win32 delay code?
>
> ---------------------------------------------------------------------------
>
> Andrew Dunstan wrote:
>
>> Simon Riggs wrote:
>>
>>> Well, this is a strange conclusion, leaving me slightly bemused.
>>>
>>> The discussion between Andrew and me at PGcon concluded that we would
>>> * document which other tools to use
>>> * remove the delay
>>>
>>> Now we have rejected the patch which does that, but then re-requested
>>> the exact same thing again.
>>>
>>> The patch interprets "remove the delay" as "remove the delay in a way
>>> which will not screw up existing users of pg_standby when they upgrade".
>>> Doing that requires us to have a configurable delay, which defaults to
>>> the current behaviour, but that can be set to zero (the recommended
>>> way). Which is what the patch implements.
>>>
>>> Andrew, Heikki: ISTM it's time to just make the changes yourselves. This
>>> is just going round and round to no benefit. This doesn't warrant such a
>>> long discussion and review process.
>>>
>>>
>> You ought to know by now that the length and ferocity of the discussion
>> bears no relation at all to the importance of the subject ;-)
>>
>> Personally, I think it's reasonable to provide the delay as long as it's
>> switchable, although I would have preferred zero to be the default. If
>> we remove it altogether then we force bigger changes on people who are
>> currently using Windows copy. But I can live with that since changing
>> their archive_command is the better path by far anyway, either to use
>> Gnu cp or the copy / rename trick.
>>
>> cheers
>>
>> andrew
>>
>>
>
>


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-12-15 21:15:51
Message-ID: 200812152115.mBFLFpV20891@momjian.us
Lists: pgsql-hackers pgsql-patches

Martin Zaun wrote:
> 4. Issue: missing break in switch, silent override of '-l' argument?
>
> This behaviour has been in there before and is not addressed by the
> patch: The user-selected Win32 "mklink" command mode is never applied
> due to a missing 'break' in CustomizableInitialize():
>
>     switch (restoreCommandType)
>     {
>         case RESTORE_COMMAND_WIN32_MKLINK:
>             SET_RESTORE_COMMAND("mklink", WALFilePath, xlogFilePath);
>         case RESTORE_COMMAND_WIN32_COPY:
>             SET_RESTORE_COMMAND("copy", WALFilePath, xlogFilePath);
>             break;

I have added the missing 'break' to CVS HEAD; thanks.
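
[Archive note] For readers unfamiliar with the failure mode Martin describes, here is a minimal standalone sketch of the fall-through. It is not the pg_standby source: the enum and the two functions are hypothetical stand-ins, and a plain string assignment stands in for the SET_RESTORE_COMMAND macro. It only shows why a selected "mklink" would end up as "copy" without the 'break'.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for pg_standby's restore command types. */
typedef enum
{
    RESTORE_COMMAND_COPY,
    RESTORE_COMMAND_MKLINK
} RestoreCommandType;

/*
 * Buggy variant: the MKLINK case has no 'break', so control falls
 * through into the COPY case and the command is overwritten.
 */
const char *
choose_command_buggy(RestoreCommandType type)
{
    const char *cmd = "";

    switch (type)
    {
        case RESTORE_COMMAND_MKLINK:
            cmd = "mklink";
            /* missing break: execution continues into the next case */
        case RESTORE_COMMAND_COPY:
            cmd = "copy";
            break;
    }
    return cmd;
}

/* Fixed variant: each case ends with its own 'break'. */
const char *
choose_command_fixed(RestoreCommandType type)
{
    const char *cmd = "";

    switch (type)
    {
        case RESTORE_COMMAND_MKLINK:
            cmd = "mklink";
            break;
        case RESTORE_COMMAND_COPY:
            cmd = "copy";
            break;
    }
    return cmd;
}
```

With the buggy variant, asking for RESTORE_COMMAND_MKLINK yields "copy"; with the fixed variant it yields "mklink". Some compilers can warn about this pattern (for example GCC's -Wimplicit-fallthrough).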

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-12-15 22:08:12
Message-ID: 200812152208.mBFM8C629422@momjian.us
Lists: pgsql-hackers pgsql-patches


Since this patch was rejected, I have added the attached documentation
to pg_standby to mention the sleep() we do.

---------------------------------------------------------------------------

Martin Zaun wrote:
>
> Below my comments on the CommitFest patch:
> pg_standby minor changes for Windows
>
> Simon, I'm sorry you got me, a Postgres newbie, signed up for
> reviewing your patch ;)
>
> To start with, I'm not quite sure of the status of this patch
> since Bruce's last comment on the -patches alias:
>
> Bruce Momjian wrote:
> > OK, based on these observations I think we need to learn more about the
> > issues before making any changes to our code.
>
> From easy to difficult:
>
> 1. Issues with applying the patch to CVS HEAD:
>
> The second file in the patch
> Index: doc/src/sgml/standby.sgml
> appears to be misnamed -- the existing file in HEAD is
> Index: doc/src/sgml/pgstandby.sgml
>
> However, still had issues after fixing the file name:
>
> md(at)Garu:~/pg/pgsql$ patch -c -p0 < ../pg_standby.patch
> patching file contrib/pg_standby/pg_standby.c
> patching file doc/src/sgml/pgstandby.sgml
> Hunk #1 FAILED at 136.
> Hunk #2 FAILED at 168.
> Hunk #3 FAILED at 245.
> Hunk #4 FAILED at 255.
> 4 out of 4 hunks FAILED -- saving rejects to file doc/src/sgml/pgstandby.sgml.rej
>
>
> 2. Missing description for new command-line options in pgstandby.sgml
>
> Simon Riggs wrote:
> > Patch implements
> > * recommendation to use GnuWin32 cp on Windows
>
> Saw that in the changes to pgstandby.sgml, and looks ok to me, but:
> - no description of the proposed new command-line options -h and -p?
>
>
> 3. No coding style issues seen
>
> Just one comment: the logic that selects the actual restore command to
> be used has moved from CustomizableInitialize() to main() -- a matter
> of personal taste, perhaps. But in my view the:
> + the #ifdef WIN32/HAVE_WORKING_LINK logic has become easier to read
>
>
> 4. Issue: missing break in switch, silent override of '-l' argument?
>
> This behaviour has been in there before and is not addressed by the
> patch: The user-selected Win32 "mklink" command mode is never applied
> due to a missing 'break' in CustomizableInitialize():
>
>     switch (restoreCommandType)
>     {
>         case RESTORE_COMMAND_WIN32_MKLINK:
>             SET_RESTORE_COMMAND("mklink", WALFilePath, xlogFilePath);
>         case RESTORE_COMMAND_WIN32_COPY:
>             SET_RESTORE_COMMAND("copy", WALFilePath, xlogFilePath);
>             break;
>
> A similar behaviour on Non-Win32 platforms where the user-selected
> "ln" may be silently changed to "cp" in main():
>
> #if HAVE_WORKING_LINK
> restoreCommandType = RESTORE_COMMAND_LN;
> #else
> restoreCommandType = RESTORE_COMMAND_CP;
> #endif
>
> If both Win32/Non-Win32 cases reflect the intended behaviour:
> - I'd prefer a code comment in the above case-fall-through,
> - suggest a message to the user about the ignored "ln" / "mklink",
> - observe that the logic to override the '-l' option is now in two
> places: CustomizableInitialize() and main().
>
>
> 5. Minor wording issue in usage message on new '-p' option
>
> I was wondering if the "always" in the usage text
> fprintf(stderr, " -p always uses GNU compatible 'cp' command on all platforms\n");
> is too strong, since multiple restore command options overwrite each
> other, e.g. "-p -c" applies Windows's "copy" instead of Gnu's "cp".
>
>
> 6. Minor code comment suggestion
>
> Unrelated to this patch, I wonder if the code comments on all four
> time-related vars better read "seconds" instead of "amount of time":
> int sleeptime = 5; /* amount of time to sleep between file checks */
> int holdtime = 0; /* amount of time to wait once file appears full */
> int waittime = -1; /* how long we have been waiting, -1 no wait
> * yet */
> int maxwaittime = 0; /* how long are we prepared to wait for? */
>
>
> 7. Question: benefits of separate holdtime option from sleeptime?
>
> Simon Riggs wrote:
> > * provide "holdtime" delay, default 0 (on all platforms)
>
> Going back on the hackers+patches emails and parsing the code
> comments, I'm sorry if I missed that, but I'm not sure I've understood
> the exact tuning benefits that introducing the new holdtime option
> provides over using the existing sleeptime, as it's been the case
> (just on Win32 only).
>
>
> 8. Unresolved question of implementing now/later a "cp" replacement
>
> Simon Riggs wrote:
> > On Tue, 2008-07-01 at 13:44 +0300, Heikki Linnakangas wrote:
> >> This seems pretty kludgey to me. I wouldn't want to install GnuWin32
> >> utilities on a production system just for the "cp" command, and I don't
> >> know how I would tune holdtime properly for using "copy". And it seems
> >> risky to have defaults that are known to not work reliably.
> >>
> >> How about implementing a replacement function for "cp" ourselves? It
> >> seems pretty trivial to do. We could use that on Unixes as well, which
> >> would keep the differences between Win32 and other platforms smaller,
> >> and thus ensure the codepath gets more testing.
> >
> > If you've heard complaints about any of this from users, I haven't.
> > AFAIK we're doing this because it *might* cause a problem. Bear in mind
> > that link is the preferred performance option, not copy. So AFAICS we're
> > tuning a secondary option on one specific port, without it being a
> > raised issue and in an area of code that will be superseded in the next
> > release.
> >
> > So further embellishments would be a long way down my own priority list,
> > putting it politely. Yet I have no objections to the suggestion overall;
> > we have done that already for alter tablespace.
>
> Don't have much to add to the whether/now/later question of providing
> a "cp" replacement, but I guess the existing command-line options and
> documentation wouldn't have to change with our own "cp" replacement
> while the newly proposed '-h' and '-p' would become moot then, right?
>
> Regards,
> Martin

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachment Content-Type Size
/rtmp/diff text/x-diff 1.2 KB

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-12-15 22:08:14
Message-ID: 4946D54E.4060406@hagander.net
Lists: pgsql-hackers pgsql-patches

Bruce Momjian wrote:
> Martin Zaun wrote:
>> 4. Issue: missing break in switch, silent override of '-l' argument?
>>
>> This behaviour has been in there before and is not addressed by the
>> patch: The user-selected Win32 "mklink" command mode is never applied
>> due to a missing 'break' in CustomizableInitialize():
>>
>>     switch (restoreCommandType)
>>     {
>>         case RESTORE_COMMAND_WIN32_MKLINK:
>>             SET_RESTORE_COMMAND("mklink", WALFilePath, xlogFilePath);
>>         case RESTORE_COMMAND_WIN32_COPY:
>>             SET_RESTORE_COMMAND("copy", WALFilePath, xlogFilePath);
>>             break;
>
> I have added the missing 'break' to CVS HEAD; thanks.

Why no backpatch to 8.3? Seems like a clear bugfix to me.

//Magnus


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-12-15 22:10:46
Message-ID: 200812152210.mBFMAkO29844@momjian.us
Lists: pgsql-hackers pgsql-patches

Magnus Hagander wrote:
> Bruce Momjian wrote:
> > Martin Zaun wrote:
> >> 4. Issue: missing break in switch, silent override of '-l' argument?
> >>
> >> This behaviour has been in there before and is not addressed by the
> >> patch: The user-selected Win32 "mklink" command mode is never applied
> >> due to a missing 'break' in CustomizableInitialize():
> >>
> >>     switch (restoreCommandType)
> >>     {
> >>         case RESTORE_COMMAND_WIN32_MKLINK:
> >>             SET_RESTORE_COMMAND("mklink", WALFilePath, xlogFilePath);
> >>         case RESTORE_COMMAND_WIN32_COPY:
> >>             SET_RESTORE_COMMAND("copy", WALFilePath, xlogFilePath);
> >>             break;
> >
> > I have added the missing 'break' to CVS HEAD; thanks.
>
> Why no backpatch to 8.3? Seems like a clear bugfix to me.

I knew that was going to be asked. At this point I am pulling comments
from rejected patches into CVS commits; these are not even submitted
patches. I am not comfortable backpatching anything when using that
system because obviously no one else even cared enough to submit a patch
for it, let alone test it. If someone wants to backpatch this or
submit a patch to be backpatched, that is fine.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-12-15 22:18:32
Message-ID: 200812152218.mBFMIWI02151@momjian.us
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> > Martin Zaun wrote:
> >> With these avenues to be explored, can the pg_standby patch on the
> >> CommitFest wiki be moved to the "Returned with Feedback" section?
>
> > Yes, I think we can conclude that we don't want this patch as it is.
> > Instead, we want a documentation patch that describes the problem,
> > mentioning that GNU cp is safe, or you can use the copy+rename trick.
>
> Right, after which we remove the presently hacked-in delay.
>
> I've updated the commitfest page accordingly.

I have documented the sleep() call and that GNU cp is safe, but did not
remove the delay, nor mention copy+rename.
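
[Archive note] The "copy+rename trick" mentioned in the thread is not spelled out here. As commonly understood, the archiving side writes the file under a temporary name and renames it to its final name only once the copy is complete, so a reader never sees a partially written file under the final name. The following is a rough sketch of that idea in portable C, not code from pg_standby or from any patch in this thread; the function name copy_then_rename and the ".tmp" suffix are assumptions made for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: copy src to "<dst>.tmp", then rename it to dst.
 * Returns 0 on success, -1 on any failure. The final name only appears
 * once the whole file has been written and closed.
 */
int
copy_then_rename(const char *src, const char *dst)
{
    char        tmp[1024];
    char        buf[8192];
    size_t      n;
    FILE       *in;
    FILE       *out;
    int         len;
    int         ok = 1;

    /* Build the temporary name; refuse paths that do not fit. */
    len = snprintf(tmp, sizeof(tmp), "%s.tmp", dst);
    if (len < 0 || (size_t) len >= sizeof(tmp))
        return -1;

    in = fopen(src, "rb");
    if (in == NULL)
        return -1;

    out = fopen(tmp, "wb");
    if (out == NULL)
    {
        fclose(in);
        return -1;
    }

    /* Copy in chunks, stopping at the first short write. */
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
    {
        if (fwrite(buf, 1, n, out) != n)
        {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;

    fclose(in);
    if (fclose(out) != 0)
        ok = 0;

    /* Publish under the final name only if the copy fully succeeded. */
    if (!ok || rename(tmp, dst) != 0)
    {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

Two caveats on this sketch: the temporary file must live on the same filesystem as the destination for the rename to be a simple metadata operation, and on Windows the C library's rename() fails if the destination already exists, so a real archive_command would need to handle that case. Durability (flushing to disk before the rename) is also left out.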

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Martin Zaun <Martin(dot)Zaun(at)Sun(dot)COM>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] odd output in restore mode
Date: 2008-12-15 22:30:56
Message-ID: 1229380256.8673.351.camel@ebony.2ndQuadrant
Lists: pgsql-hackers pgsql-patches


On Mon, 2008-12-15 at 17:10 -0500, Bruce Momjian wrote:
> >
> > Why no backpatch to 8.3? Seems like a clear bugfix to me.
>
> I knew that was going to be asked.

8.3 is really where this is needed. 8.4 has almost no need of this.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support