Quick Links

Re: max_standby_delay considered harmful

Lists:	pgsql-hackers

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	max_standby_delay considered harmful
Date:	2010-05-03 15:37:04
Message-ID:	16681.1272901024@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I've finally wrapped my head around exactly what the max_standby_delay
code is doing, and I'm not happy with it. The way that code is designed
is that the setting represents a maximum allowed difference between the
standby server's system clock and the commit timestamps it is reading
from the WAL log; whenever this difference exceeds the setting, we'll
kill standby queries in hopes of catching up faster. Now, I can see
the attraction of defining it that way, for certain use-cases.
However, I think it is too fragile and too badly implemented to be
usable in the real world; and it certainly can't be the default
operating mode. There are three really fundamental problems with it:

1. The timestamps we are reading from the log might be historical,
if we are replaying from archive rather than reading a live SR stream.
In the current implementation that means zero grace period for standby
queries. Now if your only interest is catching up as fast as possible,
that could be a sane behavior, but this is clearly not the only possible
interest --- in fact, if that's all you care about, why did you allow
standby queries at all?

2. There could be clock skew between the master and slave servers.
If the master's clock is a minute or so ahead of the slave's, again we
get into a situation where standby queries have zero grace period, even
though killing them won't do a darn thing to permit catchup. If the
master is behind the slave then we have an artificially inflated grace
period, which is going to slow down the slave.

3. There could be significant propagation delay from master to slave,
if the WAL stream is being transmitted with pg_standby or some such.
Again this results in cutting into the standby queries' grace period,
for no defensible reason.

In addition to these fundamental problems there's a fatal implementation
problem: the actual comparison is not to the master's current clock
reading, but to the latest commit, abort, or checkpoint timestamp read
from the WAL. Thus, if the last commit was more than max_standby_delay
seconds ago, zero grace time. Now if the master is really idle then
there aren't going to be any conflicts anyway, but what if it's running
only long-running queries? Or what happens when it was idle for awhile
and then starts new queries? Zero grace period, that's what.

We could possibly improve matters for the SR case by having walsender
transmit the master's current clock reading every so often (probably
once per activity cycle), outside the WAL stream proper. The receiver
could subtract off its own clock reading in order to measure the skew,
and then we could cancel queries if the de-skewed transmission time
falls too far behind. However this doesn't do anything to fix the cases
where we aren't reading (and caught up to) a live SR broadcast.

I'm inclined to think that we should throw away all this logic and just
have the slave cancel competing queries if the replay process waits
more than max_standby_delay seconds to acquire a lock. This is simple,
understandable, and behaves the same whether we're reading live data or
not. Putting in something that tries to maintain a closed-loop maximum
delay between master and slave seems like a topic for future research
rather than a feature we have to have in 9.0. And in any case we'd
still want the plain max delay for non-SR cases, AFAICS, because there's
no sane way to use closed-loop logic in other cases.

Comments?

regards, tom lane

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 16:48:58
Message-ID:	1272905338.4161.34992.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, 2010-05-03 at 11:37 -0400, Tom Lane wrote:

> I've finally wrapped my head around exactly what the max_standby_delay
> code is doing, and I'm not happy with it.

Yes, I don't think I'd call it perfect yet.

> have the slave cancel competing queries if the replay process waits
> more than max_standby_delay seconds to acquire a lock. This is simple,
> understandable, and behaves the same whether we're reading live data or
> not.

I have no objection, and would welcome, adding another behaviour, since
that just gives us a better chance of having this feature do something
useful.

> I'm inclined to think that we should throw away all this logic

HS has been through 2 Alphas with the current behaviour and it will go
through 0 Alphas with the newly proposed behaviour. At this stage of
proceedings, that is extremely dangerous and I don't wish to do that.
The likelihood that we replace it with something worse seems fairly
high/certain: snap decision making never quite considers all angles.
Phrases like "throw away all this logic" don't give me confidence that
people that agree with that perspective would understand what they are
signing up to.

> Putting in something that tries to maintain a closed-loop maximum
> delay between master and slave seems like a topic for future research
> rather than a feature we have to have in 9.0. And in any case we'd
> still want the plain max delay for non-SR cases, AFAICS, because there's
> no sane way to use closed-loop logic in other cases.

I will be looking for ways to improve this over time.

--
Simon Riggs www.2ndQuadrant.com

From:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 16:54:05
Message-ID:	4BDEFFAD.70002@kaltenbrunner.cc
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs wrote:
> On Mon, 2010-05-03 at 11:37 -0400, Tom Lane wrote:
>
>> I've finally wrapped my head around exactly what the max_standby_delay
>> code is doing, and I'm not happy with it.
>
> Yes, I don't think I'd call it perfect yet.
>
>> have the slave cancel competing queries if the replay process waits
>> more than max_standby_delay seconds to acquire a lock. This is simple,
>> understandable, and behaves the same whether we're reading live data or
>> not.
>
> I have no objection, and would welcome, adding another behaviour, since
> that just gives us a better chance of having this feature do something
> useful.
>
>> I'm inclined to think that we should throw away all this logic
>
> HS has been through 2 Alphas with the current behaviour and it will go
> through 0 Alphas with the newly proposed behaviour. At this stage of
> proceedings, that is extremely dangerous and I don't wish to do that.
> The likelihood that we replace it with something worse seems fairly
> high/certain: snap decision making never quite considers all angles.
> Phrases like "throw away all this logic" don't give me confidence that
> people that agree with that perspective would understand what they are
> signing up to.

I'm not really sure how much serious testing outside of the small set of
people mostly interested in one or another specific aspect of HS/SR has
been actually done with the alphas to be honest.
I just started testing HS yesterday and I already ran twice into the
general issue tom is complaining about with max_standby_delay...

Stefan

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 17:02:47
Message-ID:	1272906167.4161.35009.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, 2010-05-03 at 18:54 +0200, Stefan Kaltenbrunner wrote:

> I'm not really sure how much serious testing outside of the small set of
> people mostly interested in one or another specific aspect of HS/SR has
> been actually done with the alphas to be honest.
> I just started testing HS yesterday and I already ran twice into the
> general issue tom is complaining about with max_standby_delay...

I guarantee that if that proposal goes in, people will complain about
that also. Last minute behaviour changes are bad news. I don't object to
adding something, just don't take anything away. It's not like the code
for it is pages long or anything.

The trade off is HA or queries and two modes make sense for user choice.

--
Simon Riggs www.2ndQuadrant.com

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 17:13:59
Message-ID:	20100503171359.GF21875@tamriel.snowman.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> I guarantee that if that proposal goes in, people will complain about
> that also. Last minute behaviour changes are bad news. I don't object to
> adding something, just don't take anything away. It's not like the code
> for it is pages long or anything.

I have to disagree with this. If it goes into 9.0 this way then we're
signing up to support it for *years*. With something as fragile as the
existing setup (as outlined by Tom), that's probably not a good idea.
We've not signed up to support the existing behaviour at all yet-
alpha's aren't a guarentee of what we're going to release.

> The trade off is HA or queries and two modes make sense for user choice.

The option isn't being thrown out, it's just being made to depend on
something which is alot easier to measure while still being very useful
for the trade-off you're talking about. I don't really see a downside
to this, to be honest. Perhaps you could speak to the specific user
experience difference that you think there would be from this change?

+1 from me on Tom's proposal.

Thanks,

Stephen

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 17:21:08
Message-ID:	AANLkTimNx5bW0eTwcWUe079viqAQ9Sb5O-mdlMiZFb_v@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, May 3, 2010 at 11:37 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I'm inclined to think that we should throw away all this logic and just
> have the slave cancel competing queries if the replay process waits
> more than max_standby_delay seconds to acquire a lock.

What if we somehow get into a situation where the replay process is
waiting for a lock over and over and over again, because it keeps
killing conflicting processes but something restarts them and they
take locks over again? It seems hard to ensure that replay will make
adequate progress with any substantially non-zero value of
max_standby_delay under this definition.

...Robert

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 18:22:31
Message-ID:	1272910951.4161.35022.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, 2010-05-03 at 13:13 -0400, Stephen Frost wrote:
> * Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> > I guarantee that if that proposal goes in, people will complain about
> > that also. Last minute behaviour changes are bad news. I don't object to
> > adding something, just don't take anything away. It's not like the code
> > for it is pages long or anything.
>
> I have to disagree with this. If it goes into 9.0 this way then we're
> signing up to support it for *years*. With something as fragile as the
> existing setup (as outlined by Tom), that's probably not a good idea.
> We've not signed up to support the existing behaviour at all yet-
> alpha's aren't a guarentee of what we're going to release.

That's a great argument, either way. We will have to live with 9.0 for
many years and so that's why I mention having both. Make a choice either
way and we take a risk. Why?

> > The trade off is HA or queries and two modes make sense for user choice.
>
> The option isn't being thrown out, it's just being made to depend on
> something which is alot easier to measure while still being very useful
> for the trade-off you're talking about. I don't really see a downside
> to this, to be honest. Perhaps you could speak to the specific user
> experience difference that you think there would be from this change?
>
> +1 from me on Tom's proposal.

--
Simon Riggs www.2ndQuadrant.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 18:37:36
Message-ID:	1272911856.4161.35025.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, 2010-05-03 at 13:21 -0400, Robert Haas wrote:
> On Mon, May 3, 2010 at 11:37 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > I'm inclined to think that we should throw away all this logic and just
> > have the slave cancel competing queries if the replay process waits
> > more than max_standby_delay seconds to acquire a lock.
>
> What if we somehow get into a situation where the replay process is
> waiting for a lock over and over and over again, because it keeps
> killing conflicting processes but something restarts them and they
> take locks over again? It seems hard to ensure that replay will make
> adequate progress with any substantially non-zero value of
> max_standby_delay under this definition.

That is one argument against, and a reason why just one route is bad.

We already have more than one way, so another option is useful

--
Simon Riggs www.2ndQuadrant.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 19:12:46
Message-ID:	1272913966.4161.35034.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, 2010-05-03 at 13:13 -0400, Stephen Frost wrote:

> Perhaps you could speak to the specific user
> experience difference that you think there would be from this change?

The difference is really to do with the weight you give to two different
considerations

* avoid query cancellations
* avoid having recovery fall behind, so that failover time is minimised

Some people recognise the trade-offs and are planning multiple standby
servers dedicated to different roles/objectives.

Some people envisage Hot Standby as a platform for running very fast
SELECTs, for which retrying the query is a reasonable possibility and
for whom keeping the standby as up-to-date as possible is an important
consideration from a data freshness perspective. Others view HS as a
weapon against long running queries.

My initial view was that the High Availability goal/role should be the
default or most likely mode of operation. I would say that the current
max_standby_delay favours the HA route since it specifically limits the
amount by which server can fall behind.

Tom's proposed behaviour (has also been proposed before) favours the
avoid query cancellation route though could lead to huge amounts of lag.

I'm happy to have both options because I know this is a trade-off that
solution engineers want to have control of, not something we as
developers can choose ahead of time.

--
Simon Riggs www.2ndQuadrant.com

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 19:27:21
Message-ID:	20100503192721.GG21875@tamriel.snowman.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Mon, May 3, 2010 at 11:37 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > I'm inclined to think that we should throw away all this logic and just
> > have the slave cancel competing queries if the replay process waits
> > more than max_standby_delay seconds to acquire a lock.
>
> What if we somehow get into a situation where the replay process is
> waiting for a lock over and over and over again, because it keeps
> killing conflicting processes but something restarts them and they
> take locks over again? It seems hard to ensure that replay will make
> adequate progress with any substantially non-zero value of
> max_standby_delay under this definition.

That was my first question too- but I reread what Tom wrote and came to
a different conclusion: If the reply process waits more than
max_standby_delay to acquire a lock, then it will kill off *everything*
it runs into from that point forward, until it's done with whatever is
currently available. At that point, the 'timer' would reset back to
zero.

When/how that timer gets reset was a question I had, but I feel like
"until nothing is available" makes sense and is what I assumed Tom was
thinking.

Thanks,

Stephen

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 19:32:46
Message-ID:	20100503193246.GH21875@tamriel.snowman.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon,

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> Tom's proposed behaviour (has also been proposed before) favours the
> avoid query cancellation route though could lead to huge amounts of lag.

My impression of Tom's suggestion was that it would also be a maximum
amount of delay which would be allowed before killing off queries- not
that it would be able to wait indefinitely until no one is blocking.
Based on that, I don't know that there's really much user-seen behaviour
between the two, except in 'oddball' situations, where there's a time
skew between the servers, or a large lag, etc, in which case I think
Tom's proposal would be more likely what's 'expected', whereas what you
would get with the existing implementation (zero time delay, or far too
much) would be a 'gotcha'..

Thanks,

Stephen

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 19:37:08
Message-ID:	4BDF25E4.3060902@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon,

> My initial view was that the High Availability goal/role should be the
> default or most likely mode of operation. I would say that the current
> max_standby_delay favours the HA route since it specifically limits the
> amount by which server can fall behind.

I don't understand how Tom's approach would cause the slave to be
further behind than the current max_standy_delay code, and I can see
ways in which it would result in less delay. So, explain?

The main issue with Tom's list which struck me was that
max_standby_delay was linked to the system clock. HS is going to get
used by a lot of PG users who aren't running time sync on their servers,
or who let it get out of whack without fixing it. I'd thought that the
delay was somehow based on transaction timestamps coming from the
master. Keep in mind that there will be a *lot* of people using this
feature, including ones without compentent & available sysadmins.

The lock method appeals to me simply because it would eliminate the
"mass cancel" issues which Greg Smith was reporting every time the timer
runs down. That is, it seems to me that only the oldest queries would
be cancelled and not any new ones. The biggest drawback I can see to
Tom's approach is possible blocking on the slave due to the lock wait
from the recovery process. However, this could be managed with the new
lock-waits GUC, as well as statement timeout.

Overall, I think Tom's proposal gives me what I would prefer, which is
degraded performance on the slave but in ways which users are used to,
rather than a lot of query cancel, which will interfere with user
application porting.

Would the recovery lock show up in pg_locks? That would also be a good
diagnostic tool.

I am happy to test some of this on Amazon or GoGrid, which is what I was
planning on doing anyway.

P.S. can we avoid the "considered harmful" phrase? It carries a lot of
baggage ...

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 19:39:49
Message-ID:	21622.1272915589@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, May 3, 2010 at 11:37 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I'm inclined to think that we should throw away all this logic and just
>> have the slave cancel competing queries if the replay process waits
>> more than max_standby_delay seconds to acquire a lock.

> What if we somehow get into a situation where the replay process is
> waiting for a lock over and over and over again, because it keeps
> killing conflicting processes but something restarts them and they
> take locks over again?

They won't be able to take locks "over again", because the lock manager
won't allow requests to pass a pending previous request, except in
very limited circumstances that shouldn't hold here. They'll queue
up behind the replay process's lock request, not in front of it.
(If that isn't the case, it needs to be fixed, quite independently
of this concern.)

regards, tom lane

From:	"Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 19:41:01
Message-ID:	98262e6ddc08b92e913ed7bb559f60dc@biglumber.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> Based on that, I don't know that there's really much user-seen behaviour
> between the two, except in 'oddball' situations, where there's a time
> skew between the servers, or a large lag, etc, in which case I think

Certainly that one particular case can be solved by making the
servers be in time sync a prereq for HS working (in the traditional way).
And by "prereq" I mean a "user beware" documentation warning.

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005031539
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkvfJr0ACgkQvJuQZxSWSsgSRwCgwAZpKJDqHX28y90rCx/CNXDt
JGgAoO9JeoBacvTJ09UJ+o1Nek3KtcYR
=gvch
-----END PGP SIGNATURE-----

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 19:54:00
Message-ID:	1272916440.4161.35129.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, 2010-05-03 at 15:32 -0400, Stephen Frost wrote:
> Simon,
>
> * Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> > Tom's proposed behaviour (has also been proposed before) favours the
> > avoid query cancellation route though could lead to huge amounts of lag.
>
> My impression of Tom's suggestion was that it would also be a maximum
> amount of delay which would be allowed before killing off queries- not
> that it would be able to wait indefinitely until no one is blocking.
> Based on that, I don't know that there's really much user-seen behaviour
> between the two, except in 'oddball' situations, where there's a time
> skew between the servers, or a large lag, etc, in which case I think
> Tom's proposal would be more likely what's 'expected', whereas what you
> would get with the existing implementation (zero time delay, or far too
> much) would be a 'gotcha'..

If recovery waits for max_standby_delay every time something gets in its
way, it should be clear that if many things get in its way it will
progressively fall behind. There is no limit to this and it can always
fall further behind. It does result in fewer cancelled queries and I do
understand many may like that.

That is *significantly* different from how it works now. (Plus: If there
really was no difference, why not leave it as is?)

The bottom line is this is about conflict resolution. There is simply no
way to resolve conflicts without favouring one or other of the
protagonists. Whatever mechanism you come up with that favours one will,
disfavour the other. I'm happy to give choices, but I'm not happy to
force just one kind of conflict resolution.

--
Simon Riggs www.2ndQuadrant.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 20:08:05
Message-ID:	1272917285.4161.35186.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, 2010-05-03 at 15:39 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Mon, May 3, 2010 at 11:37 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> I'm inclined to think that we should throw away all this logic and just
> >> have the slave cancel competing queries if the replay process waits
> >> more than max_standby_delay seconds to acquire a lock.
>
> > What if we somehow get into a situation where the replay process is
> > waiting for a lock over and over and over again, because it keeps
> > killing conflicting processes but something restarts them and they
> > take locks over again?
>
> They won't be able to take locks "over again", because the lock manager
> won't allow requests to pass a pending previous request, except in
> very limited circumstances that shouldn't hold here. They'll queue
> up behind the replay process's lock request, not in front of it.
> (If that isn't the case, it needs to be fixed, quite independently
> of this concern.)

Most conflicts aren't lock-manager locks, they are snapshot conflicts,
though clearly different workloads will have different characteristics.

Some conflicts are buffer conflicts and the semantics of buffer cleanup
locks and many other internal locks are that shared locks queue jump
past exclusive lock requests. Not something we should touch, now at
least.

I understand that you aren't impressed by everything about the current
patch but rushed changes may not help either.

--
Simon Riggs www.2ndQuadrant.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 20:10:36
Message-ID:	AANLkTinigyOHRex2G2WrJKHlVZXPMpTTrjhg_WCbAeaM@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, May 3, 2010 at 3:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Mon, May 3, 2010 at 11:37 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I'm inclined to think that we should throw away all this logic and just
>>> have the slave cancel competing queries if the replay process waits
>>> more than max_standby_delay seconds to acquire a lock.
>
>> What if we somehow get into a situation where the replay process is
>> waiting for a lock over and over and over again, because it keeps
>> killing conflicting processes but something restarts them and they
>> take locks over again?
>
> They won't be able to take locks "over again", because the lock manager
> won't allow requests to pass a pending previous request, except in
> very limited circumstances that shouldn't hold here. They'll queue
> up behind the replay process's lock request, not in front of it.
> (If that isn't the case, it needs to be fixed, quite independently
> of this concern.)

Well, the new backends needn't try to take "the same" locks as the
existing backends - the point is that in the worst case this proposal
means waiting max_standby_delay for EACH replay that requires taking a
lock. And that might be a LONG time.

One idea I had while thinking this over was to bound the maximum
amount of unapplied WAL rather than the absolute amount of time lag.
Now, that's a little fruity, because your WAL volume might fluctuate
considerably, so you wouldn't really know how far the slave was behind
the master chronologically. However, it would avoid all the time skew
issues, and it would also more accurately model the idea of a bound on
recovery time should we need to promote the standby to master, so
maybe it works out to a win. You could still end up stuck
semi-permanently behind, but never by more than N segments.

Stephen's idea of a mode where we wait up to max_standby_delay for a
lock but then kill everything in our path until we've caught up again
is another possible way of approaching this problem, although it may
lead to "kill storms". Some of that may be inevitable, though: a
bound on WAL lag has the same issue - if the primary is generating WAL
faster than the standby can apply it, the standby will eventually
decide to slaughter everything in its path.

...Robert

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-03 22:04:32
Message-ID:	4BDF4870.9010101@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Greg, Robert,

> Certainly that one particular case can be solved by making the
> servers be in time sync a prereq for HS working (in the traditional way).
> And by "prereq" I mean a "user beware" documentation warning.
>

Last I checked, you work with *lots* of web developers and web
companies. I'm sure you can see the issue with the above.

> Stephen's idea of a mode where we wait up to max_standby_delay for a
> lock but then kill everything in our path until we've caught up again
> is another possible way of approaching this problem, although it may
> lead to "kill storms".

Personally, I thought that the kill storms were exactly what was wrong
with max_standby_delay. That is, with MSD, no matter *what* your
settings or traffic are, you're going to get query cancel occasionally.

I don't see the issue with Tom's approach from a wait perspective. The
max wait becomes 1.001X max_standby_delay; there's no way I can think of
that replay would wait longer than that. I've yet to see an explanation
why it would be longer.

Simon's assertion that not all operations take a conventional lock is a
much more serious potential flaw.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 02:45:10
Message-ID:	201005040245.o442jAg15359@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs wrote:
> On Mon, 2010-05-03 at 13:13 -0400, Stephen Frost wrote:
>
> > Perhaps you could speak to the specific user
> > experience difference that you think there would be from this change?
>
> The difference is really to do with the weight you give to two different
> considerations
>
> * avoid query cancellations
> * avoid having recovery fall behind, so that failover time is minimised
>
> Some people recognise the trade-offs and are planning multiple standby
> servers dedicated to different roles/objectives.

I understand Simon's point that the two behaviors have different
benefits. However, I believe few users will be able to understand when
to use which.

As I remember, 9.0 has two behaviors:

o master delays vacuum cleanup
o slave delays WAL application

and in 9.1 we will be adding:

o slave communicates snapshots to master

How would this figure into what we ultimately want in 9.1?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 07:36:14
Message-ID:	1272958574.4535.144.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, 2010-05-03 at 15:04 -0700, Josh Berkus wrote:

> I don't see the issue with Tom's approach from a wait perspective. The
> max wait becomes 1.001X max_standby_delay; there's no way I can think of
> that replay would wait longer than that. I've yet to see an explanation
> why it would be longer.

Yes, the max wait on any *one* blocker will be max_standby_delay. But if
you wait for two blockers, then the total time by which the standby lags
will now be 2*max_standby_delay. Add a third, fourth etc and the standby
lag keeps rising.

We need to avoid confusing these two measurables

* standby lag - defined as the total delay from when a WAL record is
written to the time the WAL record is applied. This includes both
transfer time and any delays imposed by Hot Standby.

* standby query delay - defined as the time that recovery will wait for
a query to complete before a cancellation takes place. (We could
complicate this by asking what happens when recovery is blocked twice by
the same query? Would it wait twice, or does it have to track how much
it has waited for each query in total so far?)

Currently max_standby_delay seeks to constrain the standby lag to a
particular value, as a way of providing a bounded time for failover, and
also to constrain the amount of WAL that needs to be stored as the lag
increases. Currently, there is no guaranteed minimum query delay given
to each query.

If every query is guaranteed its requested query delay then the standby
lag will be unbounded. Less cancellations, higher lag. Some people do
want this, though is not currently available. We can do this with two
new GUCs:

* standby_query_delay - USERSET parameter that allows user to specify a
guaranteed query delay, anywhere from 0 to maximum_standby_query_delay

* max_standby_query_delay - SIGHUP parameter - parameter exists to
provide DBA with a limit on the USERSET standby_query_delay, though I
can see some would say this is optional

Current behaviour is same as global settings of
standby_query_delay = 0
max_standby_query_delay = 0
max_standby_delay = X

So if people want minimal cancellations they would specify
standby_query_delay = Y (e.g. 30)
max_standby_query_delay = Z (e.g. 300)
max_standby_delay = -1

--
Simon Riggs www.2ndQuadrant.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 08:37:17
Message-ID:	1272962237.4535.458.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, 2010-05-03 at 22:45 -0400, Bruce Momjian wrote:

> As I remember, 9.0 has two behaviors:
>
> o master delays vacuum cleanup
> o slave delays WAL application
>
> and in 9.1 we will be adding:
>
> o slave communicates snapshots to master

> How would this figure into what we ultimately want in 9.1?

We would still want all options, since "slave communicates snapshot to
master" doesn't solve the problem it just moves the problem elsewhere.
It's a question of which factors the user wishes to emphasise for their
specific use.

> I understand Simon's point that the two behaviors have different
> benefits. However, I believe few users will be able to understand when
> to use which.

If users can understand how to set NDISTINCT for a column, they can
understand this. It's not about complexity of UI, its about solving
problems. When people hit an issue, I don't want to be telling people
"we thought you wouldn't understand it, so we removed the parachute".
They might not understand it *before* they hit a problem, so what? But
users certainly will afterwards and won't say "thanks" if you prevent an
option for them, especially for the stated reason. (My point about
ndistinct: 99% of users have no idea that exists or when to use it, but
it still exists as an option because it solves a known issue, just like
this.)

--
Simon Riggs www.2ndQuadrant.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 11:13:45
Message-ID:	AANLkTimhoUwZr8bqIkXTyiAWAr5GHJC-PjjRlbVu6Zd2@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, May 4, 2010 at 4:37 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> option for them, especially for the stated reason. (My point about
> ndistinct: 99% of users have no idea that exists or when to use it, but
> it still exists as an option because it solves a known issue, just like
> this.)

Slightly OT, but funnily enough, when I was up in New York a couple of
weeks ago with Bruce and a couple of other folks, I started talking
with a DBA up there about his frustrations with PostgreSQL, and - I'm
not making this up - the first example he gave me of something he
wished he could do in PG to improve query planning was manually
override ndistinct estimates. He was pleased to here that we'll have
that in 9.0 and I was pleased to be able to tell him it was my patch.
If you'd asked me what the odds that someone picking a missing feature
would have come up with that one were, I'd have said a billion-to-one
against. But I'm not making this up.

To be honest, I am far from convinced that the existing behavior is a
good one and I'm in favor of modifying it or ripping it out altogether
if we can think of something better. But it has to really be better,
of course, not just trading one set of pain points for another.

...Robert

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 11:41:40
Message-ID:	1272973300.4535.1328.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, 2010-05-04 at 07:13 -0400, Robert Haas wrote:
> On Tue, May 4, 2010 at 4:37 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > option for them, especially for the stated reason. (My point about
> > ndistinct: 99% of users have no idea that exists or when to use it, but
> > it still exists as an option because it solves a known issue, just like
> > this.)
>
> Slightly OT, but funnily enough, when I was up in New York a couple of
> weeks ago with Bruce and a couple of other folks, I started talking
> with a DBA up there about his frustrations with PostgreSQL, and - I'm
> not making this up - the first example he gave me of something he
> wished he could do in PG to improve query planning was manually
> override ndistinct estimates. He was pleased to here that we'll have
> that in 9.0 and I was pleased to be able to tell him it was my patch.
> If you'd asked me what the odds that someone picking a missing feature
> would have come up with that one were, I'd have said a billion-to-one
> against. But I'm not making this up.

It matches my experience. I think its a testament to the expertise of
our users as well to the hackers that have done so much to make that the
top of user's lists for change.

> To be honest, I am far from convinced that the existing behavior is a
> good one and I'm in favor of modifying it or ripping it out altogether
> if we can think of something better. But it has to really be better,
> of course, not just trading one set of pain points for another.

The only way I see as genuine better rather than just a different mix of
trade-offs is to come up with ways where there are no conflicts. Hannu
came up with one, using filesystem snapshots, but we haven't had time to
implement that yet.

--
Simon Riggs www.2ndQuadrant.com

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 13:12:48
Message-ID:	20100504131248.GI21875@tamriel.snowman.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> If recovery waits for max_standby_delay every time something gets in its
> way, it should be clear that if many things get in its way it will
> progressively fall behind. There is no limit to this and it can always
> fall further behind. It does result in fewer cancelled queries and I do
> understand many may like that.

Guess I wasn't very clear in my previous description of what I *think*
the change would be (Tom, please jump in if I've got this wrong..).
Recovery wouldn't wait max_standby_delay every time; I agree, that would
be a big change in behaviour and could make it very difficult for the
slave to keep up. Rather, recovery would proceed as normal until it
encounters a lock, at which point it would start a counting down from
max_standby_delay, if the lock is released before it hits that, then it
will move on, if another lock is encoutered, it would start counting
down from where it left off last time. If it hits zero, it'll cancel
the other query, and any other queries that get in the way, until it's
caught up again completely. Once recovery is fully caught up, the
counter would reset again to max_standby_delay.

> That is *significantly* different from how it works now. (Plus: If there
> really was no difference, why not leave it as is?)

Because it's much more complicated the way it is, it doesn't really work
as one would expect in a number of situations, and it's trying to
guarantee something that it probably can't.

> The bottom line is this is about conflict resolution. There is simply no
> way to resolve conflicts without favouring one or other of the
> protagonists. Whatever mechanism you come up with that favours one will,
> disfavour the other. I'm happy to give choices, but I'm not happy to
> force just one kind of conflict resolution.

I don't think anyone is trying to get rid of the knob entirely; you're
right, you can't please everyone all the time, so there has to be some
kind of knob there which people can adjust based on their particular use
case and system. This is about what exactly the knob is and how it's
implemented and documented.

Thanks,

Stephen

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 13:49:56
Message-ID:	1272980996.4535.1875.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, 2010-05-04 at 09:12 -0400, Stephen Frost wrote:
> * Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> > If recovery waits for max_standby_delay every time something gets in its
> > way, it should be clear that if many things get in its way it will
> > progressively fall behind. There is no limit to this and it can always
> > fall further behind. It does result in fewer cancelled queries and I do
> > understand many may like that.
>
> Guess I wasn't very clear in my previous description of what I *think*
> the change would be (Tom, please jump in if I've got this wrong..).
> Recovery wouldn't wait max_standby_delay every time; I agree, that would
> be a big change in behaviour and could make it very difficult for the
> slave to keep up. Rather, recovery would proceed as normal until it
> encounters a lock, at which point it would start a counting down from
> max_standby_delay, if the lock is released before it hits that, then it
> will move on, if another lock is encoutered, it would start counting
> down from where it left off last time. If it hits zero, it'll cancel
> the other query, and any other queries that get in the way, until it's
> caught up again completely. Once recovery is fully caught up, the
> counter would reset again to max_standby_delay.

This new clarification is almost exactly how it works already. Sounds
like the existing docs need some improvement.

The only difference is that max_standby_delay is measured from log
timestamp. Perhaps it should work from WAL receipt timestamp rather than
from log timestamp? That would make some of the problems go away without
significantly changing the definition. I'll look at that.

(And that conflicts are caused by more situations than just locks, but
that detail doesn't alter your point).

> > The bottom line is this is about conflict resolution. There is simply no
> > way to resolve conflicts without favouring one or other of the
> > protagonists. Whatever mechanism you come up with that favours one will,
> > disfavour the other. I'm happy to give choices, but I'm not happy to
> > force just one kind of conflict resolution.
>
> I don't think anyone is trying to get rid of the knob entirely; you're
> right, you can't please everyone all the time, so there has to be some
> kind of knob there which people can adjust based on their particular use
> case and system. This is about what exactly the knob is and how it's
> implemented and documented.

I'm happy with more than one way. It'd be nice if a single parameter,
giving one dimension of tuning, suited all ways people have said they
would like it to behave. I've not found a way of doing that.

I have no problem at all with adding additional parameters or mechanisms
to cater for the multiple dimensions of control people have asked for.
So your original interpretation is also valid for some users.

--
Simon Riggs www.2ndQuadrant.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 14:27:47
Message-ID:	1272983267.4535.2045.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Downthread, I said..

On Tue, 2010-05-04 at 14:49 +0100, Simon Riggs wrote:

> The only difference is that max_standby_delay is measured from log
> timestamp. Perhaps it should work from WAL receipt timestamp rather than
> from log timestamp? That would make some of the problems go away without
> significantly changing the definition. I'll look at that.

Patch to implement this idea attached: for discussion, not tested yet.
No docs yet.

The attached patch redefines "standby delay" to be the amount of time
elapsed from point of receipt to point of application. The "point of
receipt" is reset every chunk of data when streaming, or every file when
reading file by file. In all cases this new time is later than the
latest log time we would have used previously.

This addresses all of your points, as shown below.

On Mon, 2010-05-03 at 11:37 -0400, Tom Lane wrote:
> There are three really fundamental problems with it:
>
> 1. The timestamps we are reading from the log might be historical,
> if we are replaying from archive rather than reading a live SR stream.
> In the current implementation that means zero grace period for standby
> queries. Now if your only interest is catching up as fast as possible,
> that could be a sane behavior, but this is clearly not the only possible
> interest --- in fact, if that's all you care about, why did you allow
> standby queries at all?

The delay used is from time of receipt of WAL, no longer from log date.
So this would no longer apply.

> 2. There could be clock skew between the master and slave servers.
> If the master's clock is a minute or so ahead of the slave's, again we
> get into a situation where standby queries have zero grace period, even
> though killing them won't do a darn thing to permit catchup. If the
> master is behind the slave then we have an artificially inflated grace
> period, which is going to slow down the slave.

The timestamp is from standby, not master, so this would no longer
apply.

> 3. There could be significant propagation delay from master to slave,
> if the WAL stream is being transmitted with pg_standby or some such.
> Again this results in cutting into the standby queries' grace period,
> for no defensible reason.

The timestamp is taken immediately at the point the WAL is ready for
replay, so other timing overheads would not be included.

> In addition to these fundamental problems there's a fatal implementation
> problem: the actual comparison is not to the master's current clock
> reading, but to the latest commit, abort, or checkpoint timestamp read
> from the WAL. Thus, if the last commit was more than max_standby_delay
> seconds ago, zero grace time. Now if the master is really idle then
> there aren't going to be any conflicts anyway, but what if it's running
> only long-running queries? Or what happens when it was idle for awhile
> and then starts new queries? Zero grace period, that's what.
>
> We could possibly improve matters for the SR case by having walsender
> transmit the master's current clock reading every so often (probably
> once per activity cycle), outside the WAL stream proper. The receiver
> could subtract off its own clock reading in order to measure the skew,
> and then we could cancel queries if the de-skewed transmission time
> falls too far behind. However this doesn't do anything to fix the cases
> where we aren't reading (and caught up to) a live SR broadcast.

--
Simon Riggs www.2ndQuadrant.com

Attachment	Content-Type	Size
walrcv_timestamp.patch	text/x-patch	4.0 KB

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 14:42:29
Message-ID:	1272984149.4535.2133.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, 2010-05-04 at 14:49 +0100, Simon Riggs wrote:

Patch to implement this idea posted in response to OT, upthread, so I
can respond to the original complaints directly.

--
Simon Riggs www.2ndQuadrant.com

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 17:00:17
Message-ID:	201005041700.o44H0Hm10346@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs wrote:
> On Mon, 2010-05-03 at 22:45 -0400, Bruce Momjian wrote:
>
> > As I remember, 9.0 has two behaviors:
> >
> > o master delays vacuum cleanup
> > o slave delays WAL application
> >
> > and in 9.1 we will be adding:
> >
> > o slave communicates snapshots to master
>
> > How would this figure into what we ultimately want in 9.1?
>
> We would still want all options, since "slave communicates snapshot to
> master" doesn't solve the problem it just moves the problem elsewhere.
> It's a question of which factors the user wishes to emphasise for their
> specific use.
>
> > I understand Simon's point that the two behaviors have different
> > benefits. However, I believe few users will be able to understand when
> > to use which.
>
> If users can understand how to set NDISTINCT for a column, they can
> understand this. It's not about complexity of UI, its about solving
> problems. When people hit an issue, I don't want to be telling people
> "we thought you wouldn't understand it, so we removed the parachute".
> They might not understand it *before* they hit a problem, so what? But
> users certainly will afterwards and won't say "thanks" if you prevent an
> option for them, especially for the stated reason. (My point about
> ndistinct: 99% of users have no idea that exists or when to use it, but
> it still exists as an option because it solves a known issue, just like
> this.)

Well, this is kind of my point --- that if few people are going to need
a parameter and it is going to take us to tell them to use it, it isn't
a good parameter because the other 99.9% are going to stare at the
parameters and not konw what it does or how it is different from other
similar parameters. Adding another parameter might help 0.1% of our
users, but it is going to confuse the other 99.9%. :-(

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 17:36:35
Message-ID:	1272994595.4535.2705.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, 2010-05-04 at 13:00 -0400, Bruce Momjian wrote:

> Well, this is kind of my point --- that if few people are going to need
> a parameter and it is going to take us to tell them to use it, it isn't
> a good parameter because the other 99.9% are going to stare at the
> parameters and not konw what it does or how it is different from other
> similar parameters. Adding another parameter might help 0.1% of our
> users, but it is going to confuse the other 99.9%. :-(

You've missed my point. Most users of HS will need these parameters.
There is no need to understand them immediately, nor do I expect them to
do so. People won't understand why they exist until they've understood
the actual behaviour, received some errors and *then* they will
understand them, want them and need them. Just like deadlocks, ndistinct
and loads of other features we provide and support.

The current behaviour of max_standby_delay is designed to favour High
Availability users, not query users. I doubt that users with HA concerns
are only 0.1% of our users. I've accepted that some users may not put
that consideration first and so adding some minor, easy to implement
additional parameters will improve the behaviour for those people.
Forcing just one behaviour will be bad for many people.

--
Simon Riggs www.2ndQuadrant.com

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 18:27:29
Message-ID:	4BE06711.708@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon,

> Yes, the max wait on any *one* blocker will be max_standby_delay. But if
> you wait for two blockers, then the total time by which the standby lags
> will now be 2*max_standby_delay. Add a third, fourth etc and the standby
> lag keeps rising.

I still don't see how that works. If we're locking for applying log
segments, then any query which came in after the recovery lock would,
presumably, wait. So you'd have a lot of degraded query performance,
but no more than max_standby_delay of waiting to apply logs.

I'm more interested in your assertion that there's a lot in the
replication stream which doesn't take a lock; if that's the case, then
implementing any part of Tom's proposal is hopeless.

> * standby query delay - defined as the time that recovery will wait for
> a query to complete before a cancellation takes place. (We could
> complicate this by asking what happens when recovery is blocked twice by
> the same query? Would it wait twice, or does it have to track how much
> it has waited for each query in total so far?)

Aha! Now I see the confusion. AFAIK, Tom was proposing that the
pending recovery data would wait for max_standby_delay, total, then
cancel *all* queries which conflicted with it. Now that we've talked
this out, though, I can see that this can still result in "mass cancel"
issues, just like the current max_standby_delay. The main advantage I
can see to Tom's idea is that (presumably) it can be more discriminating
about which queries it cancels.

I agree that waiting on *each* query for "up to # time" would be a
completely different behavior, and as such, should be a option for DBAs.
We might make it the default option, but we wouldn't make it the only
option.

Speaking of which, was *your* more discriminating query cancel ever applied?

> Currently max_standby_delay seeks to constrain the standby lag to a
> particular value, as a way of providing a bounded time for failover, and
> also to constrain the amount of WAL that needs to be stored as the lag
> increases. Currently, there is no guaranteed minimum query delay given
> to each query.

Yeah, I can just see a lot of combinational issues with this. For
example, what if the user's network changes in some way to retard
delivery of log segments to the point where the delivery time is longer
than max_standby_delay? To say nothing about system clock synch, which
isn't perfect even if you have it set up.

I can see DBAs who are very focussed on HA wanting a standby-lag based
control anyway, when HA is far more important than the ability to run
queries on the slave. But I don't that that is the largest group; I
think that far more people will want to balance the two considerations.

Ultimately, as you say, we would like to have all three knobs:

standby lag: max time measured from master timestamp to slave timestamp

application lag: max time measured from local receipt of WAL records
(via log copy or recovery connection) to their application

query lag: max time any query which is blocking a recovery operation can run

These three, in combination, would let us cover most potential use
cases. So I think you've assessed that's where we're going in the
9.1-9.2 timeframe.

However, I'd say for 9.0 that "application lag" is the least confusing
option and the least dependant on the DBA's server room setup. So if we
can only have one of these for 9.0 (and I think going out with more than
one might be too complex, especially at this late date) I think that's
the way to go.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 18:31:19
Message-ID:	n2g407d949e1005041131h4fc0b29due129da7fb64e91f3@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, May 3, 2010 at 4:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> 1. The timestamps we are reading from the log might be historical,

> 2. There could be clock skew between the master and slave servers.

> 3. There could be significant propagation delay from master to slave,

So it sounds like what you're expecting is for max_standby_delay to
represent not the maximum lag between server commit and standby commit
but rather the maximum lag introduced by conflicts. Or perhaps maximum
lag introduced relative to the lag present at startup. I think it's
possible to implement either of these and it would solve all three
problems above:

The slave maintains a static measure of how far behind it is from the
master. Every time it executes a recovery operation or waits on a
conflict it adds the time it spent executing or waiting. Every time it
executes a commit record it subtracts the *difference* between this
commit record and the last. I assume we clip at 0 so it never goes
negative which has odd effects but it seems to match what I would
expect to happen.

In the face of a standby recovering historical logs then it would
start with a assumed delay of 0. As long as the conflicts don't slow
down execution of the logs so that they run slower than the server
then the measured delay would stay near 0. The only time queries would
be canceled would be if the conflicts are causing problems replaying
the logs.

In the face of clock skew it nothing changes as long as the clocks run
at the same speed.

In the face of an environment where the master is idle I think this
scheme has the same problems you described but I think this might be
manageable. Perhaps we need more timestamps in the master's log stream
aside from the commit timestamps. Or perhaps we don't care about
standby delay except when reading a commit record since any other
record isn't actually delayed unless its commit is delayed.

--
greg

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 19:50:31
Message-ID:	1273002631.4535.2900.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, 2010-05-04 at 11:27 -0700, Josh Berkus wrote:

> I still don't see how that works.
...

The good news is we agree by the time we get to the bottom... ;-)

> I'm more interested in your assertion that there's a lot in the
> replication stream which doesn't take a lock; if that's the case, then
> implementing any part of Tom's proposal is hopeless.

(No, still valid, the idea is generic)

> > * standby query delay - defined as the time that recovery will wait for
> > a query to complete before a cancellation takes place. (We could
> > complicate this by asking what happens when recovery is blocked twice by
> > the same query? Would it wait twice, or does it have to track how much
> > it has waited for each query in total so far?)
>
> Aha! Now I see the confusion.

BTW, Tom's proposal was approx half a sentence long so that is the
source of any confusion.

> AFAIK, Tom was proposing that the
> pending recovery data would wait for max_standby_delay, total, then
> cancel *all* queries which conflicted with it. Now that we've talked
> this out, though, I can see that this can still result in "mass cancel"
> issues, just like the current max_standby_delay. The main advantage I
> can see to Tom's idea is that (presumably) it can be more discriminating
> about which queries it cancels.

As I said to Stephen, this is exactly how it works already and wasn't
what was proposed.

> I agree that waiting on *each* query for "up to # time" would be a
> completely different behavior, and as such, should be a option for DBAs.
> We might make it the default option, but we wouldn't make it the only
> option.

Glad to hear you say that.

> Speaking of which, was *your* more discriminating query cancel ever applied?
>
> > Currently max_standby_delay seeks to constrain the standby lag to a
> > particular value, as a way of providing a bounded time for failover, and
> > also to constrain the amount of WAL that needs to be stored as the lag
> > increases. Currently, there is no guaranteed minimum query delay given
> > to each query.
>
> Yeah, I can just see a lot of combinational issues with this. For
> example, what if the user's network changes in some way to retard
> delivery of log segments to the point where the delivery time is longer
> than max_standby_delay? To say nothing about system clock synch, which
> isn't perfect even if you have it set up.
>
> I can see DBAs who are very focussed on HA wanting a standby-lag based
> control anyway, when HA is far more important than the ability to run
> queries on the slave. But I don't that that is the largest group; I
> think that far more people will want to balance the two considerations.
>
> Ultimately, as you say, we would like to have all three knobs:
>
> standby lag: max time measured from master timestamp to slave timestamp
>
> application lag: max time measured from local receipt of WAL records
> (via log copy or recovery connection) to their application

> query lag: max time any query which is blocking a recovery operation can run
>
> These three, in combination, would let us cover most potential use
> cases. So I think you've assessed that's where we're going in the
> 9.1-9.2 timeframe.
>
> However, I'd say for 9.0 that "application lag" is the least confusing
> option and the least dependant on the DBA's server room setup. So if we
> can only have one of these for 9.0 (and I think going out with more than
> one might be too complex, especially at this late date) I think that's
> the way to go.

Before you posted, I submitted a patch on this thread to redefine
max_standby_delay to depend upon the "application lag", as you've newly
defined it here - though obviously I didn't call it that. That solves
Tom's 3 issues. max_apply_delay might be technically more accurate term,
though isn't sufficiently better parameter name as to be worth the
change.

That patch doesn't implement his proposal, but that can be done as well
as (though IMHO not instead of). Given that two people have already
misunderstood what Tom proposed, and various people are saying we need
only one, I'm getting less inclined to have that at all.

--
Simon Riggs www.2ndQuadrant.com

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 20:57:38
Message-ID:	4BE08A42.40008@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

>> AFAIK, Tom was proposing that the
>> pending recovery data would wait for max_standby_delay, total, then
>> cancel *all* queries which conflicted with it. Now that we've talked
>> this out, though, I can see that this can still result in "mass cancel"
>> issues, just like the current max_standby_delay. The main advantage I
>> can see to Tom's idea is that (presumably) it can be more discriminating
>> about which queries it cancels.
>
> As I said to Stephen, this is exactly how it works already and wasn't
> what was proposed.

Well, it's not exactly how it works, as I understand it ... doesn't the
timer running out on the slave currently cancel *all* running queries
with old snapshots, regardless of what relations they touch?

> Before you posted, I submitted a patch on this thread to redefine
> max_standby_delay to depend upon the "application lag", as you've newly
> defined it here - though obviously I didn't call it that. That solves
> Tom's 3 issues. max_apply_delay might be technically more accurate term,
> though isn't sufficiently better parameter name as to be worth the
> change.

Yeah, that looks less complicated for admins. Thanks.

> That patch doesn't implement his proposal, but that can be done as well
> as (though IMHO not instead of). Given that two people have already
> misunderstood what Tom proposed, and various people are saying we need
> only one, I'm getting less inclined to have that at all.

Given your clarification on the whole set of behaviors, I'm highly
dubious about the idea of implementing Tom's proposal when we're already
Beta 1. It seems like a 9.1 thing.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 22:53:33
Message-ID:	21996.1273013613@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> Given your clarification on the whole set of behaviors, I'm highly
> dubious about the idea of implementing Tom's proposal when we're already
> Beta 1. It seems like a 9.1 thing.

I think you missed the point: "do nothing" is not a viable option.
I was proposing something that seemed simple enough to be safe to
drop into 9.0 at this point. I'm less convinced that what Simon
is proposing is safe enough.

regards, tom lane

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 23:06:00
Message-ID:	1273014360.4535.3069.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, 2010-05-04 at 18:53 -0400, Tom Lane wrote:

> I think you missed the point: "do nothing" is not a viable option.
> I was proposing something that seemed simple enough to be safe to
> drop into 9.0 at this point.

I've posted a patch that meets your stated objections. If you could
review that, this could be done in an hour.

There are other ways, but you'll need to explain a proposal in enough
detail that we're clear what you actually mean.

> I'm less convinced that what Simon is proposing is safe enough.

Which proposal?

--
Simon Riggs www.2ndQuadrant.com

From:	Greg Smith <greg(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 23:26:41
Message-ID:	4BE0AD31.9010600@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane wrote:
> 1. The timestamps we are reading from the log might be historical,
> if we are replaying from archive rather than reading a live SR stream.
> In the current implementation that means zero grace period for standby
> queries. Now if your only interest is catching up as fast as possible,
> that could be a sane behavior, but this is clearly not the only possible
> interest --- in fact, if that's all you care about, why did you allow
> standby queries at all?
>

If the standby is not current, you may not want people to execute
queries against it. In some situations, returning results against
obsolete data is worse than not letting the query execute at all. As I
see it, the current max_standby_delay implementation includes the
expectation that the results you are getting are no more than
max_standby_delay behind the master, presuming that new data is still
coming in. If the standby has really fallen further behind than that,
there are situations where you don't want it doing anything but catching
up until that is no longer the case, and you especially don't want it
returning stale query data.

The fact that tuning in that direction could mean the standby never
actually executes any queries is something you need to monitor for--it
suggests the standby isn't powerful/well connected to the master enough
to keep up--but that's not necessarily the wrong behavior. Saying "I
only want the standby to execute queries if it's not too far behind the
master" is the answer to "why did you allow standby queries at all?"
when tuning for that use case.

> 2. There could be clock skew between the master and slave servers.
>

Not the database's problem to worry about. Document that time should be
carefully sync'd and move on. I'll add that.

Then people should adjust their max_standby_delay upwards to account for
that. For high availability purposes, it's vital that the delay number
be referenced to the commit records on the master. If lag is eating a
portion of that, again it's something people should be monitoring for,
but not something we can correct. The whole idea here is that
max_standby_delay is an upper bound on how stale the data on the standby
can be, and whether or not lag is a component to that doesn't impact how
the database is being asked to act.

> In addition to these fundamental problems there's a fatal implementation
> problem: the actual comparison is not to the master's current clock
> reading, but to the latest commit, abort, or checkpoint timestamp read
> from the WAL.
Right; this has been documented for months at
http://wiki.postgresql.org/wiki/Hot_Standby_TODO and on the list before
that, i.e. "If there's little activity in the master, that can lead to
surprising results." The suggested long-term fix has been adding
keepalive timestamps into SR, which seems to get reinvented every time
somebody plays with this for a bit. The HS documentation improvements
I'm working on will suggest that you make sure this doesn't happen, that
people have some sort of keepalive WAL-generating activity on the
master regularly, if they expect max_standby_delay to work reasonably in
the face of an idle master. It's not ideal, but it's straightforward to
work around in user space.

> I'm inclined to think that we should throw away all this logic and just
> have the slave cancel competing queries if the replay process waits
> more than max_standby_delay seconds to acquire a lock. This is simple,
> understandable, and behaves the same whether we're reading live data or
> not.

I don't consider something that allows queries to execute when not
playing recent "live" data is necessarily a step forward, from the
perspective of implementations preferring high-availability. It's
reasonable for some people to request that the last thing a standby
that's not current (<max_standby_delay behind the master, based on the
last thing received) should be doing is answering any queries, when it
doesn't have current data and it should be working on catchup instead.

Discussion here obviously has wandered past your fundamental objections
here and onto implementation trivia, but I didn't think the difference
between what you expected and what's actually committed already was
properly addressed before doing that.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 23:33:56
Message-ID:	22734.1273016036@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Greg Smith <greg(at)2ndquadrant(dot)com> writes:
> If the standby is not current, you may not want people to execute
> queries against it. In some situations, returning results against
> obsolete data is worse than not letting the query execute at all. As I
> see it, the current max_standby_delay implementation includes the
> expectation that the results you are getting are no more than
> max_standby_delay behind the master, presuming that new data is still
> coming in. If the standby has really fallen further behind than that,
> there are situations where you don't want it doing anything but catching
> up until that is no longer the case, and you especially don't want it
> returning stale query data.

That is very possibly a useful thing to be able to specify, but the
current implementation has *nothing whatsoever* to do with making such a
guarantee. It will only kill queries that are creating a lock conflict.
I would even argue that it's a bad thing to have a parameter that looks
like it might do that, when it doesn't.

regards, tom lane

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 23:34:07
Message-ID:	4BE0AEEF.4070304@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 5/4/10 4:26 PM, Greg Smith wrote:
>
> Not the database's problem to worry about. Document that time should be
> carefully sync'd and move on. I'll add that.

Releasing a hot standby which *only* works for users with an operational
ntp implementation is highly unrealistic. Having built-in replication
in PostgreSQL was supposed to give the *majority* of users a *simple*
option for 2-server failover, not cater only to the high end. Every
administrative requirement we add to HS/SR eliminates another set of
potential users, as well as adding another set of potential failure
conditions which need to be monitored.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-04 23:51:12
Message-ID:	1273017072.26743.36.camel@jd-desktop.unknown.charter.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, 2010-05-04 at 16:34 -0700, Josh Berkus wrote:
> On 5/4/10 4:26 PM, Greg Smith wrote:
> >
> > Not the database's problem to worry about. Document that time should be
> > carefully sync'd and move on. I'll add that.
>
> Releasing a hot standby which *only* works for users with an operational
> ntp implementation is highly unrealistic. Having built-in replication
> in PostgreSQL was supposed to give the *majority* of users a *simple*
> option for 2-server failover, not cater only to the high end. Every
> administrative requirement we add to HS/SR eliminates another set of
> potential users, as well as adding another set of potential failure
> conditions which need to be monitored.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering

From:	Greg Smith <greg(at)2ndquadrant(dot)com>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 00:21:02
Message-ID:	4BE0B9EE.6070908@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Josh Berkus wrote:
> Having built-in replication in PostgreSQL was supposed to give the *majority* of users a *simple*
> option for 2-server failover, not cater only to the high end.

If that's your criteria, 9.0 has already failed that goal. Just the
fact that you have to make your own base backup and manage that whole
area alone excludes "simple" as a goal anyone can claim 9.0 meets with a
straight face, long before you get to the mechanics of how HS handles
query cancellation. The new replication oriented features are
functional, but neither are close to simple yet. Based on the
complication level of replication in other database products, I wouldn't
put money on that even being possible. You can make a simpler path the
default one, but the minute you want to support more than one use case
the complexity involved in setting up replication explodes.

Anyway, I have no idea where the idea that recommending time
synchronization is a somehow a "high end" requirement, given that every
OS I'm aware of makes that trivial nowadays. Slave servers that drift
too far away from the master time are going to cause all sorts of
problems for user apps too. Any app that gauges how long ago something
happened by comparing a database timestamp with now() is going to give
misleading results for example, and I know I see those all the time.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 00:31:07
Message-ID:	24335.1273019467@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Greg Smith <greg(at)2ndquadrant(dot)com> writes:
> Anyway, I have no idea where the idea that recommending time
> synchronization is a somehow a "high end" requirement,

Considering that clock skew was only one of several scenarios in which
the max_standby_delay code misbehaves, it's not that important whether
you consider it highly probable or not. The code still needs a
redesign, and we may as well eliminate the assumption of tight
synchronization while we are at it. There's no really good reason to
have that requirement in there.

regards, tom lane

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 00:48:31
Message-ID:	4BE0C05F.7030707@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

> Releasing a hot standby which *only* works for users with an operational
> ntp implementation is highly unrealistic. Having built-in replication
> in PostgreSQL was supposed to give the *majority* of users a *simple*
> option for 2-server failover, not cater only to the high end. Every
> administrative requirement we add to HS/SR eliminates another set of
> potential users, as well as adding another set of potential failure
> conditions which need to be monitored.

To be completely practical, I'm saying that we should apply & test
Simon's latest patch moving the delay calculation to be application lag
instead of standby lag.

I'm also suggesting that we should have a standby lag option for 9.1 (as
well as, probably, a "wait forever" option ala Tom's suggestion).

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 03:06:32
Message-ID:	201005050306.o4536WH17128@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane wrote:
> Greg Smith <greg(at)2ndquadrant(dot)com> writes:
> > Anyway, I have no idea where the idea that recommending time
> > synchronization is a somehow a "high end" requirement,
>
> Considering that clock skew was only one of several scenarios in which
> the max_standby_delay code misbehaves, it's not that important whether
> you consider it highly probable or not. The code still needs a
> redesign, and we may as well eliminate the assumption of tight
> synchronization while we are at it. There's no really good reason to
> have that requirement in there.

Should I be concerned that we are redesigning HS features at this stage
in the release?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 03:38:05
Message-ID:	AANLkTinDIyJgOZqPBymJA6bMiw3ZCrCau6Lp1h5eZ1Ko@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, May 4, 2010 at 11:06 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Should I be concerned that we are redesigning HS features at this stage
> in the release?

Yep. You can decide whether you want to be concerned by the redesign
itself, or by the concerns about the underlying code that are
motivating the redesigns, but yes, you should definitely be concerned.

...Robert

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 04:01:04
Message-ID:	4BE0ED80.3030102@dunslane.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> On Tue, May 4, 2010 at 11:06 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
>> Should I be concerned that we are redesigning HS features at this stage
>> in the release?
>>
>
> Yep. You can decide whether you want to be concerned by the redesign
> itself, or by the concerns about the underlying code that are
> motivating the redesigns, but yes, you should definitely be concerned.
>
>
>

Our process is shot to pieces. But then, we knew that, didn't we ;-)

cheers

andrew

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 04:18:38
Message-ID:	t2s407d949e1005042118m3c74ee7fif7699cccf897e857@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 5, 2010 at 5:01 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> Our process is shot to pieces. But then, we knew that, didn't we ;-)
>

Franky I think these kinds of usability questions are things that
we'll never have great success getting feedback on without users
banging on the system. There are solutions but none of them are
perfect. We a) either let releases go out with functionality that
nobody's sure users will be happy with and promise that the next
release will polish it, b) commit big functionality changes only at
the beginning of the release cycle and live with a two-release cycle
latency for such changes, or c) make last-minute changes during betas.

So far we've seen a combination of (a) and (c). I think (a) has worked
out well, sure we have some non-ideal behaviour in new features but
people forgive and forget when the later releases polish it. I've
always advocated for (b) though, I think we've been lucky with the
cases where we've needed last minute adjustments that they've worked
better than we had a right to expect.

--
greg

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 07:16:33
Message-ID:	1273043793.4535.3758.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, 2010-05-04 at 23:06 -0400, Bruce Momjian wrote:

> Should I be concerned that we are redesigning HS features at this stage
> in the release?

We knew we had to have one final discussion on HS snapshots. This is it.

Tom has raised valid issues, all of which already known. If we can
address them, we should.

A straightforward patch [walrcv_timestamp.patch] to address all of those
points. (Posted 13 hours prior to your post. That it was ignored by all
while debate continued is one point of concern, for me, though there
seems to have been confusion as to what that patch actually was.)

Tom has also raised a separate proposal, though that hasn't yet been
properly explained and there has been much debate about what he actually
meant. It is possible there is something worthwhile there, if that
involves adding a new capability. Myself, Stephen, Josh and Greg say
that changing max_standby_delay so there is no bounded startup time
would be a bad thing, if that is its only behaviour in 9.0.

I will tidy up walrcv_timestamp.patch and apply on Thu evening unless
there are concise, rational objections to that patch, which I consider
to be a bug fix and not blocked by beta.

Tom raised 7 other main points, that following detailed investigation
have resulted in 2 minor bugs, 2 unresolved questions on the patch and 1
further request for code comments. The 2 bugs affect corner cases only
and so are minor. They will be fixed over next few days since not
instant fixes. Open items list updated with items mentioned here, plus
performance query discussed on other thread. Nothing much here likely to
cause a problem if we need to go beta immediately, IMO.

I am mostly unavailable for next few days. (Repairing bikeshed.)

Expect at least 3 commits from me over next few days.

--
Simon Riggs www.2ndQuadrant.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 10:23:12
Message-ID:	AANLkTikoo-P8cbjFLLozsCRfcnMAgT5YlIRrUB7ANTqG@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 5, 2010 at 3:16 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Expect at least 3 commits from me over next few days.

I think you need to rethink the way that you decide when it's time to
commit things. There is certainly no consensus on any of the things
you are proposing to commit, nor have they been adequately (or, uh, at
all) reviewed. Saying that your proposal addresses all of Tom's
objections doesn't make it so. I am planning to read that patch and
offer an opinion on it, but I haven't done so yet and I imagine Tom
will weigh in at some point as well. Racing to commit a pile of code
that nobody else has tested is not going to improve anything.

...Robert

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 13:46:15
Message-ID:	4BE176A7.20902@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs wrote:
> The attached patch redefines "standby delay" to be the amount of time
> elapsed from point of receipt to point of application. The "point of
> receipt" is reset every chunk of data when streaming, or every file when
> reading file by file. In all cases this new time is later than the
> latest log time we would have used previously.

This seems completely wrong to me. If the WAL receiver keeps receiving
stuff, (last receive timestamp) - (current timestamp) would never be
more than a few seconds. Regardless of how much applying the WAL has
fallen behind.

To accomplish what you're trying to accomplish, you would need to label
each received WAL record with the timestamp when it was received, and
compare the reception timestamp of the record you're applying against
current timestamp.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 13:58:13
Message-ID:	4BE17975.6010200@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane wrote:
> Comments?

There's currently three ways to set max_standby_delay:

max_standby_delay = -1 # Query wins
max_standby_delay = 0 # Recovery wins
max_standby_delay > X # Query wins until lag > X.

As Tom points out, the 3rd option has all sorts of problems. I very much
like the behavior that max_standby_delay tries to accomplish, but I have
to agree that it's not very reliable as it is. I don't like Tom's
proposal either; the standby can fall behind indefinitely, and queries
get a varying grace period.

Let's rip out the concept of a delay altogether, and make it a boolean.
If you really want your query to finish, set it to -1 (using the current
max_standby_delay nomenclature). If recovery is important to you, set it
to 0.

If you have the monitoring in place to sensibly monitor the delay
between primary and standby, and you want a limit on that, you can put
together a script to flip the switch in postgresql.conf if the standby
falls too much behind.

It would be nice to make that settable per-session, BTW. Though as soon
as you have one session using -1, the standby could fall behind. Still,
it might be useful if you run both kinds of queries on the same standby.

Ok, now that we've gotten over that, here's another proposal for what a
delay setting could look like. Let's have a setting similar to
statement_timeout, that specifies how long a statement is allowed to run
until it becomes subject to killing if it conflicts with recovery
(actually, it would have to be a per-transaction setting, at least in
serializable mode). This would be similar to Tom's proposal, and it
would have the same drawback that it would give no guarantee on how much
the standby can fall behind. However, it would be easier to understand:
a query gets to run for X seconds, and after that it will be killed if
it gets in the way.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 14:19:03
Message-ID:	87d3xah1ns.fsf@hi-media-techno.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Tom Lane wrote:
>> Comments?
>
> There's currently three ways to set max_standby_delay:
>
> max_standby_delay = -1 # Query wins
> max_standby_delay = 0 # Recovery wins
> max_standby_delay > X # Query wins until lag > X.
>
> As Tom points out, the 3rd option has all sorts of problems. I very much
> like the behavior that max_standby_delay tries to accomplish, but I have
> to agree that it's not very reliable as it is. I don't like Tom's
> proposal either; the standby can fall behind indefinitely, and queries
> get a varying grace period.
>
> Let's rip out the concept of a delay altogether, and make it a boolean.
> If you really want your query to finish, set it to -1 (using the current
> max_standby_delay nomenclature). If recovery is important to you, set it
> to 0.

I can't help but insisting on it, sorry. But.

The obvious solution to this problem for me is that to either make the
boolean reload friendly or to have pause/resume recovery. Ideally, both.

Then the default setting would be recovery wins, you pause the standby
replaying to ensure your query runs to completion. Very crude setting,
but 9.0 would offer easy to setup slave for *either* HA *or* off-load,
and a way to mitigate somehow.

The automated educated conflict solving based on some sort of timeout
running for one or all the current queries seems much harder to agree
upon when compared to applying existing code we tough we wouldn't yet
need. Let's revisit that decision: it seems to me we need it for 9.0.

Regards,
--
dim

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 14:56:43
Message-ID:	4BE1872B.5000103@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs wrote:
> I am mostly unavailable for next few days. (Repairing bikeshed.)

Hey, you're supposed to do the bikeshedding on-list! ;-)

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 15:10:39
Message-ID:	2213.1273072239@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> To accomplish what you're trying to accomplish, you would need to label
> each received WAL record with the timestamp when it was received, and
> compare the reception timestamp of the record you're applying against
> current timestamp.

Yeah, this is why I thought that closed-loop lag control was a research
project.

In practice, we don't have to track it at the individual record level.
The real behavior of walsender is that we get a "gob" of WAL each
activity cycle, and so tracking the WAL start location and receipt time
for each gob ought to be sufficient. (In fact trying to ascribe any
finer-grain receipt time than that to individual WAL records is probably
bogus anyway.) It might be enough to remember the start location and
time for the latest gob, depending on exactly what control algorithm you
want to use.

But the whole thing requires significant thought and testing, which we
really haven't got time for now.

regards, tom lane

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 16:18:30
Message-ID:	AANLkTilaha4kDiLHoMNXAPh3wixCaJrK10OxY8Hi5qlP@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 5, 2010 at 9:58 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Tom Lane wrote:
>> Comments?
>
> There's currently three ways to set max_standby_delay:
>
> max_standby_delay = -1 # Query wins
> max_standby_delay = 0 # Recovery wins
> max_standby_delay > X # Query wins until lag > X.
>
> As Tom points out, the 3rd option has all sorts of problems. I very much
> like the behavior that max_standby_delay tries to accomplish, but I have
> to agree that it's not very reliable as it is. I don't like Tom's
> proposal either; the standby can fall behind indefinitely, and queries
> get a varying grace period.
>
> Let's rip out the concept of a delay altogether, and make it a boolean.
> If you really want your query to finish, set it to -1 (using the current
> max_standby_delay nomenclature). If recovery is important to you, set it
> to 0.

Does my proposal (upthread) to limit this by quantity of WAL rather
than time have any legs, or is that impractical and/or otherwise poor?

...Robert

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 16:30:47
Message-ID:	4BE19D37.4040305@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> Does my proposal (upthread) to limit this by quantity of WAL rather
> than time have any legs, or is that impractical and/or otherwise poor?

That would certainly be easier to implement sanely than a time-based
quantity. One problem is that we don't know how much unapplied WAL there
is, when you're not using streaming replication. And I'm not sure how
useful that is to users - it's very hard to estimate what to set it to.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 18:33:33
Message-ID:	AANLkTilYFXPV49pMsMNetPXetTP4COx6g96UfnBWrLib@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 5, 2010 at 12:30 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Robert Haas wrote:
>> Does my proposal (upthread) to limit this by quantity of WAL rather
>> than time have any legs, or is that impractical and/or otherwise poor?
>
> That would certainly be easier to implement sanely than a time-based
> quantity. One problem is that we don't know how much unapplied WAL there
> is, when you're not using streaming replication.

Hmm, that's a problem, likely fatally so.

> And I'm not sure how
> useful that is to users - it's very hard to estimate what to set it to.

I'm not sure whether that's really an issue or not. I mean, if we say
that the standby is allowed to be, say, 16MB behind the master, we
know that recovery time is bounded by how long it takes to replay
16MB. Which is in some ways more defined than saying we're behind the
primary by 30 min, which could take a long time to replay or not much
at all. But, I guess it's moot.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 20:49:25
Message-ID:	1273092565.4535.4767.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-05 at 16:46 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > The attached patch redefines "standby delay" to be the amount of time
> > elapsed from point of receipt to point of application. The "point of
> > receipt" is reset every chunk of data when streaming, or every file when
> > reading file by file. In all cases this new time is later than the
> > latest log time we would have used previously.
>
> This seems completely wrong to me. If the WAL receiver keeps receiving
> stuff, (last receive timestamp) - (current timestamp) would never be
> more than a few seconds. Regardless of how much applying the WAL has
> fallen behind.

I see your point.

> To accomplish what you're trying to accomplish, you would need to label
> each received WAL record with the timestamp when it was received, and
> compare the reception timestamp of the record you're applying against
> current timestamp.

Yes, OK. Obviously doing it for every record would be silly, so sampling
WAL records is the only way. If we save the timestamp every 16MB of WAL
that would work for both file and streaming.

Of course, if WAL was written regularly none of this would be a problem.
Why not have WALSender write a new WAL record with a timestamp every X
seconds?

--
Simon Riggs www.2ndQuadrant.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 20:51:12
Message-ID:	1273092672.4535.4773.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-05 at 06:23 -0400, Robert Haas wrote:
> On Wed, May 5, 2010 at 3:16 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > Expect at least 3 commits from me over next few days.
>
> I think you need to rethink the way that you decide when it's time to
> commit things. There is certainly no consensus on any of the things
> you are proposing to commit, nor have they been adequately (or, uh, at
> all) reviewed. Saying that your proposal addresses all of Tom's
> objections doesn't make it so. I am planning to read that patch and
> offer an opinion on it, but I haven't done so yet and I imagine Tom
> will weigh in at some point as well. Racing to commit a pile of code
> that nobody else has tested is not going to improve anything.

Only you have spoken of a race to commit and I have not said I would
refuse to listen to you or others.

Reading your words, it would be easy to forget we are a team of people
whose aim is software development. It's not the OK Corral.

Yesterday you berated me for unstable software. Today you oppose my
promise to fix that. Why is it, we all wonder, is it that you oppose
everything I say and do? No doubt you will oppose other committers in
the way you oppose me...

--
Simon Riggs www.2ndQuadrant.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 20:52:26
Message-ID:	1273092746.4535.4776.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-05 at 16:58 +0300, Heikki Linnakangas wrote:
> Let's have a setting similar to
> statement_timeout, that specifies how long a statement is allowed to
> run until it becomes subject to killing if it conflicts with recovery
> (actually, it would have to be a per-transaction setting, at least in
> serializable mode). This would be similar to Tom's proposal, and it
> would have the same drawback that it would give no guarantee on how
> much the standby can fall behind. However, it would be easier to
> understand:
> a query gets to run for X seconds, and after that it will be killed if
> it gets in the way.

If you want this, I have no problem with you getting this (though new feature
alert sirens going off, presumably).

I only have a problem with the suggestion that this replaces the current
max_standby_delay. There is no good case for only a single option.

--
Simon Riggs www.2ndQuadrant.com

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc:	"Greg Smith" <greg(at)2ndquadrant(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, "Bruce Momjian" <bruce(at)momjian(dot)us>, <pgsql-hackers(at)postgresql(dot)org>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	On a somewhat disappointing correspondence (was: max_standby_delay considered harmful)
Date:	2010-05-05 21:45:14
Message-ID:	4BE1A09A0200002500031308@gw.wicourts.gov
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:

I've refrained from comment on max_standby_delay because I have
neither read the patch nor am likely to be an early adopter of HS;
however, as a potential eventual user I have to say that the
semantics for this GUC proposed by Simon seem sane and useful to me.

Certainly the documentation would need to be clear on the pitfalls
of using something other than 0 or -1, and there were technical
issues raised on the thread outside the scope of the semantics of
the GUC, but the issues around clock sync and transfer time ring of
FUD. We sync our central router to a bank of atomic clocks around
the world, and sync every server to the router -- if a server drifts
we would have much bigger problems than this GUC would pose, so we
monitor that and make loud noises should something drift.

Are there other controls that would be useful? Undoubtedly. Should
they be added to 9.0? I'm not in a position to say. I don't see
the point of ripping out one potentially useful control, which
*might* be sufficient for 9.0 because someone might choose to use it
inappropriately. Just make sure it's documented well enough.

> Yesterday you berated me for unstable software. Today you oppose
> my promise to fix that. Why is it, we all wonder, is it that you
> oppose everything I say and do?

Robert strikes me as a top-notch project manager, and his comments
struck me as totally in line with someone wearing that hat.

-Kevin

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 22:29:33
Message-ID:	4BE1F14D.4020202@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Heikki, all,

> There's currently three ways to set max_standby_delay:
>
> max_standby_delay = -1 # Query wins
> max_standby_delay = 0 # Recovery wins
> max_standby_delay > X # Query wins until lag > X.
>
> As Tom points out, the 3rd option has all sorts of problems. I very much
> like the behavior that max_standby_delay tries to accomplish, but I have
> to agree that it's not very reliable as it is.

Wow, thanks for the summary. Based on that, I take back what I said to
Greg.

Because I think getting 9.0 out *on time* is more important than any of
these issues, I'm revising my opinion to be more in line with Greg
Smith. So, proposed path forwards.

(1) We work on getting the specific bugs Tom reported fixed.
(2) max_standby_delay default is 0
(3) documentation covers setting it to an integer, but warns extensively
about the required sysadminning and query cancel. As in "for advanced
users only".
(4) discussion of other synch methods gets shifted to 9.0

Ultimately, I think we'll be going to something lock-based like what Tom
suggested. However, I don't think that's doable without delaying 9.0
for 6 months, and I think that would be much worse than any current bug
with 9.0.

No matter how much we tinker with HS/SR, it's not going to be
bulletproof until 9.1. Or, more likely, 9.2.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-05 23:18:56
Message-ID:	201005052318.o45NIu018139@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Heikki Linnakangas wrote:
> Tom Lane wrote:
> > Comments?
>
> There's currently three ways to set max_standby_delay:
>
> max_standby_delay = -1 # Query wins
> max_standby_delay = 0 # Recovery wins
> max_standby_delay > X # Query wins until lag > X.
>
> As Tom points out, the 3rd option has all sorts of problems. I very much
> like the behavior that max_standby_delay tries to accomplish, but I have
> to agree that it's not very reliable as it is. I don't like Tom's
> proposal either; the standby can fall behind indefinitely, and queries
> get a varying grace period.
>
> Let's rip out the concept of a delay altogether, and make it a boolean.
> If you really want your query to finish, set it to -1 (using the current
> max_standby_delay nomenclature). If recovery is important to you, set it
> to 0.
>
> If you have the monitoring in place to sensibly monitor the delay
> between primary and standby, and you want a limit on that, you can put
> together a script to flip the switch in postgresql.conf if the standby
> falls too much behind.
>
> It would be nice to make that settable per-session, BTW. Though as soon
> as you have one session using -1, the standby could fall behind. Still,
> it might be useful if you run both kinds of queries on the same standby.

+1 for a boolean

We are not supposed to be designing the behavior during beta, which is
exactly what we are doing, and I don't think we even know what behavior
we want, let alone have we implemented it. I think a boolean is very
clear and it gives you the chance to optimize _one_ case, which is
enough for 9.0. Let's revisit this for 9.1 when we will know a lot more
than we do now.

Once 9.1 reports slave snapshots back to the master, we might not need
anything more than a boolean here anyway.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: On a somewhat disappointing correspondence (was: max_standby_delay considered harmful)
Date:	2010-05-06 00:04:19
Message-ID:	201005060004.o4604JF25539@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kevin Grittner wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
>
> I've refrained from comment on max_standby_delay because I have
> neither read the patch nor am likely to be an early adopter of HS;
> however, as a potential eventual user I have to say that the
> semantics for this GUC proposed by Simon seem sane and useful to me.
>
> Certainly the documentation would need to be clear on the pitfalls
> of using something other than 0 or -1, and there were technical
> issues raised on the thread outside the scope of the semantics of
> the GUC, but the issues around clock sync and transfer time ring of
> FUD. We sync our central router to a bank of atomic clocks around
> the world, and sync every server to the router -- if a server drifts
> we would have much bigger problems than this GUC would pose, so we
> monitor that and make loud noises should something drift.
>
> Are there other controls that would be useful? Undoubtedly. Should
> they be added to 9.0? I'm not in a position to say. I don't see
> the point of ripping out one potentially useful control, which
> *might* be sufficient for 9.0 because someone might choose to use it
> inappropriately. Just make sure it's documented well enough.

We are not very good at _removing_ functionality/GUCs, and based on the
discussion so far, I think there is a very slim chance we would get it
right for 9.0, which is why I suggested converting it to a boolean and
revisiting this for 9.1.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 00:23:17
Message-ID:	AANLkTimdSPiOLxum9ho3zIWj5fLc0K-Nkv-Tv7ATIuo3@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 5, 2010 at 7:18 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Heikki Linnakangas wrote:
>> Tom Lane wrote:
>> > Comments?
>>
>> There's currently three ways to set max_standby_delay:
>>
>> max_standby_delay = -1 # Query wins
>> max_standby_delay = 0 # Recovery wins
>> max_standby_delay > X # Query wins until lag > X.
>>
>> As Tom points out, the 3rd option has all sorts of problems. I very much
>> like the behavior that max_standby_delay tries to accomplish, but I have
>> to agree that it's not very reliable as it is. I don't like Tom's
>> proposal either; the standby can fall behind indefinitely, and queries
>> get a varying grace period.
>>
>> Let's rip out the concept of a delay altogether, and make it a boolean.
>> If you really want your query to finish, set it to -1 (using the current
>> max_standby_delay nomenclature). If recovery is important to you, set it
>> to 0.
>>
>> If you have the monitoring in place to sensibly monitor the delay
>> between primary and standby, and you want a limit on that, you can put
>> together a script to flip the switch in postgresql.conf if the standby
>> falls too much behind.
>>
>> It would be nice to make that settable per-session, BTW. Though as soon
>> as you have one session using -1, the standby could fall behind. Still,
>> it might be useful if you run both kinds of queries on the same standby.
>
> +1 for a boolean
>
> We are not supposed to be designing the behavior during beta, which is
> exactly what we are doing, and I don't think we even know what behavior
> we want, let alone have we implemented it. I think a boolean is very
> clear and it gives you the chance to optimize _one_ case, which is
> enough for 9.0. Let's revisit this for 9.1 when we will know a lot more
> than we do now.

The existing behavior is probably not optimal, but I'm not seeing what
benefit we get out of neutering it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 00:29:52
Message-ID:	201005060029.o460Tqp02775@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> >> If you have the monitoring in place to sensibly monitor the delay
> >> between primary and standby, and you want a limit on that, you can put
> >> together a script to flip the switch in postgresql.conf if the standby
> >> falls too much behind.
> >>
> >> It would be nice to make that settable per-session, BTW. Though as soon
> >> as you have one session using -1, the standby could fall behind. Still,
> >> it might be useful if you run both kinds of queries on the same standby.
> >
> > +1 for a boolean
> >
> > We are not supposed to be designing the behavior during beta, which is
> > exactly what we are doing, and I don't think we even know what behavior
> > we want, let alone have we implemented it. ?I think a boolean is very
> > clear and it gives you the chance to optimize _one_ case, which is
> > enough for 9.0. ?Let's revisit this for 9.1 when we will know a lot more
> > than we do now.
>
> The existing behavior is probably not optimal, but I'm not seeing what
> benefit we get out of neutering it.

We get to design it right, or maybe not need it at all in 9.1.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 00:36:37
Message-ID:	1997.1273106197@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Robert Haas wrote:
>> The existing behavior is probably not optimal, but I'm not seeing what
>> benefit we get out of neutering it.

> We get to design it right, or maybe not need it at all in 9.1.

Yeah. The good thing about a boolean is that it covers the two
noncontroversial cases (no-wait and wait forever), and doesn't lock
us into supporting cases that we don't really know how to do well
yet.

regards, tom lane

From:	Greg Smith <greg(at)2ndquadrant(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 01:17:49
Message-ID:	4BE218BD.8090302@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Heikki Linnakangas wrote:
> Let's rip out the concept of a delay altogether, and make it a boolean.
> If you really want your query to finish, set it to -1 (using the current
> max_standby_delay nomenclature). If recovery is important to you, set it
> to 0.
>

So the only user options would be "allow long-running queries to block
WAL application forever" and "always cancel queries on conflict?" That
would be taking away the behavior I was going to suggest as the default
to many customers I work with. I expect a non-trivial subset of people
using this feature will set max_standby_delay to is some small number of
minutes, similarly to how archive_timeout is sized now. Enough time to
get reasonably sized queries executed, not so long as to allow something
that might try to run for hours on the standby to increase failover
catchup time very much.

The way the behavior works is admittedly limited, and certainly some
people are going to want to set it to either 0 or -1. But taking it
away altogether is going to cripple one category of potential Hot
Standby use in the field. Consider this for a second: do you really
think that Simon would have waded into this coding mess, or that I would
have spent as much energy as I have highlighting issues with its use, if
there wasn't demand for it? If it wouldn't hurt the usefulness of
PostgreSQL 9.0 significantly to cut it, I'd have suggested that myself
two months ago and saved everyone (especially myself) a lot of trouble.

> If you have the monitoring in place to sensibly monitor the delay
> between primary and standby, and you want a limit on that, you can put
> together a script to flip the switch in postgresql.conf if the standby
> falls too much behind.
>

There's a couple of things you should do in order for max_standby_delay
to working as well as it can. Watching clock sync and forcing periodic
activity are two of them that always come up. Those are both trivial to
script for, and something I wouldn't expect any admin to object to.

If you need a script that involves changing a server setting to do
something, that translates into "you can't do that" for a typical DBA.
The idea of a program regularly changing a server configuration setting
on a production system is one you just can't sell. That makes this idea
incredibly more difficult to use in the field than any of the
workarounds that cope with the known max_standby_delay issues.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 01:36:24
Message-ID:	3228.1273109784@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Greg Smith <greg(at)2ndquadrant(dot)com> writes:
> Heikki Linnakangas wrote:
>> Let's rip out the concept of a delay altogether, and make it a boolean.

> So the only user options would be "allow long-running queries to block
> WAL application forever" and "always cancel queries on conflict?

Got it in one.

Obviously, this is something that would be high priority to improve in
some fashion in 9.1. That doesn't mean that it's reasonable to drop in
a half-baked redesign now, nor to put in the amount of work that would
be required to have a really well-designed implementation, and most
certainly not to uncritically ship what we've got. We have a ton of
other work that has to be done to get 9.0 out the door, and this feature
is something that IMO we can live without for this release.

One reason I believe this isn't so critical as all that is that it only
matters for cases where the operation on the master took an exclusive
lock. In high-performance production scenarios that's something you try
hard to avoid anyway. When you succeed, the standby behavior is moot.
Even if you can't avoid exclusive locks entirely, you may be able to
confine them to maintenance windows where performance doesn't matter
so much ... and then that goes for the standby performance as well.

regards, tom lane

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 01:47:42
Message-ID:	h2t407d949e1005051847r7e14eddcw5cab89584ab6e43c@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, May 6, 2010 at 2:36 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> One reason I believe this isn't so critical as all that is that it only
> matters for cases where the operation on the master took an exclusive
> lock.

Uhm, or a vacuum ran. Or a HOT page cleanup occurred, or a btree page
split deleted old tuples.

--
greg

From:	Greg Smith <greg(at)2ndquadrant(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: On a somewhat disappointing correspondence
Date:	2010-05-06 02:03:17
Message-ID:	4BE22365.1040907@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Bruce Momjian wrote:
> We are not very good at _removing_ functionality/GUCs, and based on the
> discussion so far, I think there is a very slim chance we would get it
> right for 9.0, which is why I suggested converting it to a boolean and
> revisiting this for 9.1.
>

There's some feedback you can only get by exposing a complicated feature
to the users and seeing what they make of it. This one hasn't even had
a full week to gather beta user reports. Given that it's easy to
disable (just limiting the range on what is effectively a 3-way switch
to two positions), I don't understand why you're pushing at this point
for its removal. You could be encouraging testing instead, which I
believe is needed to know exactly what the right thing to do in 9.1 is.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us

From:	Greg Smith <greg(at)2ndquadrant(dot)com>
To:	Greg Stark <gsstark(at)mit(dot)edu>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 02:48:53
Message-ID:	4BE22E15.7070301@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Greg Stark wrote:
> On Thu, May 6, 2010 at 2:36 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> One reason I believe this isn't so critical as all that is that it only
>> matters for cases where the operation on the master took an exclusive
>> lock.
>>
>
> Uhm, or a vacuum ran. Or a HOT page cleanup occurred, or a btree page
> split deleted old tuples.
>

Right; because there are so many regularly expected causes for query
cancellation, the proposed boolean setup really hurts the ability of a
server whose primary goal is high-availability to run queries of any
useful duration. For years I've been hearing "my HA standby is idle,
how can I put it to use?"; that's the back story of the users I thought
everyone knew were the known audience waiting for this feature.

If the UI for vacuum_defer_cleanup_age that prevented these things was
good, I would agree that the cases where max_standby_delay does
something useful are marginal. That's why I tried to get someone
working on SR to provide a hook for that purpose months ago. But since
the vacuum adjustment we have in completely obtuse xid units, that
leaves max_standby_delay as the only tunable here that you can even
think about in terms of human time.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 03:44:55
Message-ID:	AANLkTin3eFjmOu3nNyEaNuiIGzH0bP5EHp10EJCa2b5E@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 5, 2010 at 9:36 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Greg Smith <greg(at)2ndquadrant(dot)com> writes:
>> Heikki Linnakangas wrote:
>>> Let's rip out the concept of a delay altogether, and make it a boolean.
>
>> So the only user options would be "allow long-running queries to block
>> WAL application forever" and "always cancel queries on conflict?
>
> Got it in one.
>
> Obviously, this is something that would be high priority to improve in
> some fashion in 9.1. That doesn't mean that it's reasonable to drop in
> a half-baked redesign now, nor to put in the amount of work that would
> be required to have a really well-designed implementation, and most
> certainly not to uncritically ship what we've got.

If you had a genuinely better idea for how this should work, I would
be the first to endorse it, but it's becoming clear that you don't,
which makes me also skeptical of your contention that we will be
better off with no knob at all. I find that position not very
plausible. Nor do I really see how this is backing us into any kind
of a corner. If we're really concerned that we're going to suddenly
come up with a much better method of controlling this behavior (and so
far nobody seems close to having such a brilliant insight), then let's
just put a note in the documentation saying that the setting has
problems X, Y, and Z and that if we develop a better method for
controlling this behavior, the GUC may be modified or removed in a
future release. Ripping it out seems like a drastic overreaction,
particularly considering that we're already in beta.

This feature has been in the tree since December 19th when the initial
Hot Standby patch was committed, and the last significant code change
was on February 13th. It is now May 5th. The fact that you didn't
read the patch sooner is not a reason why we should rip it out now.
Yes, the current implementation is a little crufty and has some
limitations. See also work_mem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 03:50:09
Message-ID:	201005060350.o463o9A05797@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Greg Smith wrote:
> Heikki Linnakangas wrote:
> > Let's rip out the concept of a delay altogether, and make it a boolean.
> > If you really want your query to finish, set it to -1 (using the current
> > max_standby_delay nomenclature). If recovery is important to you, set it
> > to 0.
> >
>
> So the only user options would be "allow long-running queries to block
> WAL application forever" and "always cancel queries on conflict?" That
> would be taking away the behavior I was going to suggest as the default
> to many customers I work with. I expect a non-trivial subset of people
> using this feature will set max_standby_delay to is some small number of
> minutes, similarly to how archive_timeout is sized now. Enough time to
> get reasonably sized queries executed, not so long as to allow something
> that might try to run for hours on the standby to increase failover
> catchup time very much.
>
> The way the behavior works is admittedly limited, and certainly some
> people are going to want to set it to either 0 or -1. But taking it
> away altogether is going to cripple one category of potential Hot
> Standby use in the field. Consider this for a second: do you really
> think that Simon would have waded into this coding mess, or that I would
> have spent as much energy as I have highlighting issues with its use, if
> there wasn't demand for it? If it wouldn't hurt the usefulness of
> PostgreSQL 9.0 significantly to cut it, I'd have suggested that myself
> two months ago and saved everyone (especially myself) a lot of trouble.

We are not designing in a green field here. We have released beta1 and
we are trying to get to 9.0 final in a few months. If this feature
could have been designed easily months ago, it would have been done, but
it doesn't seem to have any easy solution, and we have run out of time
to fix it. As painful as it is, we need to cut our loses and move on.

We have already cut features like sync replication and communicating the
slave snapshot to the master; I don't see how removing this ability is
any worse. We don't have time to develop this for every use case, even
if those use cases are significant.

If someone wants to suggest that HS is useless if max_standby_delay
supports only boolean values, I am ready to suggest we remove HS as well
and head to 9.0 because that would suggest that HS itself is going to be
useless.

The code will not be thrown away; we will bring it back for 9.1.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 03:52:41
Message-ID:	201005060352.o463qfe06037@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> If you had a genuinely better idea for how this should work, I would
> be the first to endorse it, but it's becoming clear that you don't,
> which makes me also skeptical of your contention that we will be
> better off with no knob at all. I find that position not very
> plausible. Nor do I really see how this is backing us into any kind
> of a corner. If we're really concerned that we're going to suddenly
> come up with a much better method of controlling this behavior (and so
> far nobody seems close to having such a brilliant insight), then let's
> just put a note in the documentation saying that the setting has
> problems X, Y, and Z and that if we develop a better method for
> controlling this behavior, the GUC may be modified or removed in a
> future release. Ripping it out seems like a drastic overreaction,
> particularly considering that we're already in beta.
>
> This feature has been in the tree since December 19th when the initial
> Hot Standby patch was committed, and the last significant code change
> was on February 13th. It is now May 5th. The fact that you didn't
> read the patch sooner is not a reason why we should rip it out now.
> Yes, the current implementation is a little crufty and has some
> limitations. See also work_mem.

I am afraid the current setting is tempting for users to enable, but
will be so unpredictable that it will tarnish the repuation of HS and
Postgres. We don't want to be thinking in 9 months, "Wow, we shouldn't
have shipped that features. It is causing all kinds of problems." We
have done that before (rarely), and it isn't a good feeling.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: On a somewhat disappointing correspondence
Date:	2010-05-06 03:55:06
Message-ID:	201005060355.o463t6U06314@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Greg Smith wrote:
> Bruce Momjian wrote:
> > We are not very good at _removing_ functionality/GUCs, and based on the
> > discussion so far, I think there is a very slim chance we would get it
> > right for 9.0, which is why I suggested converting it to a boolean and
> > revisiting this for 9.1.
> >
>
> There's some feedback you can only get by exposing a complicated feature
> to the users and seeing what they make of it. This one hasn't even had
> a full week to gather beta user reports. Given that it's easy to
> disable (just limiting the range on what is effectively a 3-way switch
> to two positions), I don't understand why you're pushing at this point
> for its removal. You could be encouraging testing instead, which I
> believe is needed to know exactly what the right thing to do in 9.1 is.

Our developers can't even figure out how it behaves; it is hard to see
how beta users will ever figure it out.

Now is the time to remove stuff, not late in beta. Pushing decisions
into the future is how betas drag.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 04:25:41
Message-ID:	1273119941.4535.4868.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-05 at 17:56 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > I am mostly unavailable for next few days. (Repairing bikeshed.)
>
> Hey, you're supposed to do the bikeshedding on-list! ;-)

That was just a joke, I'm mostly unavailable for other reasons.

--
Simon Riggs www.2ndQuadrant.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 04:32:43
Message-ID:	AANLkTil87FpZ7LwY-GgqzZbHbtgVipVieo5Fa9-Af4g8@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 5, 2010 at 11:50 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> If someone wants to suggest that HS is useless if max_standby_delay
> supports only boolean values, I am ready to suggest we remove HS as well
> and head to 9.0 because that would suggest that HS itself is going to be
> useless.

I think HS is going to be a lot less useful than many people think, at
least in 9.0. But I think ripping out max_standby_delay will make it
worse.

> The code will not be thrown away; we will bring it back for 9.1.

If that's the case, then taking it out makes no sense.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 04:47:02
Message-ID:	AANLkTimZAQLr88XGOgVQHduze-Z2ZFzLnGLmEaPg72PP@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 5, 2010 at 11:52 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> I am afraid the current setting is tempting for users to enable, but
> will be so unpredictable that it will tarnish the repuation of HS and
> Postgres. We don't want to be thinking in 9 months, "Wow, we shouldn't
> have shipped that features. It is causing all kinds of problems." We
> have done that before (rarely), and it isn't a good feeling.

I am not convinced it will be unpredictable. The only caveats that
I've seen so far are:

- You need to run ntpd.
- Queries will get cancelled like crazy if you're not using steaming
replication.

That just doesn't sound that bad to me, especially since the proposed
alternative is:

- Queries will get cancelled like crazy, period.

Or else:

- Replication can fall infinitely far behind and you can write a
tedious and error-prone script to try to prevent it if you like.

I think THAT is going to tarnish our reputation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 05:35:49
Message-ID:	4BE25535.6050704@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> On Wed, May 5, 2010 at 11:52 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> I am afraid the current setting is tempting for users to enable, but
>> will be so unpredictable that it will tarnish the repuation of HS and
>> Postgres. We don't want to be thinking in 9 months, "Wow, we shouldn't
>> have shipped that features. It is causing all kinds of problems." We
>> have done that before (rarely), and it isn't a good feeling.
>
> I am not convinced it will be unpredictable. The only caveats that
> I've seen so far are:
>
> - You need to run ntpd.
> - Queries will get cancelled like crazy if you're not using steaming
> replication.

And also in situations where the master is idle for a while and then
starts doing stuff. That's the most significant source of confusion,
IMHO, I wouldn't mind the requirement of ntpd so much.

> That just doesn't sound that bad to me, especially since the proposed
> alternative is:
>
> - Queries will get cancelled like crazy, period.
>
> Or else:
>
> - Replication can fall infinitely far behind and you can write a
> tedious and error-prone script to try to prevent it if you like.
>
> I think THAT is going to tarnish our reputation.

The difference is that that's easy to document and understand, so the
behavior won't be a surprise to anyone.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 05:38:16
Message-ID:	4BE255C8.3020002@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> On Wed, May 5, 2010 at 11:50 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> The code will not be thrown away; we will bring it back for 9.1.
>
> If that's the case, then taking it out makes no sense.

I doubt we're going to bring back the same code, because it still has
the same issues. But we will do something better thought-out. Or people
are happy with the boolean and no-one cares anymore, that's pretty
likely too.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Rob Wultsch <wultsch(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 06:15:53
Message-ID:	h2i2c5ef4e31005052315ga27b6ba0l758d2ee8692ce452@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 5, 2010 at 9:32 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, May 5, 2010 at 11:50 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> If someone wants to suggest that HS is useless if max_standby_delay
>> supports only boolean values, I am ready to suggest we remove HS as well
>> and head to 9.0 because that would suggest that HS itself is going to be
>> useless.
>
> I think HS is going to be a lot less useful than many people think, at
> least in 9.0. But I think ripping out max_standby_delay will make it
> worse.
>
>> The code will not be thrown away; we will bring it back for 9.1.
>
> If that's the case, then taking it out makes no sense.
>
<mysql dba troll>
I manage a bunch of different environments and I am pretty sure that
in any of them if the db started seemingly randomly killing queries I
would have application teams followed quickly by executives coming
after me with torches and pitchforks.

I can not imagine setting this value to anything other than a bool and
most of the time that bool would be -1. I would only be unleashing a
kill storm in utter desperation and I would probably need to explain
myself in detail after. Utter desperation means I am sure I am going
to have to do a impactful failover at any moment and need a slave
completely up to date NOW.

It is good to have the option to automatically cancel queries, but I
think it is a mistake to assume many people will use it.

What I would really need for instrumentation is the ability to
determine *easily* how much a slave is lagging in clock time.
</mysql dba troll>

--
Rob Wultsch
wultsch(at)gmail(dot)com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 07:34:29
Message-ID:	1273131269.12659.8.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, 2010-05-06 at 00:47 -0400, Robert Haas wrote:

Yes, that will.

There is no consensus to remove max_standby_delay.

It could be improved with minor adjustments and it makes more sense to
allow a few of those, treating them as bugs.

--
Simon Riggs www.2ndQuadrant.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Rob Wultsch <wultsch(at)gmail(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 08:12:51
Message-ID:	1273133571.12659.41.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-05 at 23:15 -0700, Rob Wultsch wrote:

> I manage a bunch of different environments and I am pretty sure that
> in any of them if the db started seemingly randomly killing queries I
> would have application teams followed quickly by executives coming
> after me with torches and pitchforks.

Fully understood and well argued, thanks for your input.

HS doesn't randomly kill queries and there are documented work-arounds
to control this behaviour.

Removing the parameter won't help the situation at all, it will make the
situation *worse* by removing control from where it's clearly needed and
removing all hope of making the HS feature work in practice. There is no
consensus to remove the parameter.

--
Simon Riggs www.2ndQuadrant.com

From:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 09:26:30
Message-ID:	87ljbxjs8p.fsf@hi-media-techno.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Greg Smith <greg(at)2ndquadrant(dot)com> writes:
> If you need a script that involves changing a server setting to do
> something, that translates into "you can't do that" for a typical DBA. The
> idea of a program regularly changing a server configuration setting on a
> production system is one you just can't sell. That makes this idea
> incredibly more difficult to use in the field than any of the workarounds
> that cope with the known max_standby_delay issues.

I still think that the best API we can do in a timely fashion for 9.0
is:

standby_conflict_winner = replay|queries

pg_pause_recovery() / pg_resume_recovery()

It seems to me those two functions are only exposing existing facilities
in the code, so that's more an API change that a new feature
inclusion. Of course I'm certainly wrong. But the code has already been
written.

I don't think we'll find any better to offer our users in the right time
frame. Now I'll try to step back and stop repeating myself in the void :)

Regards,
--
dim

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc:	Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 09:36:20
Message-ID:	BA9763FD-F168-40AA-9F07-7B0586A092ED@phlo.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On May 6, 2010, at 11:26 , Dimitri Fontaine wrote:
> Greg Smith <greg(at)2ndquadrant(dot)com> writes:
>> If you need a script that involves changing a server setting to do
>> something, that translates into "you can't do that" for a typical DBA. The
>> idea of a program regularly changing a server configuration setting on a
>> production system is one you just can't sell. That makes this idea
>> incredibly more difficult to use in the field than any of the workarounds
>> that cope with the known max_standby_delay issues.
>
> I still think that the best API we can do in a timely fashion for 9.0
> is:
>
> standby_conflict_winner = replay|queries
>
> pg_pause_recovery() / pg_resume_recovery()
>
> It seems to me those two functions are only exposing existing facilities
> in the code, so that's more an API change that a new feature
> inclusion. Of course I'm certainly wrong. But the code has already been
> written.

If there was an additional SQL-callable function that returned the backends the recovery process is currently waiting for, plus one that reported that last timestamp seen in the WAL, than all those different cancellation policies could be implemented as daemons that monitor recovery and kill backends as needed, no?

That would allow people to experiment with different cancellation policies, and maybe shed some light on what the useful policies are in practice.

best regards,
Florian Pflug

From:	Yeb Havinga <yebhavinga(at)gmail(dot)com>
To:	Rob Wultsch <wultsch(at)gmail(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 10:08:46
Message-ID:	4BE2952E.3060302@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Rob Wultsch wrote:
> I manage a bunch of different environments and I am pretty sure that
> in any of them if the db started seemingly randomly killing queries I
> would have application teams followed quickly by executives coming
> after me with torches and pitchforks.
>
> I can not imagine setting this value to anything other than a bool and
> most of the time that bool would be -1. I would only be unleashing a
> kill storm in utter desperation and I would probably need to explain
> myself in detail after. Utter desperation means I am sure I am going
> to have to do a impactful failover at any moment and need a slave
> completely up to date NOW.
>
That's funny because when I was reading this thread, I was thinking the
exact opposite: having max_standby_delay always set to 0 so I know the
standby server is as up-to-date as possible. The application that
accesses the hot standby has to be 'special' anyway because it might
deliver not-up-to-date data. If that information about specialties
regarding querying the standby server includes the warning that queries
might get cancelled, they can opt for a retry themselves (is there a
special return code to catch that case? like PGRES_RETRY_LATER) or a
message to the user that their report is currently unavailable and they
should retry in a few minutes.

regards,
Yeb Havinga

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 10:18:22
Message-ID:	AANLkTimo7Qw_1OxDGDo0GWEF0ECP8_Srk1pH0xBzN-dt@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, May 6, 2010 at 1:35 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Robert Haas wrote:
>> On Wed, May 5, 2010 at 11:52 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>>> I am afraid the current setting is tempting for users to enable, but
>>> will be so unpredictable that it will tarnish the repuation of HS and
>>> Postgres. We don't want to be thinking in 9 months, "Wow, we shouldn't
>>> have shipped that features. It is causing all kinds of problems." We
>>> have done that before (rarely), and it isn't a good feeling.
>>
>> I am not convinced it will be unpredictable. The only caveats that
>> I've seen so far are:
>>
>> - You need to run ntpd.
>> - Queries will get cancelled like crazy if you're not using steaming
>> replication.
>
> And also in situations where the master is idle for a while and then
> starts doing stuff. That's the most significant source of confusion,
> IMHO, I wouldn't mind the requirement of ntpd so much.

Oh. Ouch. OK, sorry, I missed that part. Wow, that's awful. OK, I
agree: we can't ship that as-is.

/me feels embarrassed for completely failing to understand the root of
the issue until 84 emails into the thread.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	pgsql-hackers(at)postgresql(dot)org
Cc:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 10:23:21
Message-ID:	201005061223.21867.andres@anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On Thursday 06 May 2010 07:35:49 Heikki Linnakangas wrote:
> Robert Haas wrote:
> > On Wed, May 5, 2010 at 11:52 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >> I am afraid the current setting is tempting for users to enable, but
> >> will be so unpredictable that it will tarnish the repuation of HS and
> >> Postgres. We don't want to be thinking in 9 months, "Wow, we shouldn't
> >> have shipped that features. It is causing all kinds of problems." We
> >> have done that before (rarely), and it isn't a good feeling.
> >
> > I am not convinced it will be unpredictable. The only caveats that
> > I've seen so far are:
> >
> > - You need to run ntpd.
> > - Queries will get cancelled like crazy if you're not using steaming
> > replication.
>
> And also in situations where the master is idle for a while and then
> starts doing stuff. That's the most significant source of confusion,
> IMHO, I wouldn't mind the requirement of ntpd so much.
Personally I would much rather like to keep that configurability and manually
generate a record a second. Or possibly do something akin to
archive_timeout...

That may be not as important once there are less sources of conflict
resolutions - but thats something *definitely* not going to happen for 9.0...

Andres

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Florian Pflug <fgp(at)phlo(dot)org>
Cc:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 10:48:35
Message-ID:	1273142915.12659.178.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, 2010-05-06 at 11:36 +0200, Florian Pflug wrote:

> If there was an additional SQL-callable function that returned the backends the recovery process is currently waiting for, plus one that reported that last timestamp seen in the WAL, than all those different cancellation policies could be implemented as daemons that monitor recovery and kill backends as needed, no?
>
> That would allow people to experiment with different cancellation policies, and maybe shed some light on what the useful policies are in practice.

It would be easier to implement a conflict resolution plugin that is
called when a conflict occurs, allowing users to have a customisable
mechanism. Again, I have no objection to that proposal.

--
Simon Riggs www.2ndQuadrant.com

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 11:46:51
Message-ID:	3D5543AA-B143-44DE-B1CD-22C8DE83008A@phlo.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On May 6, 2010, at 12:48 , Simon Riggs wrote:
> On Thu, 2010-05-06 at 11:36 +0200, Florian Pflug wrote:
>> If there was an additional SQL-callable function that returned the backends the recovery process is currently waiting for, plus one that reported that last timestamp seen in the WAL, than all those different cancellation policies could be implemented as daemons that monitor recovery and kill backends as needed, no?
>>
>> That would allow people to experiment with different cancellation policies, and maybe shed some light on what the useful policies are in practice.
>
> It would be easier to implement a conflict resolution plugin that is
> called when a conflict occurs, allowing users to have a customisable
> mechanism. Again, I have no objection to that proposal.

True, providing a plugin API would be even better, since no SQL callable API would have to be devised, and possible algorithms wouldn't be constrained by such an API's limitations.

The existing max_standby_delay logic could be moved to such a plugin, living in contrib. Since it was already established (I believe) that the existing max_standby_delay logic is sufficiently fragile to require significant knowledge on the user's side about potential pitfalls, asking those users to install the plugin from contrib shouldn't be too much to ask for.

This way, users who really need something more sophisticated than recovery wins always or standby wins always are given the tools they need *if* they're willing to put in the extra effort. For those who don't, offering max_standby_delay probably does more harm than good anyway, so nothing is lost by not offering it in the first place.

best regards,
Florian Pflug

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Florian Pflug <fgp(at)phlo(dot)org>
Cc:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 12:13:54
Message-ID:	1273148034.12659.267.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, 2010-05-06 at 13:46 +0200, Florian Pflug wrote:
> On May 6, 2010, at 12:48 , Simon Riggs wrote:
> > On Thu, 2010-05-06 at 11:36 +0200, Florian Pflug wrote:
> >> If there was an additional SQL-callable function that returned the backends the recovery process is currently waiting for, plus one that reported that last timestamp seen in the WAL, than all those different cancellation policies could be implemented as daemons that monitor recovery and kill backends as needed, no?
> >>
> >> That would allow people to experiment with different cancellation policies, and maybe shed some light on what the useful policies are in practice.
> >
> > It would be easier to implement a conflict resolution plugin that is
> > called when a conflict occurs, allowing users to have a customisable
> > mechanism. Again, I have no objection to that proposal.
>
> True, providing a plugin API would be even better, since no SQL callable API would have to be devised, and possible algorithms wouldn't be constrained by such an API's limitations.
>
> The existing max_standby_delay logic could be moved to such a plugin, living in contrib. Since it was already established (I believe) that the existing max_standby_delay logic is sufficiently fragile to require significant knowledge on the user's side about potential pitfalls, asking those users to install the plugin from contrib shouldn't be too much to ask for.
>
> This way, users who really need something more sophisticated than recovery wins always or standby wins always are given the tools they need *if* they're willing to put in the extra effort. For those who don't, offering max_standby_delay probably does more harm than good anyway, so nothing is lost by not offering it in the first place.

No problem from me with that approach.

As long as 9.0 ships with the current capability to enforce
max_standby_delay, I have no problem.

--
Simon Riggs www.2ndQuadrant.com

From:	Greg Smith <greg(at)2ndquadrant(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 12:35:14
Message-ID:	4BE2B782.2010805@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Heikki Linnakangas wrote:
> Robert Haas wrote:
>
>> I am not convinced it will be unpredictable. The only caveats that
>> I've seen so far are:
>>
>> - You need to run ntpd.
>> - Queries will get cancelled like crazy if you're not using steaming
>> replication.
>>
>
> And also in situations where the master is idle for a while and then
> starts doing stuff. That's the most significant source of confusion,
> IMHO, I wouldn't mind the requirement of ntpd so much.
>

I consider it mandatory to include an documentation update here that
says "if you set max_standby_delay > 0, and do not run something that
regularly generates activity to the master like [example], you will get
unnecessary query cancellation on the standby". As well as something
like what Josh was suggesting, adding warnings that this is "for
advanced users only", to borrow his wording. This is why my name has
been on the open items list for a while now--to make sure I follow
through on that.

I haven't written it yet because there were still changes to the
underlying code being made up until moments before beta started, then
this discussion started without a break between. There are a clear set
of user land things that can be done to make up the deficiencies in the
state of the server code, but we won't even get to see how they work out
in the field (feedback needed to improve the 9.1 design) if this
capability goes away altogether.

Is it not clear that there are some people who consider the occasional
bit of cancellation OK, because they can correct for at the application
layer and they're willing to factor it in to their design if it allows
using the otherwise idle HA standby? I'm fine with expanding that
section of the documentation too, to make it more obvious that's the
only situation this aspect of HS is aimed at and suitable for.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us

From:	Greg Smith <greg(at)2ndquadrant(dot)com>
To:	Yeb Havinga <yebhavinga(at)gmail(dot)com>
Cc:	Rob Wultsch <wultsch(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 12:42:52
Message-ID:	4BE2B94C.5080603@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Yeb Havinga wrote:
> Rob Wultsch wrote:
>> I can not imagine setting this value to anything other than a bool and
>> most of the time that bool would be -1.
> That's funny because when I was reading this thread, I was thinking
> the exact opposite: having max_standby_delay always set to 0 so I know
> the standby server is as up-to-date as possible.

If you ask one person about this, you'll discover they only consider one
behavior here sane, and any other setting is crazy. Ask five people,
and you'll likely find someone who believes the complete opposite. Ask
ten and carefully work out the trade-offs they're willing to make given
the fundamental limitations of replication, and you'll arrive at the
range of behaviors available right now, plus some more that haven't been
built yet. There are a lot of different types of database applications
out there, each with their own reliability and speed requirements to
balance.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Yeb Havinga <yebhavinga(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	SQLSTATE for Hot Standby cancellation
Date:	2010-05-06 13:56:38
Message-ID:	1273154198.12659.419.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, 2010-05-06 at 12:08 +0200, Yeb Havinga wrote:

> That's funny because when I was reading this thread, I was thinking the
> exact opposite: having max_standby_delay always set to 0 so I know the
> standby server is as up-to-date as possible. The application that
> accesses the hot standby has to be 'special' anyway because it might
> deliver not-up-to-date data. If that information about specialties
> regarding querying the standby server includes the warning that queries
> might get cancelled, they can opt for a retry themselves (is there a
> special return code to catch that case? like PGRES_RETRY_LATER) or a
> message to the user that their report is currently unavailable and they
> should retry in a few minutes.

Very sensible. Kevin Grittner already asked for that and I alread
agreed, yet it has not been implemented that way
http://archives.postgresql.org/pgsql-hackers/2008-12/msg01691.php

Can anyone remember a specific objection to that 'cos it still sounds
very sensible to me and is a quick, low impact change.

Currently the SQLSTATE is ERRCODE_ADMIN_SHUTDOWN or
ERRCODE_QUERY_CANCELED if not idle. That makes it even harder to
determine the retryability, since it can come back in one of two states.

Proposed SQLSTATE for HS cancellation is
case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN:
case PROCSIG_RECOVERY_CONFLICT_LOCK:
case PROCSIG_RECOVERY_CONFLICT_SNAPSHOT:
case PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK:
recovery_conflict_errcode = ERRCODE_T_R_SERIALIZATION_FAILURE;
break;
case PROCSIG_RECOVERY_CONFLICT_DATABASE:
case PROCSIG_RECOVERY_CONFLICT_TABLESPACE:
recovery_conflict_errcode = ERRCODE_ADMIN_SHUTDOWN;
break;

We don't have a retryable SQLSTATE, so perhaps we should document that
serialization_failure and deadlock_detected are both retryable error
conditions. As the above implies HS can throw some errors that are
non-retyable, so we use ERRCODE_ADMIN_SHUTDOWN.

Comments?

--
Simon Riggs www.2ndQuadrant.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Yeb Havinga <yebhavinga(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Andres Freund <andres(at)anarazel(dot)de>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	SQLSTATE for Hot Standby cancellation
Date:	2010-05-06 13:58:34
Message-ID:	1273154314.12659.422.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, 2010-05-06 at 12:08 +0200, Yeb Havinga wrote:

Very sensible. Kevin Grittner already asked for that and I alread
agreed, yet it has not been implemented that way
http://archives.postgresql.org/pgsql-hackers/2008-12/msg01691.php

Can anyone remember a specific objection to that 'cos it still sounds
very sensible to me and is a quick, low impact change.

Currently the SQLSTATE is ERRCODE_ADMIN_SHUTDOWN or
ERRCODE_QUERY_CANCELED if not idle. That makes it even harder to
determine the retryability, since it can come back in one of two states.

Comments?

--
Simon Riggs www.2ndQuadrant.com

From:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Florian Pflug <fgp(at)phlo(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 14:09:57
Message-ID:	m2tyql86kq.fsf@hi-media.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> It would be easier to implement a conflict resolution plugin that is
> called when a conflict occurs, allowing users to have a customisable
> mechanism. Again, I have no objection to that proposal.

To implement, if you say so, no doubt. To use, that means you need to
install a contrib module after validation that the trade offs there are
the one you're interested into, or you have to code it yourself. In C.

I don't see that as an improvement over what we have now. Our main
problem seems to be the documentation of the max_standby_delay, where we
give the impression it's doing things the code can not do. IIUC.

Regards,
--
dim

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc:	Florian Pflug <fgp(at)phlo(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 14:24:54
Message-ID:	1273155894.12659.455.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, 2010-05-06 at 16:09 +0200, Dimitri Fontaine wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > It would be easier to implement a conflict resolution plugin that is
> > called when a conflict occurs, allowing users to have a customisable
> > mechanism. Again, I have no objection to that proposal.
>
> To implement, if you say so, no doubt. To use, that means you need to
> install a contrib module after validation that the trade offs there are
> the one you're interested into, or you have to code it yourself. In C.
>
> I don't see that as an improvement over what we have now. Our main
> problem seems to be the documentation of the max_standby_delay, where we
> give the impression it's doing things the code can not do. IIUC.

I meant "easier to implement than what Florian suggested".

The plugin would also allow you to have the pause/resume capability.

--
Simon Riggs www.2ndQuadrant.com

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 14:39:39
Message-ID:	4BE2D4AB.2010004@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs wrote:
> On Thu, 2010-05-06 at 16:09 +0200, Dimitri Fontaine wrote:
>> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>>> It would be easier to implement a conflict resolution plugin that is
>>> called when a conflict occurs, allowing users to have a customisable
>>> mechanism. Again, I have no objection to that proposal.
>> To implement, if you say so, no doubt. To use, that means you need to
>> install a contrib module after validation that the trade offs there are
>> the one you're interested into, or you have to code it yourself. In C.
>>
>> I don't see that as an improvement over what we have now. Our main
>> problem seems to be the documentation of the max_standby_delay, where we
>> give the impression it's doing things the code can not do. IIUC.
>
> I meant "easier to implement than what Florian suggested".
>
> The plugin would also allow you to have the pause/resume capability.

Not the same plugin. A hook for stop/resume would need to be called
before and/or after each record, the one for conflict resolution would
need to be called at each conflict. Designing a good interface for a
plugin is hard, you need at least a couple of samples ideas for plugins
that would use the hook, before you know the interface is flexible enough.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 14:47:08
Message-ID:	4BE2D66C.8090107@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Not the same plugin. A hook for stop/resume would need to be called
before and/or after each record, the one for conflict resolution would
need to be called at each conflict. Designing a good interface for a
plugin is hard, you need at least a couple of sample ideas for plugins
that would use the hook, before you know the interface is flexible enough.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 14:56:13
Message-ID:	1273157773.12659.497.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, 2010-05-06 at 17:39 +0300, Heikki Linnakangas wrote:

> Not the same plugin. A hook for stop/resume would need to be called
> before and/or after each record, the one for conflict resolution would
> need to be called at each conflict. Designing a good interface for a
> plugin is hard, you need at least a couple of samples ideas for plugins
> that would use the hook, before you know the interface is flexible enough.

* current behaviour of max_standby_delay
* pause for X time if conflict, then cancel - which is the suggested
behaviour upthread
* pause-if-conflict, explicit resume needed

Info passed to plugin
* conflict type
* relfilenode
* latestRemovedXid

--
Simon Riggs www.2ndQuadrant.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 15:38:18
Message-ID:	AANLkTiljAwtVO2Q0WBWyyWORq3Y0GJedKYUb3LmjXMuk@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Now that I've realized what the real problem is with max_standby_delay
(namely, that inactivity on the master can use up the delay), I think
we should do what Tom originally suggested here. It's not as good as
a really working max_standby_delay, but we're not going to have that
for 9.0, and it's clearly better than a boolean.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Andres Freund" <andres(at)anarazel(dot)de>, "Yeb Havinga" <yebhavinga(at)gmail(dot)com>
Cc:	"Greg Smith" <greg(at)2ndquadrant(dot)com>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Bruce Momjian" <bruce(at)momjian(dot)us>, <pgsql-hackers(at)postgresql(dot)org>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: SQLSTATE for Hot Standby cancellation
Date:	2010-05-06 17:40:25
Message-ID:	4BE2B8B9020000250003133A@gw.wicourts.gov
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:

> We don't have a retryable SQLSTATE, so perhaps we should document
> that serialization_failure and deadlock_detected are both
> retryable error conditions. As the above implies HS can throw some
> errors that are non-retyable, so we use ERRCODE_ADMIN_SHUTDOWN.

There is one SQLSTATE (40001 - serialization_failure) to indicate
that the transaction failed due to conflicts with concurrent
transactions. This is supposed to be used when there was no problem
with the transaction itself, were it to be run when nothing else is
running and the environment isn't failing. (For example, it isn't
the right code to use if you fail because your network connection is
broken or you're running out of disk space.) In my view it is a
mistake *not* to use it whenever concurrent transactions cause
failure; we can always use different error message text, details,
and hints to differentiate between the particular ways in which
concurrent transactions caused the failure.

The advantage of consistently using the standard value for this is
that there is software which recognizes this and automatically
retries the transaction from the beginning in a manner which is
transparent to the users and the application code. Our users never
see anything but a delay when hitting this; it's effectively the
same as blocking, with no application programmer effort needed.

Having a separate deadlock_detected SQLSTATE is arguably a bug. (In
fact, I have argued that; but haven't persuaded the community of
it.) Coming up with yet another SQLSTATE to indicate that there's
nothing wrong with the transaction except its timing in relation to
other transactions would, in my view, compound the error.

That said, our shop is resourceful enough (in more than one sense)
to overcome this if need be. I just don't see why the community
would choose to create barriers which cause failures for software
which would otherwise Just Work, and the corresponding programming
burden on shops using PostgreSQL to keep a list of "nothing really
wrong with the transaction, you can probably just reschedule it"
SQLSTATE values for the product.

-Kevin

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 18:47:10
Message-ID:	4BE30EAE.6050207@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

> Now that I've realized what the real problem is with max_standby_delay
> (namely, that inactivity on the master can use up the delay), I think
> we should do what Tom originally suggested here. It's not as good as
> a really working max_standby_delay, but we're not going to have that
> for 9.0, and it's clearly better than a boolean.

I guess I'm not clear on how what Tom proposed is fundamentally
different from max_standby_delay = -1. If there's enough concurrent
queries, recovery would never catch up.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 18:59:07
Message-ID:	AANLkTik2jCNECY1FZnYMfYRL0FkpVmSTQkMqUQET4HgJ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, May 6, 2010 at 2:47 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>
>> Now that I've realized what the real problem is with max_standby_delay
>> (namely, that inactivity on the master can use up the delay), I think
>> we should do what Tom originally suggested here. It's not as good as
>> a really working max_standby_delay, but we're not going to have that
>> for 9.0, and it's clearly better than a boolean.
>
> I guess I'm not clear on how what Tom proposed is fundamentally
> different from max_standby_delay = -1. If there's enough concurrent
> queries, recovery would never catch up.

If your workload is that the standby server is getting pounded with
queries like crazy, then it's probably not that different: it will
fall progressively further behind. But I suspect many people will set
up standby servers where most of the activity happens on the primary,
but they run some reporting queries on the standby. If you expect
your reporting queries to finish in <10s, you could set the max delay
to say 60s. In the event that something gets wedged, recovery will
eventually kill it and move on rather than just getting stuck forever.
If the volume of queries is known not to be too high, it's reasonable
to expect that a few good whacks will be enough to get things back on
track.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 19:03:21
Message-ID:	4BE31279.7040002@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

All,

We are in Beta1, now, and it's May. Original goal for 9.0 was
June-July. We cannot be introducing major design revisions to HS/SR at
this date, or we won't ship until December.

There are at least 10 other major features in 9.0, some of which are
more important to some of our users than HS/SR. More importantly, I
think the discussion on this thread makes it very clear that no matter
how much discussion we have on standby delay, we are NOT going to get it
right the first time. That is, *even if* we replace Simon's code with
something else, that something else will have as many issues for real
users as the current delay does, especially since we won't even have
started debugging or testing the new code yet.

So changing to a lock-based mechanism or designing a plugin interface
are really not at all realistic at this date.

Realistically, we have two options at this point:

1) reduce max_standby_delay to a boolean.

2) have a delay option (based either on WAL glob start time or on system
time) like the current max_standby_delay, preferably with some bugs fixed.

If we do (1), we'll be having this discussion all over again in
September, and will be no better off because we won't have any
production feedback on Simon's approach. If we do (2) we can hedge it
in the documentation with requirements and cautions, and hopefully only
dedicated DBAs will touch it, and we'll get good feedback from them on
how we should redesign it for 9.1. And if it works as badly as Tom
expects, then we won't have an issue with maintaining backwards
compatibility, because people will be *happy* to change.

One way to communicate this would be to have 2 GUCs instead of one:
allow_query_cancel = on|off # defaults to on
max_standby_timeout = 0 # SEE DOCS BEFORE CHANGING

We named this release 9.0 because, among other things, we expected it to
be less stable than the prior 3 releases. And we can continue to tell
users that. I know I won't be moving any of my clients to 9.0.0.

I said it before and I'll say it again: "release early, release often".

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 21:01:04
Message-ID:	201005062101.o46L14b18246@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Josh Berkus wrote:
> All,
>
> We are in Beta1, now, and it's May. Original goal for 9.0 was
> June-July. We cannot be introducing major design revisions to HS/SR at
> this date, or we won't ship until December.
>
> There are at least 10 other major features in 9.0, some of which are
> more important to some of our users than HS/SR. More importantly, I
> think the discussion on this thread makes it very clear that no matter
> how much discussion we have on standby delay, we are NOT going to get it
> right the first time. That is, *even if* we replace Simon's code with
> something else, that something else will have as many issues for real
> users as the current delay does, especially since we won't even have
> started debugging or testing the new code yet.
>
> So changing to a lock-based mechanism or designing a plugin interface
> are really not at all realistic at this date.
>
> Realistically, we have two options at this point:
>
> 1) reduce max_standby_delay to a boolean.

I suggest calling it 'delay_wal_application' or 'wal_query_cancel' or
something like that.

> 2) have a delay option (based either on WAL glob start time or on system
> time) like the current max_standby_delay, preferably with some bugs fixed.

I don't think releasing something that many of us can barely understand
is going to help. I think Heikki is right that we might get feedback
from 9.0 that this setting isn't even useful. If we can't get this
right, and it seems we can't, we should just push this to 9.1.

Remember, delaying wal application just delays making the standby a
master and makes the slave data appear staler. We can just tell people
that the larger their queries are, the larger this delay will be. If
they want to control this, they can set 'statement_timeout' already.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Greg Smith <greg(at)2ndquadrant(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 21:26:14
Message-ID:	4BE333F6.5080704@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Bruce Momjian wrote:
> Remember, delaying wal application just delays making the standby a
> master and makes the slave data appear staler. We can just tell people
> that the larger their queries are, the larger this delay will be. If
> they want to control this, they can set 'statement_timeout' already.
>

While a useful defensive component, statement_timeout is a user setting,
so it can't provide guaranteed protection against a WAL application
denial of service from a long running query. A user that overrides the
system setting and kicks off a long query puts you right back into
needing a timeout to ensure forward progress of standby replay.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-06 21:32:07
Message-ID:	201005062132.o46LW7I22842@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Greg Smith wrote:
> Bruce Momjian wrote:
> > Remember, delaying wal application just delays making the standby a
> > master and makes the slave data appear staler. We can just tell people
> > that the larger their queries are, the larger this delay will be. If
> > they want to control this, they can set 'statement_timeout' already.
> >
>
> While a useful defensive component, statement_timeout is a user setting,
> so it can't provide guaranteed protection against a WAL application
> denial of service from a long running query. A user that overrides the
> system setting and kicks off a long query puts you right back into
> needing a timeout to ensure forward progress of standby replay.

The nice thing about query cancel is that it give predictable behavior.
We could make statement_timeout that can't be changed if it is set in
postgresql.conf. Again, let's think of that for 9.1.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Yeb Havinga <yebhavinga(at)gmail(dot)com>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: SQLSTATE for Hot Standby cancellation
Date:	2010-05-07 08:51:01
Message-ID:	1273222261.12659.1314.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, 2010-05-06 at 15:00 +0100, Simon Riggs wrote:
> On Thu, 2010-05-06 at 12:08 +0200, Yeb Havinga wrote:
>
> > That's funny because when I was reading this thread, I was thinking the
> > exact opposite: having max_standby_delay always set to 0 so I know the
> > standby server is as up-to-date as possible. The application that
> > accesses the hot standby has to be 'special' anyway because it might
> > deliver not-up-to-date data. If that information about specialties
> > regarding querying the standby server includes the warning that queries
> > might get cancelled, they can opt for a retry themselves (is there a
> > special return code to catch that case? like PGRES_RETRY_LATER) or a
> > message to the user that their report is currently unavailable and they
> > should retry in a few minutes.
>
> Very sensible. Kevin Grittner already asked for that and I alread
> agreed, yet it has not been implemented that way
> http://archives.postgresql.org/pgsql-hackers/2008-12/msg01691.php
>
> Can anyone remember a specific objection to that 'cos it still sounds
> very sensible to me and is a quick, low impact change.
>
> Currently the SQLSTATE is ERRCODE_ADMIN_SHUTDOWN or
> ERRCODE_QUERY_CANCELED if not idle. That makes it even harder to
> determine the retryability, since it can come back in one of two states.
>
> Proposed SQLSTATE for HS cancellation is
> case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN:
> case PROCSIG_RECOVERY_CONFLICT_LOCK:
> case PROCSIG_RECOVERY_CONFLICT_SNAPSHOT:
> case PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK:
> recovery_conflict_errcode = ERRCODE_T_R_SERIALIZATION_FAILURE;
> break;
> case PROCSIG_RECOVERY_CONFLICT_DATABASE:
> case PROCSIG_RECOVERY_CONFLICT_TABLESPACE:
> recovery_conflict_errcode = ERRCODE_ADMIN_SHUTDOWN;
> break;
>
> We don't have a retryable SQLSTATE, so perhaps we should document that
> serialization_failure and deadlock_detected are both retryable error
> conditions. As the above implies HS can throw some errors that are
> non-retyable, so we use ERRCODE_ADMIN_SHUTDOWN.

Implemented as ERRCODE_ADMIN_SHUTDOWN only in the case of a dropped database.
ERRCODE_T_R_SERIALIZATION_FAILURE in other cases.

--
Simon Riggs www.2ndQuadrant.com

Attachment	Content-Type	Size
hs_cancel_errcode.patch	text/x-patch	1.2 KB

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-08 08:44:12
Message-ID:	1273308252.12659.2983.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, 2010-05-06 at 12:03 -0700, Josh Berkus wrote:

> So changing to a lock-based mechanism or designing a plugin interface
> are really not at all realistic at this date.

I agree that changing to a lock-based mechanism is too much at this
stage of development.

However, putting in a plugin is trivial. We could do it if we choose,
without instability or risk. It is as easy a change as option (1). It's
not complex to design because it would use the exact same API as the
internal conflict resolution module already does; we can just move the
current conflict code straight into a contrib module. This can be done
bug-free in about 3 hours work. There is no performance issue associated
with that either. Plugins would allow all of the various mechanisms
requested on list over 18 months, nor would they prevent including some
of those options within the core at a later date.

Without meaning to cause further contention, it is very clear that
putting in contrib modules isn't bad after all, so there is no material
argument against the plugin approach.

I recognise that plugins for some reason ignite unstated fears, by
observation that there is always an argument every time I mention them.
I invite an explanation of that off-list.

> Realistically, we have two options at this point:
>
> 1) reduce max_standby_delay to a boolean.
>
> 2) have a delay option (based either on WAL glob start time or on system
> time) like the current max_standby_delay, preferably with some bugs fixed.

With a plugin option, I would not object to option 1.

If option 1 was taken, without plugins, it's clear that would be against
consensus.

Having said that, I'll confirm now there will not be an extreme reaction
from me if option (1) was forced, nor do I counsel that from others.

> I said it before and I'll say it again: "release early, release often".

None of this needs to delay release.

--
Simon Riggs www.2ndQuadrant.com

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-08 18:48:36
Message-ID:	201005081848.o48Imak02225@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs wrote:
> With a plugin option, I would not object to option 1.
>
> If option 1 was taken, without plugins, it's clear that would be against
> consensus.
>
> Having said that, I'll confirm now there will not be an extreme reaction
> from me if option (1) was forced, nor do I counsel that from others.

I found this email amusing. You phrase it like the community is supposed
to be worried by an objection from you or an "extreme reaction"; I
certainly am not. You have been in the community long enough to not use
such phrasing. This is not the first time I have complained about this.
I have no idea why an objection from you should mean more than an
objection from anyone else in the community, and I have no idea what an
"extreme reaction" means, or why anyone should care. Do you think the
community is negotiting with you?

I think the concensus is to change this setting to a boolean. If you
don't want to do it, I am sure we can find someone who will.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-08 19:30:07
Message-ID:	AANLkTinXMnSmet-g7qfapIgmjoFKFawwZNWLolaBC88x@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sat, May 8, 2010 at 2:48 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> I think the concensus is to change this setting to a boolean. If you
> don't want to do it, I am sure we can find someone who will.

I still think we should revert to Tom's original proposal.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-08 19:40:09
Message-ID:	201005081940.o48Je9w10516@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> On Sat, May 8, 2010 at 2:48 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > I think the concensus is to change this setting to a boolean. ?If you
> > don't want to do it, I am sure we can find someone who will.
>
> I still think we should revert to Tom's original proposal.

And Tom's proposal was to do it on WAL slave arrival time? If we could
get agreement from everyone that that is the proper direction, fine, but
I am hearing things like plugins, and other complexity that makes it
seem we are not getting closer to an agreed solution, and without
agreement, the simplest approach seems to be just to remove the part we
can't agree upon.

I think the big question is whether this issue is significant enough
that we should ignore our policy of no feature design during beta.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-08 19:46:51
Message-ID:	AANLkTilE4_MV7SzP-s7DsR5iVU0nQj8gNgvi3A9yEMME@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sat, May 8, 2010 at 3:40 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Robert Haas wrote:
>> On Sat, May 8, 2010 at 2:48 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> > I think the concensus is to change this setting to a boolean. ?If you
>> > don't want to do it, I am sure we can find someone who will.
>>
>> I still think we should revert to Tom's original proposal.
>
> And Tom's proposal was to do it on WAL slave arrival time? If we could
> get agreement from everyone that that is the proper direction, fine, but
> I am hearing things like plugins, and other complexity that makes it
> seem we are not getting closer to an agreed solution, and without
> agreement, the simplest approach seems to be just to remove the part we
> can't agree upon.
>
> I think the big question is whether this issue is significant enough
> that we should ignore our policy of no feature design during beta.

Tom's proposal was basically to define recovery_process_lock_timeout.
The recovery process would wait X seconds for a lock, then kill
whoever held it. It's not the greatest knob in the world for the
reasons already pointed out, but I think it's still better than a
boolean and will be useful to some users. And it's pretty simple.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Simon Riggs <simon(at)2ndQuadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-08 21:02:51
Message-ID:	m2vday6r9g.fsf@hi-media.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> I have no idea why an objection from you should mean more than an
> objection from anyone else in the community, and I have no idea what an
> "extreme reaction" means, or why anyone should care.

Maybe I shouldn't say anything here. But clearly while you're spot on
that Simon's objection is worth just as much as any other contributor's,
I disagree that we shouldn't care about the way those people feel about
being a member of our community.

I appreciate your efforts to avoid having anyone here use such a wording
but I can't help to dislike your argument for it. I hope that's simply a
localisation issue (l10n is so much harder than i18n).

Anyway, I so much hate reading such exchanges here that I couldn't help
ranting about it. Back to suitable -hackers content.

> I think the concensus is to change this setting to a boolean. If you
> don't want to do it, I am sure we can find someone who will.

I don't think so. I understand the current state to be:
a. this problem is not blocking beta, but a must fix before release
b. we either have to change the API or the behavior
c. only one behavior change has been proposed, by Tom
d. proposed behavior would favor queries rather than availability
e. API change 1 is boolean + explicit pause/resume command
f. API change 2 is boolean + plugin facility, with a contrib for
current behavior.
g. API change 3 is boolean only

I don't remember reading any mail on this thread bearing consensus on
the choices above, but rather either one of us pushing for his vision or
people defending the current situation, complaining about it or asking
that a reasonable choice is made soon.

If we have to choose between reasonable and soon, soon won't be my
vote. Beta is meant to last more or less 3 months after all.

Each party's standing is clear. Decision remains to be made, and I guess
that the one writing the code will have a much louder voice.

Regards,
--
dim

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-08 22:51:43
Message-ID:	201005082251.o48MphB21068@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> On Sat, May 8, 2010 at 3:40 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > Robert Haas wrote:
> >> On Sat, May 8, 2010 at 2:48 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >> > I think the concensus is to change this setting to a boolean. ?If you
> >> > don't want to do it, I am sure we can find someone who will.
> >>
> >> I still think we should revert to Tom's original proposal.
> >
> > And Tom's proposal was to do it on WAL slave arrival time? ?If we could
> > get agreement from everyone that that is the proper direction, fine, but
> > I am hearing things like plugins, and other complexity that makes it
> > seem we are not getting closer to an agreed solution, and without
> > agreement, the simplest approach seems to be just to remove the part we
> > can't agree upon.
> >
> > I think the big question is whether this issue is significant enough
> > that we should ignore our policy of no feature design during beta.
>
> Tom's proposal was basically to define recovery_process_lock_timeout.
> The recovery process would wait X seconds for a lock, then kill
> whoever held it. It's not the greatest knob in the world for the
> reasons already pointed out, but I think it's still better than a
> boolean and will be useful to some users. And it's pretty simple.

I thought there was concern about lock stacking causing
unpredictable/unbounded delays. I am not sure boolean has a majority
vote, but I am suggesting that because it is the _minimal_ feature set,
and when we can't agree during beta, the minimal feature set seems like
the best choice.

Clearly, anything is more feature-full than boolean --- the big question
is whether Tom's proposal is significantly better than boolean that we
should spend the time designing and implementing it, with the
possibility it will all be changed in 9.1.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Greg Smith <greg(at)2ndquadrant(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-08 23:04:23
Message-ID:	4BE5EDF7.6030305@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Bruce Momjian wrote:
> I think the big question is whether this issue is significant enough
> that we should ignore our policy of no feature design during beta

The idea that you're considering removal of a feature that we already
have people using in beta and making plans around is a policy violation
too you know. A freeze should include not cutting things just because
their UI or implementation is not ideal yet. And you've been using the
word "consensus" here when there is no such thing. At best there's
barely a majority here among people who have stated an opinion, and
consensus means something much stronger even than that; that means
something closer to unanimity. I thought the summary of where the
project is at Josh wrote at
http://archives.postgresql.org/message-id/4BE31279.7040002@agliodbs.com
was excellent, both from a technical and a process commentary
standpoint. I'd be completely happy to follow that plan, and then we'd
be at a consensus--with no one left arguing.

It was very clear back in February that if SR didn't hit the feature set
to make HS less troublesome out of the box, there would be some
limitations here, and that set of concerns hasn't changed much since
then. I thought the backup plan if we didn't get things like xid
feedback was to keep the capability as written anyway, knowing that it's
still much better than no control over cancellation timing available at
all. Keep improving documentation around its issues, and continue to
hack away at them in user space and in the field. Then we do better for
9.1. You seem bent on removing the feedback part of that cycle.

The full statement of the ESR bit Josh was quoting is "Release early.
Release often. And listen to your customers."[1] My customers include
some of whom believed the PostgreSQL community process enough to
contribute toward the HS development that's been completed and donated
to the project. They have a pretty clear view on this I'm relaying when
I talk about what I'd like to see happen. They are saying they cannot
completely ignore their requirements for HA failover, but would be
willing to loosen them just a bit (increasing failover time slightly) if
it reduces the odds of query cancellation, and therefore improves how
much load they can expect to push toward the standby. max_standby_delay
is a currently available mechanism that does that. I'm not going to be
their nanny and say "no, that's not perfectly predictable, you might get
a query canceled sometimes when you don't expect it anyway".

Instead, I was hoping to let them deploy 9.0 with this option available
(but certainly not the default), informed of the potential risks, see
how that goes. We can confirm whether the userland workarounds we
believe will be effective here really are. If so, then we can solider
forward directly incorporating them into the server code, knowing that
works. If not, switch to one of the safer modes, see if there's
something better to use altogether in 9.1, and perhaps this whole
approach gets removed. That's healthy development progress either way.

Upthread Bruce expressed some concern that this was going to live
forever once deployed. There is no way I'm going to let this behavior
continue to be available in 9.1 if field tests say the workarounds
aren't good enough. That's going to torture all of us who do customer
deployments of this technology every day if that turns out to be the
case, and nobody is going to feel the heat from that worse than
2ndQuadrant. I did a round once of removing GUCs that didn't do what
they were expected to in the field before, based on real-world tests
showing regular misuse, and I'll do it again if this falls into that
same category. We've already exposed this release to a whole stack of
risk from work during its development cycle, risk that doesn't really
drop much just from cutting this one bit. I'd at least like to get all
the reward possible from that risk, which I expected to include feedback
in this area.

Circumventing the planned development process by dropping this now will
ruin how I expected the project to feel out the right thing on the user
side, and we'll all be left with little more insight for what to do in
9.1 than we have now. And I'm not looking forward to explaining to
people why a feature they've been seeing and planning to deploy for
months has now been cut only after what was supposed to be a freeze for
beta.

[1]
http://catb.org/esr/writings/homesteading/cathedral-bazaar/ar01s04.html
, and this particular bit is quite relevant here: "Linus was keeping his
hacker/users constantly stimulated and rewarded—stimulated by the
prospect of having an ego-satisfying piece of the action, rewarded by
the sight of constant (even daily) improvement in their work. Linus was
directly aiming to maximize the number of person-hours thrown at
debugging and development, even at the possible cost of instability in
the code and user-base burnout if any serious bug proved intractable." I
continue to be disappointed at how contributing code to PostgreSQL is
far more likely to come with a dose of argument and frustration rather
than reward, and this discussion is a perfect example of such.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-08 23:34:18
Message-ID:	201005082334.o48NYIA28405@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Greg Smith wrote:
> Bruce Momjian wrote:
> > I think the big question is whether this issue is significant enough
> > that we should ignore our policy of no feature design during beta
>
> The idea that you're considering removal of a feature that we already
> have people using in beta and making plans around is a policy violation
> too you know. A freeze should include not cutting things just because
> their UI or implementation is not ideal yet. And you've been using the
> word "consensus" here when there is no such thing. At best there's
> barely a majority here among people who have stated an opinion, and
> consensus means something much stronger even than that; that means
> something closer to unanimity. I thought the summary of where the
> project is at Josh wrote at
> http://archives.postgresql.org/message-id/4BE31279.7040002@agliodbs.com
> was excellent, both from a technical and a process commentary
> standpoint. I'd be completely happy to follow that plan, and then we'd
> be at a consensus--with no one left arguing.

I can't argue with anything you have said in your email. The big
question is whether designing during beta is worth it in this case, and
whether we can get something that is useful and gives us useful feedback
for 9.1, and is it worth spending the time to figure this out during
beta? If we can, great, let's do it, but I have not seen that yet, and
I am unclear how long we should keep trying to find it.

I think everyone agrees the current code is unusable, per Heikki's
comment about a WAL file arriving after a period of no WAL activity, and
look how long it took our group to even understand why that fails so
badly. I thought Tom's idea had problems, and there were ideas of how
to improve it. It just seems like we are drifting around on something
that has no easy solution, and not something that we are likely to hit
during beta where we should be focusing on the release. Saying we have
three months to fix this during beta seems like a recipe for delaying
the final release, and this feature is not worth that.

What we could do is to convert max_standby_delay to a boolean, 'ifdef'
out the code that was handling non-boolean cases, and then if someone
wants to work on a patch in a corner and propose something in a month
that improves this, we can judge the patch on its own merits, and apply
it if it is a great benefit, because basically that is what we are doing
now if we fix this --- adding a new patch/feature during beta.
(Frankly, because we are not requiring an initdb during beta, I am
unclear how we are going to rename max_standby_delay to behave as a
boolean.)

It is great if we can get a working max_standby_delay, but I fear
drifting/distraction at this stage.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	pgsql-hackers(at)postgresql(dot)org
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-08 23:42:15
Message-ID:	201005090142.16933.andres@anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sunday 09 May 2010 01:34:18 Bruce Momjian wrote:
> I think everyone agrees the current code is unusable, per Heikki's
> comment about a WAL file arriving after a period of no WAL activity, and
> look how long it took our group to even understand why that fails so
> badly.
To be honest its not *that* hard to simply make sure generating wal regularly
to combat that. While it surely aint a nice workaround its not much of a
problem either.

Andres

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 00:57:26
Message-ID:	18337.1273366646@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On Sunday 09 May 2010 01:34:18 Bruce Momjian wrote:
>> I think everyone agrees the current code is unusable, per Heikki's
>> comment about a WAL file arriving after a period of no WAL activity, and
>> look how long it took our group to even understand why that fails so
>> badly.

> To be honest its not *that* hard to simply make sure generating wal regularly
> to combat that. While it surely aint a nice workaround its not much of a
> problem either.

Well, that's dumping a kluge onto users; but really that isn't the
point. What we have here is a badly designed and badly implemented
feature, and we need to not ship it like this so as to not
institutionalize a bad design.

I like the proposal of a boolean because it provides only the minimal
feature set of two cases that are both clearly needed and easily
implementable. Whatever we do later is certain to provide a superset
of those two cases. If we do something else (and that includes my own
proposal of a straight lock timeout), we'll be implementing something
we might wish to take back later. Taking out features after they've
been in a release is very hard, even if we realize they're badly
designed.

regards, tom lane

From:	Greg Smith <greg(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 03:00:55
Message-ID:	4BE62567.5090508@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane wrote:

> Taking out features after they've been in a release is very hard, even if we realize they're badly
> designed.
>

It doesn't have to be; that's the problem the "release often" part takes
care of. If a release has only been out a year, and a new one comes out
saying "oh, that thing we released for the first time in the last
version, it didn't work as well as we'd hoped in the field; you should
try to avoid that and use this new implementation that works better
instead once you can upgrade", that's not only not hard, it's exactly
what people using a X.0 release expect to happen.

I've read the message from you that started off this thread several
times now. Your low-level code implementation details shared later
obviously need to be addressed. But all of the "fundamental" and
"fatal" issues you mentioned at the start continue to strike me as
either situations where you don't agree with the use case this was
designed for, or spots where you feel the userland workarounds required
to make it work right are too onerous. Bruce's objections seem to fall
mainly into the latter category.

I've been wandering around talking to people about that exact
subject--what do people want and expect from Hot Standby, and what would
they do to gain its benefits--for over six months now, independently of
Simon's work which did a lot of that before me too. The use cases are
covered as best they can be without better support from expected future
SR features like heartbeats and XID loopback. As for the workarounds
required to make things work, the responses I get match what we just saw
from Andres. When the required details are explained, people say
"that's annoying but I can do that", and off we go. There are
significant documentation issues I know need to be cleaned up here, and
I've already said I'll take care of that as soon as freeze is really
here and I have a stable target. (That this discussion is still going
on says that's not yet)

What I fail to see are problems significant enough to not ship the parts
of this feature that are done, so that it can be used by those it is
appropriate for, allow feedback, and make it easy to test individual
improvements upon what's already there. I can't make you prioritize
based on what people are telling me. All I can do is suggest you
reconsider handing control over the decision to use this feature or not
to the users of the software, so they can make their own choice.

I'm tired of arguing about this instead of doing productive work, and
I've done all I can here to try and work within the development process
of the community. If talk of removing the max_standby_delay feature
clears up, I'll happily provide my promised round of documentation
updates, to make its limitations and associated workarounds as clear as
they can be, within a week of being told go on that. If instead this
capability goes away, making those moot, I'll maintain my own release
for the 2ndQuadrant customers who have insisted they need this
capability if I have to. That would be really unfortunate, because the
only bucket I can pull time out of for that is the one I currently
allocate to answering questions on the mailing lists here most days.
I'd rather spend that helping out the PostgreSQL community, but we do
need to deliver what our customers want too.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 03:46:41
Message-ID:	201005090346.o493kfD13443@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Greg Smith wrote:
> Tom Lane wrote:
>
> > Taking out features after they've been in a release is very hard, even if we realize they're badly
> > designed.
> >
>
> It doesn't have to be; that's the problem the "release often" part takes
> care of. If a release has only been out a year, and a new one comes out
> saying "oh, that thing we released for the first time in the last
> version, it didn't work as well as we'd hoped in the field; you should
> try to avoid that and use this new implementation that works better
> instead once you can upgrade", that's not only not hard, it's exactly
> what people using a X.0 release expect to happen.

I think this is the crux of the issue. Tom and I are saying that
historically we have shipped only complete features, or as complete as
reasonable, and have removed items during beta that we found didn't meet
this criteria, in an attempt to reduce the amount of feature set churn
in Postgres. A database is complex, so modifying the API between major
releases is something we only do when we find a significant benefit.

In this case, if we keep max_standby_delay as non-boolean, we know it
will have to be redesigned in 9.1, and it is unclear to me what
additional knowledge we will gain by shipping it in 9.0, except to have
to tell people that it doesn't work well or requires complex
work-arounds, and that doesn't thrill any of us. (I already suggested
that statement_timeout might supply a reasonable and predictable
workaround for non-boolean usage of max_standby_delay.)

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 03:50:56
Message-ID:	AANLkTikfSSM3_V-xYv743mFDD8oni058r-CzMTg_tsRS@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sat, May 8, 2010 at 6:51 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Robert Haas wrote:
>> On Sat, May 8, 2010 at 3:40 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> > Robert Haas wrote:
>> >> On Sat, May 8, 2010 at 2:48 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> >> > I think the concensus is to change this setting to a boolean. ?If you
>> >> > don't want to do it, I am sure we can find someone who will.
>> >>
>> >> I still think we should revert to Tom's original proposal.
>> >
>> > And Tom's proposal was to do it on WAL slave arrival time? ?If we could
>> > get agreement from everyone that that is the proper direction, fine, but
>> > I am hearing things like plugins, and other complexity that makes it
>> > seem we are not getting closer to an agreed solution, and without
>> > agreement, the simplest approach seems to be just to remove the part we
>> > can't agree upon.
>> >
>> > I think the big question is whether this issue is significant enough
>> > that we should ignore our policy of no feature design during beta.
>>
>> Tom's proposal was basically to define recovery_process_lock_timeout.
>> The recovery process would wait X seconds for a lock, then kill
>> whoever held it. It's not the greatest knob in the world for the
>> reasons already pointed out, but I think it's still better than a
>> boolean and will be useful to some users. And it's pretty simple.
>
> I thought there was concern about lock stacking causing
> unpredictable/unbounded delays. I am not sure boolean has a majority
> vote, but I am suggesting that because it is the _minimal_ feature set,
> and when we can't agree during beta, the minimal feature set seems like
> the best choice.
>
> Clearly, anything is more feature-full than boolean --- the big question
> is whether Tom's proposal is significantly better than boolean that we
> should spend the time designing and implementing it, with the
> possibility it will all be changed in 9.1.

I doubt it's likely to be thrown out completely. We might decide to
fine-tune it in some way. My fear is that if we ship this with only a
boolean, we're shipping crippleware. If that fear turns out to be
unfounded, I will of course be happy, but that's my concern, and I
don't believe that it's entirely unfounded.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 04:08:00
Message-ID:	201005090408.o49480k17494@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> > Clearly, anything is more feature-full than boolean --- the big question
> > is whether Tom's proposal is significantly better than boolean that we
> > should spend the time designing and implementing it, with the
> > possibility it will all be changed in 9.1.
>
> I doubt it's likely to be thrown out completely. We might decide to
> fine-tune it in some way. My fear is that if we ship this with only a
> boolean, we're shipping crippleware. If that fear turns out to be
> unfounded, I will of course be happy, but that's my concern, and I
> don't believe that it's entirely unfounded.

Well, historically, we have been willing to not ship features if we
can't get it right. No one has ever accused us of crippleware, but our
hesitancy has caused slower user adoption, though long-term, it has
helped us grow a dedicated user base that trusts us.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 04:35:53
Message-ID:	AANLkTin_ceFyPWzFHqLyHZAZeXzePy3w94KOT8h0zE0c@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, May 9, 2010 at 12:08 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Robert Haas wrote:
>> > Clearly, anything is more feature-full than boolean --- the big question
>> > is whether Tom's proposal is significantly better than boolean that we
>> > should spend the time designing and implementing it, with the
>> > possibility it will all be changed in 9.1.
>>
>> I doubt it's likely to be thrown out completely. We might decide to
>> fine-tune it in some way. My fear is that if we ship this with only a
>> boolean, we're shipping crippleware. If that fear turns out to be
>> unfounded, I will of course be happy, but that's my concern, and I
>> don't believe that it's entirely unfounded.
>
> Well, historically, we have been willing to not ship features if we
> can't get it right. No one has ever accused us of crippleware, but our
> hesitancy has caused slower user adoption, though long-term, it has
> helped us grow a dedicated user base that trusts us.

We can make the decision to not ship the feature if the feature is
"max_standby_delay". But I think the feature is "Hot Standby", which
I think we've pretty much committed to shipping. And I am concerned
that if the only mechanism for controlling query cancellation vs.
recovery lag is a boolean, people feel that we didn't get Hot Standby
right.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 08:41:31
Message-ID:	1273394491.3936.512.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sat, 2010-05-08 at 14:48 -0400, Bruce Momjian wrote:

> I think the consensus is to change this setting to a boolean. If you
> don't want to do it, I am sure we can find someone who will.

You expect others to act on consensus and follow rules, yet ignore them
yourself when it suits your purpose. Your other points seem designed to
distract people from seeing that.

There is clear agreement that a problem exists. The action to take as a
result of that problem is very clearly in doubt and yet you repeatedly
ignore other people's comments and viable technical resolutions. If you
can find a cat's paw to break consensus for you, more fool them. You
might find someone with a good resolution, if you ask that instead.

--
Simon Riggs www.2ndQuadrant.com

From:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 11:59:50
Message-ID:	m2k4rdwaix.fsf@hi-media.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> I like the proposal of a boolean because it provides only the minimal
> feature set of two cases that are both clearly needed and easily
> implementable. Whatever we do later is certain to provide a superset
> of those two cases. If we do something else (and that includes my own
> proposal of a straight lock timeout), we'll be implementing something
> we might wish to take back later. Taking out features after they've
> been in a release is very hard, even if we realize they're badly
> designed.

That's where I though my proposal fitted in. I fail to see us wanting to
take back explicit pause/resume admin functions in any future release.

Now, after having read Greg's arguments, my vote would be the following:
- hot_standby_conflict_winner = queries|replay, defaults to replay
- add pause/resume so that people can switch temporarily to queries
- label max_standby_delay *experimental*, keep current code

By clearly stating the feature is *experimental* it should be easy to
both get feedback on it so that we know what to implement in 9.1, and
should that be completely different, take back the feature. It should
even be possible to continue tweaking its behavior during beta, or do
something better.

Of course it will piss off some users, but they knew they were depending
on some *experimental* feature after all.

Regards,
--
dim

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 14:10:34
Message-ID:	1F61AC85-79B2-4D49-AA93-2DA451A26574@phlo.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On May 9, 2010, at 13:59 , Dimitri Fontaine wrote:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> I like the proposal of a boolean because it provides only the minimal
>> feature set of two cases that are both clearly needed and easily
>> implementable. Whatever we do later is certain to provide a superset
>> of those two cases. If we do something else (and that includes my own
>> proposal of a straight lock timeout), we'll be implementing something
>> we might wish to take back later. Taking out features after they've
>> been in a release is very hard, even if we realize they're badly
>> designed.
>
> That's where I though my proposal fitted in. I fail to see us wanting to
> take back explicit pause/resume admin functions in any future release.
>
> Now, after having read Greg's arguments, my vote would be the following:
> - hot_standby_conflict_winner = queries|replay, defaults to replay
> - add pause/resume so that people can switch temporarily to queries
> - label max_standby_delay *experimental*, keep current code

Adding pause/resume seems to introduce some non-trivial locking problems, though. How would you handle a pause request if the recovery process currently held a lock?

Dropping the lock is not an option for correctness reasons. Otherwise you wouldn't have needed to take the lock in the first place, no?

Pausing with the lock held leads to priority-inversion like problems. Queries now might block until recovery is resumed - quite the opposite of what pause() is supposed to archive

The only remaining option is to continue applying WAL until you reach a point where no locks are held, then pause. But from a user's POV that is nearly indistinguishable from simply setting hot_standby_conflict_winner to in the first place I think.

best regards,
Florian Pflug

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 16:47:51
Message-ID:	s2h407d949e1005090947s7bbab437v1b2acd8a67e6b066@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, May 9, 2010 at 4:00 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> The use cases are covered as best they can be without better support from
> expected future SR features like heartbeats and XID loopback.

For what it's worth I think deferring these extra complications is a
very useful exercise. I would like to see a system that doesn't depend
on them for basic functionality. In particular I would like to see a
system that can be useful using purely WAL log shipping without
streaming replication at all.

I'm a bit unclear how the boolean proposal would solve things though.
Surely if you set the boolean to recovery-wins then when using
streaming replication with any non-idle master virtually every query
would be cancelled immediately as every HOT cleanup would cause a
snapshot conflict with even short-lived queires in the slave.

--
greg

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Greg Stark <gsstark(at)mit(dot)edu>
Cc:	Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 16:52:37
Message-ID:	AANLkTikghvbmc6peLh5YQK1ByJNIRb1a2CRNMVBRKnAV@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, May 9, 2010 at 12:47 PM, Greg Stark <gsstark(at)mit(dot)edu> wrote:
> On Sun, May 9, 2010 at 4:00 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>> The use cases are covered as best they can be without better support from
>> expected future SR features like heartbeats and XID loopback.
>
> For what it's worth I think deferring these extra complications is a
> very useful exercise. I would like to see a system that doesn't depend
> on them for basic functionality. In particular I would like to see a
> system that can be useful using purely WAL log shipping without
> streaming replication at all.
>
> I'm a bit unclear how the boolean proposal would solve things though.
> Surely if you set the boolean to recovery-wins then when using
> streaming replication with any non-idle master virtually every query
> would be cancelled immediately as every HOT cleanup would cause a
> snapshot conflict with even short-lived queires in the slave.

It sounds to me like what we need here is some testing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 18:52:59
Message-ID:	1273431179.3936.1077.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sat, 2010-05-08 at 20:57 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On Sunday 09 May 2010 01:34:18 Bruce Momjian wrote:
> >> I think everyone agrees the current code is unusable, per Heikki's
> >> comment about a WAL file arriving after a period of no WAL activity, and
> >> look how long it took our group to even understand why that fails so
> >> badly.
>
> > To be honest its not *that* hard to simply make sure generating wal regularly
> > to combat that. While it surely aint a nice workaround its not much of a
> > problem either.
>
> Well, that's dumping a kluge onto users; but really that isn't the
> point. What we have here is a badly designed and badly implemented
> feature, and we need to not ship it like this so as to not
> institutionalize a bad design.

No, you have it backwards. HS was designed to work with SR. SR
unfortunately did not deliver any form of monitoring, and in doing so
the keepalive that it was known HS needed was left out, although it had
been on the todo list for some time. Luckily Greg and I argued to have
some monitoring added and my code was used to provide barest minimum
monitoring for SR, yet not enough to help HS.

Of course, if one team doesn't deliver for whatever reason then others
must take up the slack, if they can: no complaints. Since I personally
didn't know this was going to be the case until after freeze, it is very
late to resolve this situation sensibly and time has been against us.
It's much harder for me to reach into the depths of another person's
work and see how to add necessary mechanisms, especially when I'm
working elsewhere. Even if I had done, it's likely that I would have
been blocked with the "great idea, next release" response as already
used on this thread.

Without doubt the current mechanism suffers from the issues you mention,
though the current state is not the result of bad design, merely
inaction and lack of integration. We could resolve the current state in
many ways, if we chose.

Bruce has used the word crippleware for the current state. Raising a
problem and then blocking solutions is the best way I know to cripple a
release. It should be clear that I've done my best to avoid this
situation and have been active on both SR and HS. Had I not acted as I
have done to date, SR would at this point slurp CPU like a bandit and be
unmonitorable, both fatal flaws in production. I point this out not to
argue, but to set the record straight. IMHO your assignment of blame is
misplaced and your comments about poor design do not reflect how we
arrived at the current state.

--
Simon Riggs www.2ndQuadrant.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Florian Pflug <fgp(at)phlo(dot)org>
Cc:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 19:04:02
Message-ID:	1273431842.3936.1098.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, 2010-05-09 at 16:10 +0200, Florian Pflug wrote:

> Adding pause/resume seems to introduce some non-trivial locking
> problems, though. How would you handle a pause request if the recovery
> process currently held a lock?

(We are only talking about AccessExclusiveLocks here. No LWlocks are
held across WAL records during replay)

Just pause. There are no technical problem there.

Perhaps a danger of unforeseen consequences, though doing that might
also be desirable, who can say?

--
Simon Riggs www.2ndQuadrant.com

From:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To:	Florian Pflug <fgp(at)phlo(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 19:09:22
Message-ID:	m21vdkuc2l.fsf@hi-media.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Florian Pflug <fgp(at)phlo(dot)org> writes:
> The only remaining option is to continue applying WAL until you reach
> a point where no locks are held, then pause. But from a user's POV
> that is nearly indistinguishable from simply setting
> hot_standby_conflict_winner to in the first place I think.

Not really, the use case would be using the slave as a reporting server,
you know you have say 4 hours of reporting queries during which you will
pause the recovery. So it's ok for the pause command to take time.

What I understand the boolean option would do is to force the user into
choosing either high-availability or using the slave for other purposes
too. The problem is in wanting both, and that's what HS was meant to solve.

Having pause/resume allows for a mixed case usage which is simple to
drive and understand, yet fails to provide adaptive behavior where
queries are allowed to pause recovery implicitly for a while.

In my mind, that would be a compromise we could reach for 9.0, but it
seems introducing those admin functions now is to far a stretch. I've
been failing to understand exactly why, only getting a generic answer I
find unsatisfying here, because all the alternative paths being
proposed, apart from "improve documentation", are more involved code
wise.

Regards,
--
dim

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 19:50:20
Message-ID:	9FAD82B0-69B5-473C-8695-AF096F074427@phlo.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On May 9, 2010, at 21:04 , Simon Riggs wrote:
> On Sun, 2010-05-09 at 16:10 +0200, Florian Pflug wrote:
>
>> Adding pause/resume seems to introduce some non-trivial locking
>> problems, though. How would you handle a pause request if the recovery
>> process currently held a lock?
>
> (We are only talking about AccessExclusiveLocks here. No LWlocks are
> held across WAL records during replay)
>
> Just pause. There are no technical problem there.
>
> Perhaps a danger of unforeseen consequences, though doing that might
> also be desirable, who can say?

No technical problems perhaps, but some usability ones, no?

I assume people would pause recovery to prevent it from interfering with long-running reporting queries. Now, if those queries might block indefinitely if the pause request by chance was issued while the recovery process held an AccessExclusiveLock, then the pause *caused* exactly what it was supposed to prevent. Setting hot_standby_conflict_winner to "queries" would at least have allowed the reporting queries to finish eventually.

If AccessExclusiveLocks are taken out of the picture (they're supposed to be pretty rare on a production system anyway), setting hot_standby_conflict_winner to "queries" seems to act like a conditional pause request - recovery is paused as soon as it gets in the way. In this setting, the real advantage of pause would be to prevent recovery from using up all available IO bandwidth. This seems like a valid concern, but calls more for something like recovery_delay (similar to vacuum_delay) instead of pause().

best regards,
Florian Pflug

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc:	Florian Pflug <fgp(at)phlo(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 20:01:51
Message-ID:	AANLkTin1F7diw2GmmulBR15jO8kMWx0kRdrrjsQfHi19@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, May 9, 2010 at 3:09 PM, Dimitri Fontaine <dfontaine(at)hi-media(dot)com> wrote:
> Florian Pflug <fgp(at)phlo(dot)org> writes:
>> The only remaining option is to continue applying WAL until you reach
>> a point where no locks are held, then pause. But from a user's POV
>> that is nearly indistinguishable from simply setting
>> hot_standby_conflict_winner to in the first place I think.
>
> Not really, the use case would be using the slave as a reporting server,
> you know you have say 4 hours of reporting queries during which you will
> pause the recovery. So it's ok for the pause command to take time.

Seems like it could take FOREVER on a busy system. Surely that's not
OK. The fact that Hot Standby has to take exclusive locks that can't
be released until WAL replay has progressed to a certain point seems
like a fairly serious wart. We had a discussion on another thread of
how this can make the database fail to shut down properly, a problem
we're not addressing because we're too busy arguing about
max_standby_delay. In fact, if we knew how to pause replay without
leaving random locks lying around, we could rearrange the whole smart
shutdown sequence so that we paused replay FIRST and then waited for
all backends to exit, but the consensus on the thread where we
discussed this was that we did not know how to do that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 20:15:52
Message-ID:	1273436152.3936.1243.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, 2010-05-09 at 16:01 -0400, Robert Haas wrote:

> The fact that Hot Standby has to take exclusive locks that can't
> be released until WAL replay has progressed to a certain point seems
> like a fairly serious wart.

LOL

And people lecture me about design.

--
Simon Riggs www.2ndQuadrant.com

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 22:25:44
Message-ID:	9CDB991C-0AB4-4CE9-9D9B-E25A5B69C7F7@phlo.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On May 9, 2010, at 22:01 , Robert Haas wrote:
> On Sun, May 9, 2010 at 3:09 PM, Dimitri Fontaine <dfontaine(at)hi-media(dot)com> wrote:
>> Florian Pflug <fgp(at)phlo(dot)org> writes:
>>> The only remaining option is to continue applying WAL until you reach
>>> a point where no locks are held, then pause. But from a user's POV
>>> that is nearly indistinguishable from simply setting
>>> hot_standby_conflict_winner to in the first place I think.
>>
>> Not really, the use case would be using the slave as a reporting server,
>> you know you have say 4 hours of reporting queries during which you will
>> pause the recovery. So it's ok for the pause command to take time.
>
> Seems like it could take FOREVER on a busy system. Surely that's not
> OK. The fact that Hot Standby has to take exclusive locks that can't
> be released until WAL replay has progressed to a certain point seems
> like a fairly serious wart.

If this is a serious wart then it's not one of hot standby, but one of postgres proper. AccessExclusiveLocks (SELECT-blocking locks that is, as opposed to UPDATE/DELETE-blocking locks) are never necessary from a correctness POV, they're only there for implementation reasons.

Getting rid of them doesn't seem completely insurmountable either - just as multiple row versions remove the need to block SELECTs dues to concurrent UPDATEs, multiple datafile versions could remove the need to block SELECTs due to concurrent ALTERs. But people seem to live with them quite well, judged from the amount of work put into getting rid of them (zero). I therefore fail to see why they should pose a significant problem in HS setups.

> We had a discussion on another thread of
> how this can make the database fail to shut down properly, a problem
> we're not addressing because we're too busy arguing about
> max_standby_delay. In fact, if we knew how to pause replay without
> leaving random locks lying around, we could rearrange the whole smart
> shutdown sequence so that we paused replay FIRST and then waited for
> all backends to exit, but the consensus on the thread where we
> discussed this was that we did not know how to do that.

Yeah, this was exactly my line of thought too.

best regards,
Florian Pflug

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Florian Pflug <fgp(at)phlo(dot)org>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-09 22:58:30
Message-ID:	201005100058.31428.andres@anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Monday 10 May 2010 00:25:44 Florian Pflug wrote:
> On May 9, 2010, at 22:01 , Robert Haas wrote:
> > On Sun, May 9, 2010 at 3:09 PM, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
wrote:
> >> Florian Pflug <fgp(at)phlo(dot)org> writes:
> >>> The only remaining option is to continue applying WAL until you reach
> >>> a point where no locks are held, then pause. But from a user's POV
> >>> that is nearly indistinguishable from simply setting
> >>> hot_standby_conflict_winner to in the first place I think.
> >>
> >> Not really, the use case would be using the slave as a reporting server,
> >> you know you have say 4 hours of reporting queries during which you will
> >> pause the recovery. So it's ok for the pause command to take time.
> >
> > Seems like it could take FOREVER on a busy system. Surely that's not
> > OK. The fact that Hot Standby has to take exclusive locks that can't
> > be released until WAL replay has progressed to a certain point seems
> > like a fairly serious wart.
>
> If this is a serious wart then it's not one of hot standby, but one of
> postgres proper. AccessExclusiveLocks (SELECT-blocking locks that is, as
> opposed to UPDATE/DELETE-blocking locks) are never necessary from a
> correctness POV, they're only there for implementation reasons.
>
> Getting rid of them doesn't seem completely insurmountable either - just as
> multiple row versions remove the need to block SELECTs dues to concurrent
> UPDATEs, multiple datafile versions could remove the need to block SELECTs
> due to concurrent ALTERs. But people seem to live with them quite well,
> judged from the amount of work put into getting rid of them (zero). I
> therefore fail to see why they should pose a significant problem in HS
> setups.
The difference is that in HS you have to wait for a moment where *no exclusive
lock at all* exist, possibly without contending for any of them, while on the
master you might not even blocked by the existence of any of those locks.

If you have two sessions which in overlapping transactions lock different
tables exlusively you have no problem shutting the master down, but you will
never reach a point where no exclusive lock is taken on the slave.

Andres

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 00:56:09
Message-ID:	AANLkTinWKJ4tfP_IoZs1iR3Bt1vyORftczsFlhTGtbWw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, May 9, 2010 at 6:58 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On Monday 10 May 2010 00:25:44 Florian Pflug wrote:
>> On May 9, 2010, at 22:01 , Robert Haas wrote:
>> > On Sun, May 9, 2010 at 3:09 PM, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
> wrote:
>> >> Florian Pflug <fgp(at)phlo(dot)org> writes:
>> >>> The only remaining option is to continue applying WAL until you reach
>> >>> a point where no locks are held, then pause. But from a user's POV
>> >>> that is nearly indistinguishable from simply setting
>> >>> hot_standby_conflict_winner to in the first place I think.
>> >>
>> >> Not really, the use case would be using the slave as a reporting server,
>> >> you know you have say 4 hours of reporting queries during which you will
>> >> pause the recovery. So it's ok for the pause command to take time.
>> >
>> > Seems like it could take FOREVER on a busy system. Surely that's not
>> > OK. The fact that Hot Standby has to take exclusive locks that can't
>> > be released until WAL replay has progressed to a certain point seems
>> > like a fairly serious wart.
>>
>> If this is a serious wart then it's not one of hot standby, but one of
>> postgres proper. AccessExclusiveLocks (SELECT-blocking locks that is, as
>> opposed to UPDATE/DELETE-blocking locks) are never necessary from a
>> correctness POV, they're only there for implementation reasons.
>>
>> Getting rid of them doesn't seem completely insurmountable either - just as
>> multiple row versions remove the need to block SELECTs dues to concurrent
>> UPDATEs, multiple datafile versions could remove the need to block SELECTs
>> due to concurrent ALTERs. But people seem to live with them quite well,
>> judged from the amount of work put into getting rid of them (zero). I
>> therefore fail to see why they should pose a significant problem in HS
>> setups.
> The difference is that in HS you have to wait for a moment where *no exclusive
> lock at all* exist, possibly without contending for any of them, while on the
> master you might not even blocked by the existence of any of those locks.
>
> If you have two sessions which in overlapping transactions lock different
> tables exlusively you have no problem shutting the master down, but you will
> never reach a point where no exclusive lock is taken on the slave.

A possible solution to this in the shutdown case is to kill anyone
waiting on a lock held by the startup process at the same time we kill
the startup process, and to kill anyone who subsequently waits for
such a lock as soon as they attempt to take it. I'm not sure if this
would also make sense in the pause case.

Another possible solution would be to try to figure out if there's a
way to delay application of WAL that requires the taking of AELs to
the point where we could apply it all at once. That might not be
feasible, though, or only in some cases, and it's certainly 9.1
material (at least) in any case.

Anyway, this is all a little off-topic. We need to get back to
arguing about how best to cut the legs out from under a feature that's
been in the tree for six months but Tom didn't get around to looking
at until last week. I'll restate my position: now that I understand
what the issues are (I think), the feature as currently implemented
seems pretty wonky, but cutting it down to a boolean seems like an
exercise in excessive pessimism about our ability to predict future
development directions, as well as possibly quite inconvenient for
people attempting to use Hot Standby. Therefore I think we should
adopt Tom's original proposal (with +1 also from Stephen Frost), but
that doesn't seem likely to fly because, on the one hand, we have Tom
himself arguing (along with Bruce and possibly Heikki) that we should
whack it down all the way to a boolean; and on the other hand Simon
and Greg Smith and I think also Andres Freund and Kevin Grittner
arguing that the original feature is OK as-is.

Other people who weighed in include Stefan Kaltenbrunner (who opined
that Tom had a legitimate complaint about the current design but
didn't vote for a specific resolution), Greg Sabino Mullane (who
pointed out that SOME of the issues that Tom raised could be solved
with proper time synchronization), Josh Drake (who thought requiring
NTP to be working was a bad idea, and therefore presumably favors
changing something), Josh Berkus (who changed his vote at least once
and whose priority seems to have to do with releasing before the turn
of the century than with the actual technical option we select,
apologies if I'm misreading his emails), Greg Stark (who seems to
think that a boolean will be bad news but didn't specifically vote for
another option), Dimitri Fontaine (who wants a boolean plus
pause/resume functions, or maybe a plugin facility of some kind), Rob
Wultsch (who doesn't ever want to kill queries and therefore would be
happy with a boolean), Yeb Havinga (who never wants to stall recovery
and therefore would also be happy with a boolean), and Florian Pflug
(who points out that pause/resume is actually a nontrivial feature).
Apologies if I've left anyone out or misrepresented their position.

Overall I would say opinion is about evenly split between:

- leave it as-is
- make it a Boolean
- change it in some way but to something more expressive than a Boolean

I can't presume to extract a consensus from that; I don't think there
is one. You could say "the majority of people want to change
something" and that would be true; you could also say "the majority of
people don't want a Boolean" and that would also be true.

IF we adopt "leave it as-is", then we need to document that you will
need to both run ntp and run some sort of heartbeat process on the
master to make sure that at least a small amount of WAL keeps getting
generated; or else you'll have massive query cancellations. IF we
decide to make it a Boolean, then we need to document that you have to
choose between the possibility of recovery falling arbitrarily behind
as a result of even one query holding an exclusive lock, or
alternatively instantaneously canceling queries that conflict, however
briefly, with replay. IF we adopt Tom's original proposal, then we'll
need to document that the timeout given is per-lock-wait, and
therefore if the lock timeout is not zero and there are many lock
waits the standby may fall far behind and have difficulty catching up.
IF we decide to do something else, then I don't know.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 06:27:44
Message-ID:	1273472865.3936.1954.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, 2010-05-09 at 20:56 -0400, Robert Haas wrote:

> >> > Seems like it could take FOREVER on a busy system. Surely that's not
> >> > OK. The fact that Hot Standby has to take exclusive locks that can't
> >> > be released until WAL replay has progressed to a certain point seems
> >> > like a fairly serious wart.
> >>
> >> If this is a serious wart then it's not one of hot standby, but one of
> >> postgres proper. AccessExclusiveLocks (SELECT-blocking locks that is, as
> >> opposed to UPDATE/DELETE-blocking locks) are never necessary from a
> >> correctness POV, they're only there for implementation reasons.
> >>
> >> Getting rid of them doesn't seem completely insurmountable either - just as
> >> multiple row versions remove the need to block SELECTs dues to concurrent
> >> UPDATEs, multiple datafile versions could remove the need to block SELECTs
> >> due to concurrent ALTERs. But people seem to live with them quite well,
> >> judged from the amount of work put into getting rid of them (zero). I
> >> therefore fail to see why they should pose a significant problem in HS
> >> setups.
> > The difference is that in HS you have to wait for a moment where *no exclusive
> > lock at all* exist, possibly without contending for any of them, while on the
> > master you might not even blocked by the existence of any of those locks.
> >
> > If you have two sessions which in overlapping transactions lock different
> > tables exlusively you have no problem shutting the master down, but you will
> > never reach a point where no exclusive lock is taken on the slave.
>
> A possible solution to this in the shutdown case is to kill anyone
> waiting on a lock held by the startup process at the same time we kill
> the startup process, and to kill anyone who subsequently waits for
> such a lock as soon as they attempt to take it.

I already explained that killing the startup process first is a bad idea
for many reasons when shutdown was discussed. Can't remember who added
the new standby shutdown code recently, but it sounds like their design
was pretty poor if it didn't include shutting down properly with HS. I
hope they fix the bug they have introduced. HS was never designed to
work that way, so there is no flaw there; it certainly worked when
committed.

> I'm not sure if this
> would also make sense in the pause case.

Not sure why pausing replay would make any difference at all. Being
between one WAL record and the next is a valid and normal state that
exists many thousands of times per second. If making that state longer
would cause problems we would already have seen any issues. There are
none, it will work fine.

> Another possible solution would be to try to figure out if there's a
> way to delay application of WAL that requires the taking of AELs to
> the point where we could apply it all at once. That might not be
> feasible, though, or only in some cases, and it's certainly 9.1
> material (at least) in any case.

Locks usually protect users from accessing a table while its being
clustered or dropped or something like that. Locks are not bad. They are
also used by some developers to specifically serialize access to an
object. AccessExclusiveLocks are rare in normal running and not to be
avoided when they do exist. HS correctly supports locking, as and when
such locks are made on the master.

--
Simon Riggs www.2ndQuadrant.com

From:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 08:01:57
Message-ID:	87mxw85gne.fsf@hi-media-techno.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sun, May 9, 2010 at 6:58 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> The difference is that in HS you have to wait for a moment where *no exclusive
>> lock at all* exist, possibly without contending for any of them, while on the
>> master you might not even blocked by the existence of any of those locks.
>>
>> If you have two sessions which in overlapping transactions lock different
>> tables exlusively you have no problem shutting the master down, but you will
>> never reach a point where no exclusive lock is taken on the slave.
>
> A possible solution to this in the shutdown case is to kill anyone
> waiting on a lock held by the startup process at the same time we kill
> the startup process, and to kill anyone who subsequently waits for
> such a lock as soon as they attempt to take it. I'm not sure if this
> would also make sense in the pause case.

Well, wait, I'm getting lost here. It seems to me that no query on the
slave is granted to take AEL, not matter what. The only case is a query
waiting for the replay to release its locks.

The only consequence of pause not waiting for any lock to get released
from the replay is that those backends will be, well, paused. But that
applies the same to any backend started after we pause.

Waiting for replay to release all its locks before to pause would mean
that there's a possibility that the activity on the master is such that
you never reach a pause in the WAL stream. Let's assume we want any new
code we throw in at this stage to be a magic wand making every use happy
at once.

So we'd need a pause function taking either 1 or 2 arguments, first is
to say we pause now even if we know the replay is holding some locks
that might pause the reporting queries too, the other is to wait until
the locks are not held anymore, with a timeout (default 1min?).

Ok, that's designing the API we're missing, and we should not be in the
process of doing any design at this stage. But we are.

> [good summary of current positions]
> I can't presume to extract a consensus from that; I don't think there
> is one.

All we know for sure is that Tom does not want to release as-is, and he
rightfully insists on several objectives as far as the editing is
concerned:
- no addition of code we might want to throw away later
- avoid having to deprecate released behavior, it's too hard
- minimal change set, possibly with no new features.

One more, pausing the replay is *already* in the code base, it's exactly
what happens under the hood if you favor queries rather than replay, to
the point I don't understand why the pause design needs to happen
now. We're only talking about having an *explicit* version of it.

Regards,
--
dim

I too am growing tired of insisting this much. I only continue because I
really can't get to understand why-o-why considering a new API over
existing feature is not possible at this stage. I'm hitting my head on
the wal, so to say…

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 09:43:33
Message-ID:	4BE7D545.1090003@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> On Sun, May 9, 2010 at 6:58 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> On Monday 10 May 2010 00:25:44 Florian Pflug wrote:
>>> On May 9, 2010, at 22:01 , Robert Haas wrote:
>>>> On Sun, May 9, 2010 at 3:09 PM, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
>> wrote:
>>>> Seems like it could take FOREVER on a busy system. Surely that's not
>>>> OK. The fact that Hot Standby has to take exclusive locks that can't
>>>> be released until WAL replay has progressed to a certain point seems
>>>> like a fairly serious wart.
>>> If this is a serious wart then it's not one of hot standby, but one of
>>> postgres proper. AccessExclusiveLocks (SELECT-blocking locks that is, as
>>> opposed to UPDATE/DELETE-blocking locks) are never necessary from a
>>> correctness POV, they're only there for implementation reasons.
>>>
>>> Getting rid of them doesn't seem completely insurmountable either - just as
>>> multiple row versions remove the need to block SELECTs dues to concurrent
>>> UPDATEs, multiple datafile versions could remove the need to block SELECTs
>>> due to concurrent ALTERs. But people seem to live with them quite well,
>>> judged from the amount of work put into getting rid of them (zero). I
>>> therefore fail to see why they should pose a significant problem in HS
>>> setups.
>> The difference is that in HS you have to wait for a moment where *no exclusive
>> lock at all* exist, possibly without contending for any of them, while on the
>> master you might not even blocked by the existence of any of those locks.
>>
>> If you have two sessions which in overlapping transactions lock different
>> tables exlusively you have no problem shutting the master down, but you will
>> never reach a point where no exclusive lock is taken on the slave.
>
> A possible solution to this in the shutdown case is to kill anyone
> waiting on a lock held by the startup process at the same time we kill
> the startup process, and to kill anyone who subsequently waits for
> such a lock as soon as they attempt to take it.

If you're not going to apply any more WAL records before shutdown, you
could also just release all the AccessExclusiveLocks held by the startup
process. Whatever the transaction was doing with the locked relation, if
we're not going to replay any more WAL records before shutdown, we will
not see the transaction committing or doing anything else with the
relation, so we should be safe. Whatever state the data on disk is in,
it must be valid, or we would have a problem with crash recovery
recovering up to this WAL record and then starting up too.

I'm not 100% clear if that reasoning applies to AccessExclusiveLocks
taken explicitly with LOCK TABLE. It's not clear what the application
would use the lock for.

Nevertheless, maybe killing the transactions that wait for the locks
would be more intuitive anyway.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 10:03:13
Message-ID:	4BE7D9E1.9090707@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> On Thu, May 6, 2010 at 2:47 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> Now that I've realized what the real problem is with max_standby_delay
>>> (namely, that inactivity on the master can use up the delay), I think
>>> we should do what Tom originally suggested here. It's not as good as
>>> a really working max_standby_delay, but we're not going to have that
>>> for 9.0, and it's clearly better than a boolean.
>> I guess I'm not clear on how what Tom proposed is fundamentally
>> different from max_standby_delay = -1. If there's enough concurrent
>> queries, recovery would never catch up.
>
> If your workload is that the standby server is getting pounded with
> queries like crazy, then it's probably not that different: it will
> fall progressively further behind. But I suspect many people will set
> up standby servers where most of the activity happens on the primary,
> but they run some reporting queries on the standby. If you expect
> your reporting queries to finish in <10s, you could set the max delay
> to say 60s. In the event that something gets wedged, recovery will
> eventually kill it and move on rather than just getting stuck forever.
> If the volume of queries is known not to be too high, it's reasonable
> to expect that a few good whacks will be enough to get things back on
> track.

Yeah, I could live with that.

A problem with using the name "max_standby_delay" for Tom's suggestion
is that it sounds like a hard limit, which it isn't. But if we name it
something like:

# -1 = no timeout
# 0 = kill conflicting queries immediately
# > 0 wait for N seconds, then kill query
standby_conflict_timeout = -1

it's more clear that the setting is a timeout for each *conflict*, and
it's less surprising that the standby can fall indefinitely behind in
the worst case. If we name the setting along those lines, I could live
with that.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 10:13:48
Message-ID:	3700B58D-452D-458D-983D-D68B7A91DA97@phlo.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On May 10, 2010, at 11:43 , Heikki Linnakangas wrote:
> If you're not going to apply any more WAL records before shutdown, you
> could also just release all the AccessExclusiveLocks held by the startup
> process. Whatever the transaction was doing with the locked relation, if
> we're not going to replay any more WAL records before shutdown, we will
> not see the transaction committing or doing anything else with the
> relation, so we should be safe. Whatever state the data on disk is in,
> it must be valid, or we would have a problem with crash recovery
> recovering up to this WAL record and then starting up too.

Sounds plausible. But wouldn't this imply that HS could *always* postpone the acquisition of an AccessExclusiveLocks until right before the corresponding commit record is replayed? If fail to see a case where this would fail, yet recovery in case of an intermediate crash would be correct.

best regards,
Florian Pflug

From:	Aidan Van Dyk <aidan(at)highrise(dot)ca>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 11:51:47
Message-ID:	20100510115147.GA14663@oak.highrise.ca
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

* Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> [100510 06:03]:

> A problem with using the name "max_standby_delay" for Tom's suggestion
> is that it sounds like a hard limit, which it isn't. But if we name it
> something like:

I'ld still rather an "if your killing something, make sure you kill
enough to get all the way current" behaviour, but that's just me....

I'm want to run my standbys in a always current mode... But if I decide
to play with a lagged HR, I really want to make sure there is some
mechanism to cap the lag, and the "cap" is something I can understand
and use to make a reasonable estimate as to when data I know is live on
the primary will be seen on the standby...

bonus points if it works similarly for archive recovery ;-)

--
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 11:53:52
Message-ID:	AANLkTimWExb70BkMZWAUi0MpLiSHKRy3UkSkrajM1z6a@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, May 10, 2010 at 2:27 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> I already explained that killing the startup process first is a bad idea
> for many reasons when shutdown was discussed. Can't remember who added
> the new standby shutdown code recently, but it sounds like their design
> was pretty poor if it didn't include shutting down properly with HS. I
> hope they fix the bug they have introduced. HS was never designed to
> work that way, so there is no flaw there; it certainly worked when
> committed.

The patch was written by Fujii Masao and committed, after review, by
me. Prior to that patch, smart shutdown never worked; now it works,
or so I believe, unless recovery is stalled holding a lock upon which
a regular back-end is blocking. Clearly that is both better and not
all that good. If you have any ideas to improve the situation
further, I'm all ears.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 11:55:32
Message-ID:	AANLkTimo0PCuQa_L5ML-OHXr44F_09KQglGRqTgtqWnZ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, May 10, 2010 at 6:03 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Yeah, I could live with that.
>
> A problem with using the name "max_standby_delay" for Tom's suggestion
> is that it sounds like a hard limit, which it isn't. But if we name it
> something like:
>
> # -1 = no timeout
> # 0 = kill conflicting queries immediately
> # > 0 wait for N seconds, then kill query
> standby_conflict_timeout = -1
>
> it's more clear that the setting is a timeout for each *conflict*, and
> it's less surprising that the standby can fall indefinitely behind in
> the worst case. If we name the setting along those lines, I could live
> with that.

Yeah, if we do it that way, +1 for changing the name, and your
suggestion seems good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Florian Pflug <fgp(at)phlo(dot)org>
Cc:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 11:57:23
Message-ID:	AANLkTikEFWm3YgU0PXjwuAP8opIBEjIe2X6iMKPRW8DE@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, May 10, 2010 at 6:13 AM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
> On May 10, 2010, at 11:43 , Heikki Linnakangas wrote:
>> If you're not going to apply any more WAL records before shutdown, you
>> could also just release all the AccessExclusiveLocks held by the startup
>> process. Whatever the transaction was doing with the locked relation, if
>> we're not going to replay any more WAL records before shutdown, we will
>> not see the transaction committing or doing anything else with the
>> relation, so we should be safe. Whatever state the data on disk is in,
>> it must be valid, or we would have a problem with crash recovery
>> recovering up to this WAL record and then starting up too.
>
> Sounds plausible. But wouldn't this imply that HS could *always* postpone the acquisition of an AccessExclusiveLocks until right before the corresponding commit record is replayed? If fail to see a case where this would fail, yet recovery in case of an intermediate crash would be correct.

Yeah, I'd like to understand this, too. I don't have a clear
understanding of when HS needs to take locks here in the first place.

[removing Josh Berkus's persistently bouncing email from the CC line]

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Florian Pflug <fgp(at)phlo(dot)org>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 12:00:45
Message-ID:	4BE7F56D.1020305@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Florian Pflug wrote:
> On May 10, 2010, at 11:43 , Heikki Linnakangas wrote:
>> If you're not going to apply any more WAL records before shutdown, you
>> could also just release all the AccessExclusiveLocks held by the startup
>> process. Whatever the transaction was doing with the locked relation, if
>> we're not going to replay any more WAL records before shutdown, we will
>> not see the transaction committing or doing anything else with the
>> relation, so we should be safe. Whatever state the data on disk is in,
>> it must be valid, or we would have a problem with crash recovery
>> recovering up to this WAL record and then starting up too.
>
> Sounds plausible. But wouldn't this imply that HS could *always* postpone the acquisition of an AccessExclusiveLocks until right before the corresponding commit record is replayed? If fail to see a case where this would fail, yet recovery in case of an intermediate crash would be correct.

I guess it could in some situations, but for example the
AccessExclusiveLock taken at the end of lazy vacuum to truncate the
relation must be held during the truncation, or concurrent readers will
get upset.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Florian Pflug <fgp(at)phlo(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 12:11:56
Message-ID:	201005101411.57216.andres@anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Monday 10 May 2010 14:00:45 Heikki Linnakangas wrote:
> Florian Pflug wrote:
> > On May 10, 2010, at 11:43 , Heikki Linnakangas wrote:
> >> If you're not going to apply any more WAL records before shutdown, you
> >> could also just release all the AccessExclusiveLocks held by the startup
> >> process. Whatever the transaction was doing with the locked relation, if
> >> we're not going to replay any more WAL records before shutdown, we will
> >> not see the transaction committing or doing anything else with the
> >> relation, so we should be safe. Whatever state the data on disk is in,
> >> it must be valid, or we would have a problem with crash recovery
> >> recovering up to this WAL record and then starting up too.
> >
> > Sounds plausible. But wouldn't this imply that HS could *always* postpone
> > the acquisition of an AccessExclusiveLocks until right before the
> > corresponding commit record is replayed? If fail to see a case where
> > this would fail, yet recovery in case of an intermediate crash would be
> > correct.
>
> I guess it could in some situations, but for example the
> AccessExclusiveLock taken at the end of lazy vacuum to truncate the
> relation must be held during the truncation, or concurrent readers will
> get upset.
Actually all the locks that do not need to be taken on the slave would not
need to be an ACCESS EXCLUSIVE but a EXCLUSIVE on the master, right? That
should be "fixed" on the master, not hacked up on the slave and is by far out
of scope of 9.0.
Thats an area where I definitely would like to improve pg in the future...

Andres

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndQuadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 14:57:11
Message-ID:	201005101457.o4AEvBm24537@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs wrote:
> Bruce has used the word crippleware for the current state. Raising a
> problem and then blocking solutions is the best way I know to cripple a
> release. It should be clear that I've done my best to avoid this

FYI, it was Robert Haas who used the term "crippleware" to describe a
boolean value for max_standby_delay, and I was just repeating his term,
and disputing it would be crippleware.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 15:07:35
Message-ID:	201005101507.o4AF7ZS25912@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> Wultsch (who doesn't ever want to kill queries and therefore would be
> happy with a boolean), Yeb Havinga (who never wants to stall recovery
> and therefore would also be happy with a boolean), and Florian Pflug
> (who points out that pause/resume is actually a nontrivial feature).
> Apologies if I've left anyone out or misrepresented their position.
>
> Overall I would say opinion is about evenly split between:
>
> - leave it as-is
> - make it a Boolean
> - change it in some way but to something more expressive than a Boolean
>
> I can't presume to extract a consensus from that; I don't think there
> is one. You could say "the majority of people want to change
> something" and that would be true; you could also say "the majority of
> people don't want a Boolean" and that would also be true.

Yep, this is where we are. Discussion had stopped, so it seemed like
time for a decision, and with no one agreeing on what to do, feature
removal seemed like the best approach. Suggesting we will fix it later
in beta is not a solution.

Now, if everyone agrees we should do X, and X in simple, lets do X, but
I am stil not seeing that.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Mike Rylander <mrylander(at)gmail(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 15:22:20
Message-ID:	AANLkTinkWxND3Dgc5_8mOs5pedTtMSk003oZWqcFBUUk@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, May 10, 2010 at 6:03 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Robert Haas wrote:
>> On Thu, May 6, 2010 at 2:47 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>>> Now that I've realized what the real problem is with max_standby_delay
>>>> (namely, that inactivity on the master can use up the delay), I think
>>>> we should do what Tom originally suggested here. It's not as good as
>>>> a really working max_standby_delay, but we're not going to have that
>>>> for 9.0, and it's clearly better than a boolean.
>>> I guess I'm not clear on how what Tom proposed is fundamentally
>>> different from max_standby_delay = -1. If there's enough concurrent
>>> queries, recovery would never catch up.
>>
>> If your workload is that the standby server is getting pounded with
>> queries like crazy, then it's probably not that different: it will
>> fall progressively further behind. But I suspect many people will set
>> up standby servers where most of the activity happens on the primary,
>> but they run some reporting queries on the standby. If you expect
>> your reporting queries to finish in <10s, you could set the max delay
>> to say 60s. In the event that something gets wedged, recovery will
>> eventually kill it and move on rather than just getting stuck forever.
>> If the volume of queries is known not to be too high, it's reasonable
>> to expect that a few good whacks will be enough to get things back on
>> track.
>
> Yeah, I could live with that.
>
> A problem with using the name "max_standby_delay" for Tom's suggestion
> is that it sounds like a hard limit, which it isn't. But if we name it
> something like:
>
> # -1 = no timeout
> # 0 = kill conflicting queries immediately
> # > 0 wait for N seconds, then kill query
> standby_conflict_timeout = -1
>
> it's more clear that the setting is a timeout for each *conflict*, and
> it's less surprising that the standby can fall indefinitely behind in
> the worst case. If we name the setting along those lines, I could live
> with that.

+1 from the peanut gallery.

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"Robert Haas" <robertmhaas(at)gmail(dot)com>, "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc:	"Greg Smith" <greg(at)2ndquadrant(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, "Andres Freund" <andres(at)anarazel(dot)de>, "Dimitri Fontaine" <dfontaine(at)hi-media(dot)com>, "Florian Pflug" <fgp(at)phlo(dot)org>, <pgsql-hackers(at)postgresql(dot)org>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 15:39:00
Message-ID:	4BE7E24402000025000314AB@gw.wicourts.gov
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Robert Haas wrote:

>> Overall I would say opinion is about evenly split between:
>>
>> - leave it as-is
>> - make it a Boolean
>> - change it in some way but to something more expressive than a
>> Boolean

I think a boolean would limit the environments in which HS would be
useful. Personally, I think how far the replica is behind the
source is a more useful metric, even with anomalies on the
transition from idle to active; but a blocking duration would be
much better than no finer control than the boolean. So my "instant
runoff second choice" would be for the block duration knob.

> time for a decision, and with no one agreeing on what to do,
> feature removal seemed like the best approach.

I keep wondering at the assertion that once a GUC is present
(especially a tuning GUC like this) that we're stuck with it. I
know that's true of SQL code constructs, but postgresql.conf files?
How about redirect_stderr, max_fsm_*, sort_mem, etc.? This argument
seems tenuous.

> Suggesting we will fix it later in beta is not a solution.

I'm with you there, 100%

-Kevin

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 16:01:00
Message-ID:	20100510160100.GL21875@tamriel.snowman.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

* Aidan Van Dyk (aidan(at)highrise(dot)ca) wrote:
> * Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> [100510 06:03]:
> > A problem with using the name "max_standby_delay" for Tom's suggestion
> > is that it sounds like a hard limit, which it isn't. But if we name it
> > something like:
>
> I'ld still rather an "if your killing something, make sure you kill
> enough to get all the way current" behaviour, but that's just me....

I agree with that comment, and it's more like what max_standby_delay
was. That's what I had thought Tom was proposing initially,
since it makes a heck of alot more sense to me than "just keep
waiting, just keep waiting..".

Now, if it's possible to have things queue up behind the recovery
process, such that the recovery process will only wait up to
timeout * # of locks held when recovery started, that might be alright,
but that's not the impression I've gotten about how this will work.

Of course, I also want to be able to have a Nagios hook that checks how
far behind the slave has gotten, and a way to tell the slave "oook,
you're too far behind, just forcibly catch up right *now*". If I could
use reload to change max_standby_delay (or whatever) and I can figure
out how long the delay is (even if I have to update a table on the
master and then see what it says on the slave..), I'd be happy.

That being said, I do think it makes more sense to wait until we've got
a conflict to start the timer, and I rather like avoiding the
uncertainty of time sync between master and slave by using WAL arrival
time on the slave.

Thanks,

Stephen

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 16:20:11
Message-ID:	201005101620.o4AGKBD03816@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kevin Grittner wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > Robert Haas wrote:
>
> >> Overall I would say opinion is about evenly split between:
> >>
> >> - leave it as-is
> >> - make it a Boolean
> >> - change it in some way but to something more expressive than a
> >> Boolean
>
> I think a boolean would limit the environments in which HS would be
> useful. Personally, I think how far the replica is behind the
> source is a more useful metric, even with anomalies on the
> transition from idle to active; but a blocking duration would be
> much better than no finer control than the boolean. So my "instant
> runoff second choice" would be for the block duration knob.
>
> > time for a decision, and with no one agreeing on what to do,
> > feature removal seemed like the best approach.
>
> I keep wondering at the assertion that once a GUC is present
> (especially a tuning GUC like this) that we're stuck with it. I
> know that's true of SQL code constructs, but postgresql.conf files?
> How about redirect_stderr, max_fsm_*, sort_mem, etc.? This argument
> seems tenuous.

You are right that we are much more flexible about changing
administrative configuration parameters (like this one) than SQL. In the
past, we even renamed logging parameters to be more consistent, and I
think that proves the bar is quite low for GUC administrative parameter
change. :-)

The concern about 'max_standby_delay' is that it controls a lot of new
code and affects the behavior of HS/SR in ways that might cause a poor
user experience, expecially for non-expert users. I admit that expert
users can use the setting, but we are coding for a general user base,
and we might have to field many questions about 'max_standby_delay' from
general users that will make us look bad. "The setting is total
useless" is something we have heard about other partial solutions we
have released in the past. We try to avoid that. ;-) Labeling
something "experimental" also makes our code look sloppy. And if we
decide the problem is unsolvable using this approach, we should remove
it now rather than later. We don't like to carry around a wart for a
small segment of our userbase.

I realize many of you have not been around to see some of our
less-than-perfect solutions and to see the pain they cause. Once
something gets it, we have to fight to remove it. In fact, there is no
way we would add 'max_standby_delay' into our codebase now, knowing its
limitations, but people are having to fight hard for its removal, if
necessary.

Now that discussion has restarted again, let's keep going to see if can
reach some kind of simple solution.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 16:44:44
Message-ID:	g2o407d949e1005100944gd1e7b312xd0dce6dd23805ab3@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, May 10, 2010 at 5:20 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> You are right that we are much more flexible about changing
> administrative configuration parameters (like this one) than SQL. In the
> past, we even renamed logging parameters to be more consistent, and I
> think that proves the bar is quite low for GUC administrative parameter
> change. :-)
>
> The concern about 'max_standby_delay' is that it controls a lot of new
> code and affects the behavior of HS/SR in ways that might cause a poor
> user experience, expecially for non-expert users.

I would like to propose that we do the following:

1) Replace max_standby_delay with a boolean as per heikki's suggestion

2) Add an explicitly experimental option like max_standby_delay or
recovery_conflict_timeout which is only effective if you've chosen
recovery_conflict="pause recovery"
option and is explicitly documented as being scheduled to be replaced
with a more complete system in future versions.

My thinking is that when we do replace max_standby_delay we would keep
the recovery_conflict parameter with the same semantics. It's just the
additional experimental option which would change.

--
greg

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 20:42:21
Message-ID:	4BE86FAD.4040703@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

> 1) Replace max_standby_delay with a boolean as per heikki's suggestion
>
> 2) Add an explicitly experimental option like max_standby_delay or
> recovery_conflict_timeout which is only effective if you've chosen
> recovery_conflict="pause recovery"
> option and is explicitly documented as being scheduled to be replaced
> with a more complete system in future versions.

As far as I can tell, the current delay *works*. It just doesn't
necessarily work the way most people expect it to to work. Kind of
like, hmmm, shared_buffers? Or effective_cache_size? Or
effective_io_concurrency?

And I still think that having this kind of a delay option will give us
invaluable use feedback on how the option *should* work in 9.1, which we
won't get if we don't have an option. I think we will be overhauling it
for 9.1, but I don't think that overhaul will benefit from a lack of data.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc:	"Robert Haas" <robertmhaas(at)gmail(dot)com>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Greg Smith" <greg(at)2ndquadrant(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, "Andres Freund" <andres(at)anarazel(dot)de>, "Dimitri Fontaine" <dfontaine(at)hi-media(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-10 22:43:24
Message-ID:	0C2FA100-EBA0-414A-AC83-35BC8D2154F4@phlo.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On May 10, 2010, at 17:39 , Kevin Grittner wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> Robert Haas wrote:
>
>>> Overall I would say opinion is about evenly split between:
>>>
>>> - leave it as-is
>>> - make it a Boolean
>>> - change it in some way but to something more expressive than a
>>> Boolean
>
> I think a boolean would limit the environments in which HS would be
> useful. Personally, I think how far the replica is behind the
> source is a more useful metric, even with anomalies on the
> transition from idle to active; but a blocking duration would be
> much better than no finer control than the boolean. So my "instant
> runoff second choice" would be for the block duration knob.

You could always toggle that boolean automatically, based on some measurement of the replication lag (Assuming the boolean would be settable at runtime). That'd give you much more flexibility than any built-on knob could provide, and even more so than a built-in knob with known deficiencies.

My preference is hence to make it a boolean, but in a way that allows more advanced behavior to be implemented on top of it. In the simplest case by allowing the boolean to be flipped at runtime and ensuring that the system reacts in a sane way.

best regards,
Florian Pflug

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-11 05:01:23
Message-ID:	AANLkTiktHw60jeEF9rJ2jdc3y9GEOGPWTkkahc_jlUfZ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, May 10, 2010 at 3:27 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> I already explained that killing the startup process first is a bad idea
> for many reasons when shutdown was discussed. Can't remember who added
> the new standby shutdown code recently, but it sounds like their design
> was pretty poor if it didn't include shutting down properly with HS. I
> hope they fix the bug they have introduced. HS was never designed to
> work that way, so there is no flaw there; it certainly worked when
> committed.

New smart shutdown during recovery doesn't kill the startup process until
all of the read only backends have gone away. So it works fine with HS.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 06:50:57
Message-ID:	1273647057.308.306.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, 2010-05-11 at 14:01 +0900, Fujii Masao wrote:
> On Mon, May 10, 2010 at 3:27 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > I already explained that killing the startup process first is a bad idea
> > for many reasons when shutdown was discussed. Can't remember who added
> > the new standby shutdown code recently, but it sounds like their design
> > was pretty poor if it didn't include shutting down properly with HS. I
> > hope they fix the bug they have introduced. HS was never designed to
> > work that way, so there is no flaw there; it certainly worked when
> > committed.
>
> New smart shutdown during recovery doesn't kill the startup process until
> all of the read only backends have gone away. So it works fine with HS.

Yes, I thought some more about what Robert said. HS works identically to
normal running in this regard, so there's no hint of a bug or design
flaw on that for either of us to worry about.

--
Simon Riggs www.2ndQuadrant.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 11:10:22
Message-ID:	AANLkTik4tlswLtbfEorxF3V4WCwrdOnSY0wgB-HK8_9T@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 12, 2010 at 2:50 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Tue, 2010-05-11 at 14:01 +0900, Fujii Masao wrote:
>> On Mon, May 10, 2010 at 3:27 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> > I already explained that killing the startup process first is a bad idea
>> > for many reasons when shutdown was discussed. Can't remember who added
>> > the new standby shutdown code recently, but it sounds like their design
>> > was pretty poor if it didn't include shutting down properly with HS. I
>> > hope they fix the bug they have introduced. HS was never designed to
>> > work that way, so there is no flaw there; it certainly worked when
>> > committed.
>>
>> New smart shutdown during recovery doesn't kill the startup process until
>> all of the read only backends have gone away. So it works fine with HS.
>
> Yes, I thought some more about what Robert said. HS works identically to
> normal running in this regard, so there's no hint of a bug or design
> flaw on that for either of us to worry about.

I'm not sure what to make of this. Sometimes not shutting down
doesn't sound like a feature to me.

http://archives.postgresql.org/pgsql-hackers/2010-05/msg00098.php
http://archives.postgresql.org/pgsql-hackers/2010-05/msg00103.php

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 11:26:31
Message-ID:	1273663591.308.441.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-12 at 07:10 -0400, Robert Haas wrote:

> I'm not sure what to make of this. Sometimes not shutting down
> doesn't sound like a feature to me.

It acts exactly the same in recovery as in normal running. It is not a
special feature of recovery at all, bug or otherwise.

You may think its a strange feature generally and I would agree. I would
welcome you changing that in 9.1+, as long as your change works in both
recovery and normal running.

--
Simon Riggs www.2ndQuadrant.com

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 11:33:58
Message-ID:	AANLkTimhpTNCSglw6D889Wy9-MRseY9UmloWkZlecHMF@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 12, 2010 at 12:26 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Wed, 2010-05-12 at 07:10 -0400, Robert Haas wrote:
>
>> I'm not sure what to make of this. Sometimes not shutting down
>> doesn't sound like a feature to me.
>
> It acts exactly the same in recovery as in normal running. It is not a
> special feature of recovery at all, bug or otherwise.

I admit I've sometimes been surprised that smart shutdown was waiting
when I didn't expect it to.

It would be good to give the shutdown more feedback. If it explicitly
shows "Waiting for n sessions with active transactions to commit" or
"Waiting for n sessions to disconnect" then the user would at least
understand why it was waiting and what would be necessary to get it to
continue.

--
greg

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 12:52:15
Message-ID:	AANLkTilAGEimrC9uof7e92CUBgZCYeVJqSmw3bTZb9qX@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 12, 2010 at 7:26 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Wed, 2010-05-12 at 07:10 -0400, Robert Haas wrote:
>
>> I'm not sure what to make of this. Sometimes not shutting down
>> doesn't sound like a feature to me.
>
> It acts exactly the same in recovery as in normal running. It is not a
> special feature of recovery at all, bug or otherwise.

Simon, that doesn't make any sense. We are talking about a backend
getting stuck forever on an exclusive lock that is held by the startup
process and which will never be released (for example, because the
master has shut down and no more WAL can be obtained for replay). The
startup process does not hold locks in normal operation.

There are other things we might want to change about the shutdown
behavior (for example, switching from smart to fast automatically
after N seconds) which could apply to both the primary and the standby
and which might also be workarounds for this problem, but this
particular issue is specific to Hot Standby mode and pretending
otherwise is just sticking your head in the sand.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 13:15:11
Message-ID:	1273670111.308.556.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-12 at 08:52 -0400, Robert Haas wrote:
> On Wed, May 12, 2010 at 7:26 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > On Wed, 2010-05-12 at 07:10 -0400, Robert Haas wrote:
> >
> >> I'm not sure what to make of this. Sometimes not shutting down
> >> doesn't sound like a feature to me.
> >
> > It acts exactly the same in recovery as in normal running. It is not a
> > special feature of recovery at all, bug or otherwise.
>
> Simon, that doesn't make any sense. We are talking about a backend
> getting stuck forever on an exclusive lock that is held by the startup
> process and which will never be released (for example, because the
> master has shut down and no more WAL can be obtained for replay). The
> startup process does not hold locks in normal operation.

When I test it, startup process holding a lock does not prevent shutdown
of a standby.

I'd be happy to see your test case showing a bug exists and that the
behaviour differs from normal running.

--
Simon Riggs www.2ndQuadrant.com

From:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 14:03:31
Message-ID:	4BEAB533.7060302@kaltenbrunner.cc
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs wrote:
> On Wed, 2010-05-12 at 08:52 -0400, Robert Haas wrote:
>> On Wed, May 12, 2010 at 7:26 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> On Wed, 2010-05-12 at 07:10 -0400, Robert Haas wrote:
>>>
>>>> I'm not sure what to make of this. Sometimes not shutting down
>>>> doesn't sound like a feature to me.
>>> It acts exactly the same in recovery as in normal running. It is not a
>>> special feature of recovery at all, bug or otherwise.
>> Simon, that doesn't make any sense. We are talking about a backend
>> getting stuck forever on an exclusive lock that is held by the startup
>> process and which will never be released (for example, because the
>> master has shut down and no more WAL can be obtained for replay). The
>> startup process does not hold locks in normal operation.
>
> When I test it, startup process holding a lock does not prevent shutdown
> of a standby.
>
> I'd be happy to see your test case showing a bug exists and that the
> behaviour differs from normal running.

In my testing the postmaster simply does not shut down even with no
clients connected any more once in a while - most of the time it works
just fine but in like 1 out of 10 cases it get's stuck - my testcase (as
detailed in the related thread) is simply doing an interval load on the
master (pgbench -T 120 && sleep 30 && pgbench -T 120 - rinse and repeat
as needed) and pgbench -S && pg_ctl restart && pgbench -S in a lop on
the standby. once in a while the standby will simply not shut down
(forever - not only by eceeding the default timeout of pgctl which seems
to get triggered much more often on the standby than on the master -
have not looked into that yet in detail)

Stefan

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 14:40:19
Message-ID:	1273675219.308.737.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-12 at 16:03 +0200, Stefan Kaltenbrunner wrote:
> Simon Riggs wrote:
> > On Wed, 2010-05-12 at 08:52 -0400, Robert Haas wrote:
> >> On Wed, May 12, 2010 at 7:26 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >>> On Wed, 2010-05-12 at 07:10 -0400, Robert Haas wrote:
> >>>
> >>>> I'm not sure what to make of this. Sometimes not shutting down
> >>>> doesn't sound like a feature to me.
> >>> It acts exactly the same in recovery as in normal running. It is not a
> >>> special feature of recovery at all, bug or otherwise.
> >> Simon, that doesn't make any sense. We are talking about a backend
> >> getting stuck forever on an exclusive lock that is held by the startup
> >> process and which will never be released (for example, because the
> >> master has shut down and no more WAL can be obtained for replay). The
> >> startup process does not hold locks in normal operation.
> >
> > When I test it, startup process holding a lock does not prevent shutdown
> > of a standby.
> >
> > I'd be happy to see your test case showing a bug exists and that the
> > behaviour differs from normal running.
>
> In my testing the postmaster simply does not shut down even with no
> clients connected any more once in a while - most of the time it works
> just fine but in like 1 out of 10 cases it get's stuck - my testcase (as
> detailed in the related thread) is simply doing an interval load on the
> master (pgbench -T 120 && sleep 30 && pgbench -T 120 - rinse and repeat
> as needed) and pgbench -S && pg_ctl restart && pgbench -S in a lop on
> the standby. once in a while the standby will simply not shut down
> (forever - not only by eceeding the default timeout of pgctl which seems
> to get triggered much more often on the standby than on the master -
> have not looked into that yet in detail)

If you could recreate that on a server in debug mode we can see what's
happening. If you can attach to the server and get a back trace that
would help. I've not seen that behaviour at all during testing and if
the issue is sporadic its not likely to help much trying to recreate
myself.

This could be an issue with SR, or an issue with the shutdown code
itself.

--
Simon Riggs www.2ndQuadrant.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 15:28:13
Message-ID:	1273678093.308.856.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-12 at 14:18 +0100, Simon Riggs wrote:
> On Wed, 2010-05-12 at 08:52 -0400, Robert Haas wrote:
> > On Wed, May 12, 2010 at 7:26 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > > On Wed, 2010-05-12 at 07:10 -0400, Robert Haas wrote:
> > >
> > >> I'm not sure what to make of this. Sometimes not shutting down
> > >> doesn't sound like a feature to me.
> > >
> > > It acts exactly the same in recovery as in normal running. It is not a
> > > special feature of recovery at all, bug or otherwise.
> >
> > Simon, that doesn't make any sense. We are talking about a backend
> > getting stuck forever on an exclusive lock that is held by the startup
> > process and which will never be released (for example, because the
> > master has shut down and no more WAL can be obtained for replay). The
> > startup process does not hold locks in normal operation.
>
> When I test it, startup process holding a lock does not prevent shutdown
> of a standby.
>
> I'd be happy to see your test case showing a bug exists and that the
> behaviour differs from normal running.

Let me put this differently: I accept that Stefan has reported a
problem. Neither Tom nor myself can reproduce the problem. I've re-run
Stefan's test case and restarted the server more than 400 times now
without any issue.

I re-read your post where you gave what you yourself called "uninformed
speculation". There's no real polite way to say it, but yes your
speculation does appear to be uninformed, since it is incorrect. Reasons
would be not least that Stefan's tests don't actually send any locks to
the standby anyway (!), but even if they did your speculation as to the
cause is still all wrong, as explained.

There is no evidence to link this behaviour with HS, as yet, and you
should be considering the possibility the problem lies elsewhere,
especially since it could be code you committed that is at fault.

--
Simon Riggs www.2ndQuadrant.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 16:04:20
Message-ID:	AANLkTinRrbhlHd3f__lg0S8_2AbKWye1Q19Ff7T4udtx@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 12, 2010 at 11:28 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Wed, 2010-05-12 at 14:18 +0100, Simon Riggs wrote:
>> On Wed, 2010-05-12 at 08:52 -0400, Robert Haas wrote:
>> > On Wed, May 12, 2010 at 7:26 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> > > On Wed, 2010-05-12 at 07:10 -0400, Robert Haas wrote:
>> > >
>> > >> I'm not sure what to make of this. Sometimes not shutting down
>> > >> doesn't sound like a feature to me.
>> > >
>> > > It acts exactly the same in recovery as in normal running. It is not a
>> > > special feature of recovery at all, bug or otherwise.
>> >
>> > Simon, that doesn't make any sense. We are talking about a backend
>> > getting stuck forever on an exclusive lock that is held by the startup
>> > process and which will never be released (for example, because the
>> > master has shut down and no more WAL can be obtained for replay). The
>> > startup process does not hold locks in normal operation.
>>
>> When I test it, startup process holding a lock does not prevent shutdown
>> of a standby.
>>
>> I'd be happy to see your test case showing a bug exists and that the
>> behaviour differs from normal running.
>
> Let me put this differently: I accept that Stefan has reported a
> problem. Neither Tom nor myself can reproduce the problem. I've re-run
> Stefan's test case and restarted the server more than 400 times now
> without any issue.

OK, I'm glad to hear you've been testing this. I wasn't aware of that.

> I re-read your post where you gave what you yourself called "uninformed
> speculation". There's no real polite way to say it, but yes your
> speculation does appear to be uninformed, since it is incorrect. Reasons
> would be not least that Stefan's tests don't actually send any locks to
> the standby anyway (!),

Hmm. Well, assuming you're correct, that does seem to be a, uh,
slight problem with my theory.

> but even if they did your speculation as to the
> cause is still all wrong, as explained.

You lost me. I don't understand why the problem that I'm referring to
couldn't happen, even if it's not what's happening here.

> There is no evidence to link this behaviour with HS, as yet, and you
> should be considering the possibility the problem lies elsewhere,
> especially since it could be code you committed that is at fault.

Huh?? The evidence that this bug is linked with HS is that it occurs
on a server running in HS mode, and not otherwise. As for whether the
bug is code I committed, that's certainly possible, but keep in mind
it didn't work at all before IN HOT STANDBY MODE - and that will be
code you committed.

I'm going to go test this and see if I can figure out what's going on.
I hope you will keep at it also - as you point out, your knowledge of
this code far exceeds mine.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 16:49:13
Message-ID:	1273682953.308.905.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-12 at 12:04 -0400, Robert Haas wrote:

> Huh?? The evidence that this bug is linked with HS is that it occurs
> on a server running in HS mode, and not otherwise. As for whether the
> bug is code I committed, that's certainly possible, but keep in mind
> it didn't work at all before IN HOT STANDBY MODE - and that will be
> code you committed.

I'll say it now, so its plain. I'm not going to investigate every bug
that occurs on Postgres, just because someone was in HS when they found
it. Any more than all bugs on Postgres in normal running are MVCC bugs.
There needs to be reasonable evidence or a conjecture by someone that
knows something about the code. If HS were the only thing changed in
recovery in this release, that might not seem reasonable, but since we
have much new code and I am not the only developer, it is.

Normal shutdown didn't work on a standby before HS was committed and it
didn't work afterwards either. Use all the capitals you like but if you
use poor arguments and combine that with no evidence then we'll not get
very far, either in working together or in solving the actual bugs.
Please don't continue to make wild speculations about things related to
HS and recovery, so that issues do not become confused; there is no need
to comment on every thread.

--
Simon Riggs www.2ndQuadrant.com

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 17:05:09
Message-ID:	AANLkTikzsjTpySMhv8RNMfcfLzNlqD7OXjrqWvNKwu1k@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 12, 2010 at 5:49 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Wed, 2010-05-12 at 12:04 -0400, Robert Haas wrote:
>
>> Huh?? The evidence that this bug is linked with HS is that it occurs
>> on a server running in HS mode, and not otherwise. As for whether the
>> bug is code I committed, that's certainly possible, but keep in mind
>> it didn't work at all before IN HOT STANDBY MODE - and that will be
>> code you committed.
>
> I'll say it now, so its plain. I'm not going to investigate every bug
> that occurs on Postgres, just because someone was in HS when they found
> it.

Fair enough, though your help debugging is always appreciated
regardless of whether a problem is HS related or not. Nobody's
obligated to work on anything in Postgres after all.

I'm not sure who to blame for the shouting match over whose commit
introduced the bug -- it doesn't seem like a relevant or useful thing
to argue about, please both stop.

> there is no need
> to comment on every thread.

This is out of line.

--
greg

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Greg Stark <gsstark(at)mit(dot)edu>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 17:21:15
Message-ID:	1273684875.308.917.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-12 at 18:05 +0100, Greg Stark wrote:

> I'm not sure who to blame for the shouting match over whose commit
> introduced the bug -- it doesn't seem like a relevant or useful thing
> to argue about, please both stop.

I haven't blamed Robert's code, merely asked him to consider that it is
something other HS, since we have no evidence either way at present
because the issue is sporadic and has not been replicated as yet, with
no specific detail leading to any section of code.

> > there is no need
> > to comment on every thread.
>
> This is out of line.

Quoted out of context, it is. My full comment is "Please don't continue
to make wild speculations about things related to HS and recovery, so
that issues do not become confused; there is no need to comment on every
thread." ... by which I mean threads related to HS and recovery. I
respect everybody's right to free speech here, but I would say the same
to anyone if they do it repeatedly. I'm not the first to make such a
comment on hackers either.

--
Simon Riggs www.2ndQuadrant.com

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 18:01:51
Message-ID:	1273687311.8624.2908.camel@jd-desktop.unknown.charter.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-12 at 17:49 +0100, Simon Riggs wrote:
> On Wed, 2010-05-12 at 12:04 -0400, Robert Haas wrote:

> Normal shutdown didn't work on a standby before HS was committed and it
> didn't work afterwards either. Use all the capitals you like but if you
> use poor arguments and combine that with no evidence then we'll not get
> very far, either in working together or in solving the actual bugs.
> Please don't continue to make wild speculations about things related to
> HS and recovery, so that issues do not become confused; there is no need
> to comment on every thread.
>

Simon,

People are very passionate about this feature. This feature has the
ability to show us as moving forward in a fashion that will allow us to
directly compete with the "big boys" in the "big installs", although we
are still probably 2-3 releases from that.

It also has the ability to make us look like a bunch of yahoos (no pun
intended) who are better served beating up on that database that Oracle
just bought, versus Oracle itself.

Patience is a virtue for all when it comes to the this feature.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Greg Stark <gsstark(at)mit(dot)edu>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 18:43:13
Message-ID:	AANLkTinycqapesEJkKnqpdmxvRgLy4pmZF7a2Y_zekZX@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 12, 2010 at 1:21 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Wed, 2010-05-12 at 18:05 +0100, Greg Stark wrote:
>
>> I'm not sure who to blame for the shouting match over whose commit
>> introduced the bug -- it doesn't seem like a relevant or useful thing
>> to argue about, please both stop.
>
> I haven't blamed Robert's code, merely asked him to consider that it is
> something other HS, since we have no evidence either way at present
> because the issue is sporadic and has not been replicated as yet, with
> no specific detail leading to any section of code.

I'm not really sure what we're arguing about here. I feel like I'm
being accused either of (a) introducing the bug (which is possible) or
(b) saying that Simon introduced the bug (which presumably is also
possible, although it's not really my point). I ventured an
uninformed guess at what the problem might be; Simon thinks my guess
is wrong, and it may well be: but either way there's a bug buried in
here somewhere and it would be nice to fix it. I thought that it
would be a good idea for Simon to look at it because, on the surface,
it APPEARS to have something to do with Hot Standby, since that's what
Stefan was testing when he found it. Sure, the investigation might
lead somewhere else; I completely admit that.

Now, Simon just said he HAS looked at it and can't reproduce the
problem. So now I'm even less sure what we're arguing about. I'm
glad he looked at it. It's interesting that he wasn't able to
reproduce the problem. I hope that he or someone else will find
something that helps us move forward. I am having difficulty
reproducing Stefan's test environment and perhaps for that reason I
can't reproduce it either, though I've encountered several other
problems about which, I suppose, I will post separate emails.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 19:10:28
Message-ID:	4BEAFD24.2070707@kaltenbrunner.cc
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 05/12/2010 05:28 PM, Simon Riggs wrote:
> On Wed, 2010-05-12 at 14:18 +0100, Simon Riggs wrote:
>> On Wed, 2010-05-12 at 08:52 -0400, Robert Haas wrote:
>>> On Wed, May 12, 2010 at 7:26 AM, Simon Riggs<simon(at)2ndquadrant(dot)com> wrote:
>>>> On Wed, 2010-05-12 at 07:10 -0400, Robert Haas wrote:
>>>>
>>>>> I'm not sure what to make of this. Sometimes not shutting down
>>>>> doesn't sound like a feature to me.
>>>>
>>>> It acts exactly the same in recovery as in normal running. It is not a
>>>> special feature of recovery at all, bug or otherwise.
>>>
>>> Simon, that doesn't make any sense. We are talking about a backend
>>> getting stuck forever on an exclusive lock that is held by the startup
>>> process and which will never be released (for example, because the
>>> master has shut down and no more WAL can be obtained for replay). The
>>> startup process does not hold locks in normal operation.
>>
>> When I test it, startup process holding a lock does not prevent shutdown
>> of a standby.
>>
>> I'd be happy to see your test case showing a bug exists and that the
>> behaviour differs from normal running.
>
> Let me put this differently: I accept that Stefan has reported a
> problem. Neither Tom nor myself can reproduce the problem. I've re-run
> Stefan's test case and restarted the server more than 400 times now
> without any issue.
>
> I re-read your post where you gave what you yourself called "uninformed
> speculation". There's no real polite way to say it, but yes your
> speculation does appear to be uninformed, since it is incorrect. Reasons
> would be not least that Stefan's tests don't actually send any locks to
> the standby anyway (!), but even if they did your speculation as to the
> cause is still all wrong, as explained.
>
> There is no evidence to link this behaviour with HS, as yet, and you
> should be considering the possibility the problem lies elsewhere,
> especially since it could be code you committed that is at fault.

Well I'm not sure why people seem to have that hard a time reproducing
that issue - it seems that I can provoke it really trivially(in this
case no loops, no pgbench, no tricks). A few minutes ago I logged into
my test standby (which is idle except for the odd connect to template1
caused by nagios - the master is idle as well and has been for days):

postgres(at)soldata005:~$ psql
psql (9.0beta1)
Type "help" for help.

postgres=# select 1;
?column?
----------
1
(1 row)

postgres=# \q
postgres(at)soldata005:~$ pg_ctl -D /var/lib/postgresql/9.0b1/main/ restart
waiting for server to shut down.... done
server stopped
server starting
postgres(at)soldata005:~$ pg_ctl -D /var/lib/postgresql/9.0b1/main/ restart
waiting for server to shut down.... done
server stopped
server starting
postgres(at)soldata005:~$ pg_ctl -D /var/lib/postgresql/9.0b1/main/ restart
waiting for server to shut
down............................................................... failed
pg_ctl: server does not shut down

the server log for that is as follows:

<2010-05-12 20:36:18.166 CEST,,,> LOG: received smart shutdown request
<2010-05-12 20:36:18.167 CEST,,,> FATAL: terminating walreceiver
process due to administrator command
<2010-05-12 20:36:18.174 CEST,,,> LOG: shutting down
<2010-05-12 20:36:18.251 CEST,,,> LOG: database system is shut down
<2010-05-12 20:36:19.706 CEST,,,> LOG: database system was interrupted
while in recovery at log time 2010-05-06 17:36:05 CEST
<2010-05-12 20:36:19.706 CEST,,,> HINT: If this has occurred more than
once some data might be corrupted and you might need to choose an
earlier recovery target.
<2010-05-12 20:36:19.706 CEST,,,> LOG: entering standby mode
<2010-05-12 20:36:19.721 CEST,,,> LOG: consistent recovery state
reached at 1/12000078
<2010-05-12 20:36:19.721 CEST,,,> LOG: invalid record length at 1/12000078
<2010-05-12 20:36:19.723 CEST,,,> LOG: database system is ready to
accept read only connections
<2010-05-12 20:36:19.737 CEST,,,> LOG: streaming replication
successfully connected to primary
<2010-05-12 20:36:19.918 CEST,,,> LOG: received smart shutdown request
<2010-05-12 20:36:19.919 CEST,,,> FATAL: terminating walreceiver
process due to administrator command
<2010-05-12 20:36:19.922 CEST,,,> LOG: shutting down
<2010-05-12 20:36:19.937 CEST,,,> LOG: database system is shut down
<2010-05-12 20:36:21.433 CEST,,,> LOG: database system was interrupted
while in recovery at log time 2010-05-06 17:36:05 CEST
<2010-05-12 20:36:21.433 CEST,,,> HINT: If this has occurred more than
once some data might be corrupted and you might need to choose an
earlier recovery target.
<2010-05-12 20:36:21.433 CEST,,,> LOG: entering standby mode
<2010-05-12 20:36:21.482 CEST,,,> LOG: received smart shutdown request
<2010-05-12 20:36:21.504 CEST,,,> LOG: consistent recovery state
reached at 1/12000078
<2010-05-12 20:36:21.504 CEST,,,> LOG: invalid record length at 1/12000078
<2010-05-12 20:36:21.505 CEST,,,> LOG: database system is ready to
accept read only connections
<2010-05-12 20:36:21.516 CEST,,,> LOG: streaming replication
successfully connected to primary

so it restarted two times successfully - however if one looks at the
third time one can see that it received the smart shutdown request
BEFORE it reached a consistent recovery state - yet it continued to
enable HS and reenabled SR as well.

The database is now sitting there doing nothing and it more or less
broken because you cannot connect to it in the current state:

~$ psql
psql: FATAL: the database system is shutting down

the startup process has the following backtrace:

(gdb) bt
#0 0x00007fbe24cb2c83 in select () from /lib/libc.so.6
#1 0x00000000006e811a in pg_usleep ()
#2 0x000000000048c333 in XLogPageRead ()
#3 0x000000000048c967 in ReadRecord ()
#4 0x0000000000493ab6 in StartupXLOG ()
#5 0x0000000000495a88 in StartupProcessMain ()
#6 0x00000000004ab25e in AuxiliaryProcessMain ()
#7 0x00000000005d4a7d in StartChildProcess ()
#8 0x00000000005d70c2 in PostmasterMain ()
#9 0x000000000057d898 in main ()

Stefan

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 19:25:14
Message-ID:	1273692314.308.1059.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-12 at 21:10 +0200, Stefan Kaltenbrunner wrote:

> > There is no evidence to link this behaviour with HS, as yet, and you
> > should be considering the possibility the problem lies elsewhere,
> > especially since it could be code you committed that is at fault.
>
> Well I'm not sure why people seem to have that hard a time reproducing
> that issue - it seems that I can provoke it really trivially(in this
> case no loops, no pgbench, no tricks). A few minutes ago I logged into
> my test standby (which is idle except for the odd connect to template1
> caused by nagios - the master is idle as well and has been for days):

Thanks, good report.

> so it restarted two times successfully - however if one looks at the
> third time one can see that it received the smart shutdown request
> BEFORE it reached a consistent recovery state - yet it continued to
> enable HS and reenabled SR as well.
>
> The database is now sitting there doing nothing and it more or less
> broken because you cannot connect to it in the current state:
>
> ~$ psql
> psql: FATAL: the database system is shutting down
>
> the startup process has the following backtrace:
>
> (gdb) bt
> #0 0x00007fbe24cb2c83 in select () from /lib/libc.so.6
> #1 0x00000000006e811a in pg_usleep ()
> #2 0x000000000048c333 in XLogPageRead ()
> #3 0x000000000048c967 in ReadRecord ()
> #4 0x0000000000493ab6 in StartupXLOG ()
> #5 0x0000000000495a88 in StartupProcessMain ()
> #6 0x00000000004ab25e in AuxiliaryProcessMain ()
> #7 0x00000000005d4a7d in StartChildProcess ()
> #8 0x00000000005d70c2 in PostmasterMain ()
> #9 0x000000000057d898 in main ()

Well, its waiting for new info from primary. Nothing to do with locking,
but that's not an indication that its an SR issue though either. ;-)

I'll put some waits into that part of the code and see if I can induce
the failure. Maybe its just a simple lack of a CHECK_FOR_INTERRUPTS().

--
Simon Riggs www.2ndQuadrant.com

From:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 19:36:07
Message-ID:	1273692759-sup-6803@alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Excerpts from Stefan Kaltenbrunner's message of mié may 12 15:10:28 -0400 2010:

> the startup process has the following backtrace:
>
> (gdb) bt
> #0 0x00007fbe24cb2c83 in select () from /lib/libc.so.6
> #1 0x00000000006e811a in pg_usleep ()
> #2 0x000000000048c333 in XLogPageRead ()
> #3 0x000000000048c967 in ReadRecord ()
> #4 0x0000000000493ab6 in StartupXLOG ()
> #5 0x0000000000495a88 in StartupProcessMain ()
> #6 0x00000000004ab25e in AuxiliaryProcessMain ()
> #7 0x00000000005d4a7d in StartChildProcess ()
> #8 0x00000000005d70c2 in PostmasterMain ()
> #9 0x000000000057d898 in main ()

I just noticed that we have some code assigning the return value of
time() to a pg_time_t variable. Is this supposed to work reliably?
(xlog.c lines 9267ff)
--

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 19:40:09
Message-ID:	1273693209.308.1062.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-12 at 15:36 -0400, Alvaro Herrera wrote:
> I just noticed that we have some code assigning the return value of
> time() to a pg_time_t variable. Is this supposed to work reliably?
> (xlog.c lines 9267ff)

Code's used that for a while now. Checkpoints and everywhere.

--
Simon Riggs www.2ndQuadrant.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 19:51:25
Message-ID:	AANLkTikfkVgcEZb4vt77gG78fL0AoU3_Y8O99FV391ef@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 12, 2010 at 3:36 PM, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> Excerpts from Stefan Kaltenbrunner's message of mié may 12 15:10:28 -0400 2010:
>
>> the startup process has the following backtrace:
>>
>> (gdb) bt
>> #0 0x00007fbe24cb2c83 in select () from /lib/libc.so.6
>> #1 0x00000000006e811a in pg_usleep ()
>> #2 0x000000000048c333 in XLogPageRead ()
>> #3 0x000000000048c967 in ReadRecord ()
>> #4 0x0000000000493ab6 in StartupXLOG ()
>> #5 0x0000000000495a88 in StartupProcessMain ()
>> #6 0x00000000004ab25e in AuxiliaryProcessMain ()
>> #7 0x00000000005d4a7d in StartChildProcess ()
>> #8 0x00000000005d70c2 in PostmasterMain ()
>> #9 0x000000000057d898 in main ()
>
> I just noticed that we have some code assigning the return value of
> time() to a pg_time_t variable. Is this supposed to work reliably?
> (xlog.c lines 9267ff)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 19:55:19
Message-ID:	AANLkTinufrCIGshcouLdUfkcrLy5sbvs2hDGgVks9TBp@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 12, 2010 at 3:51 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, May 12, 2010 at 3:36 PM, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>> Excerpts from Stefan Kaltenbrunner's message of mié may 12 15:10:28 -0400 2010:
>>
>>> the startup process has the following backtrace:
>>>
>>> (gdb) bt
>>> #0 0x00007fbe24cb2c83 in select () from /lib/libc.so.6
>>> #1 0x00000000006e811a in pg_usleep ()
>>> #2 0x000000000048c333 in XLogPageRead ()
>>> #3 0x000000000048c967 in ReadRecord ()
>>> #4 0x0000000000493ab6 in StartupXLOG ()
>>> #5 0x0000000000495a88 in StartupProcessMain ()
>>> #6 0x00000000004ab25e in AuxiliaryProcessMain ()
>>> #7 0x00000000005d4a7d in StartChildProcess ()
>>> #8 0x00000000005d70c2 in PostmasterMain ()
>>> #9 0x000000000057d898 in main ()
>>
>> I just noticed that we have some code assigning the return value of
>> time() to a pg_time_t variable. Is this supposed to work reliably?
>> (xlog.c lines 9267ff)
>
> I'

I have a love-hate relationship with GMail, sorry.

I am wondering if we are not correctly handling the case where we get
a shutdown request while we are still in the PM_STARTUP state. It
looks like we might go ahead and switch to PM_RECOVERY and then
PM_RECOVERY_CONSISTENT without noticing the shutdown. There is some
logic to handle the shutdown when the startup process exits, but if
the startup process never exits it looks like we might get stuck.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Greg Stark <gsstark(at)mit(dot)edu>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 21:34:38
Message-ID:	1273700078.308.1239.camel@ebony
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-12 at 14:43 -0400, Robert Haas wrote:

> I thought that it
> would be a good idea for Simon to look at it because, on the surface,
> it APPEARS to have something to do with Hot Standby, since that's what
> Stefan was testing when he found it.

He was also testing SR, yet you haven't breathed a word about that for
some strange reason. It didn't APPEAR like it was HS at all, not from
basic logic or from technical knowledge. So you'll have to forgive me if
I don't leap into action when you say something is an HS problem in the
future.

--
Simon Riggs www.2ndQuadrant.com

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 21:39:41
Message-ID:	4BEB201D.3000906@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon, Robert,

> He was also testing SR, yet you haven't breathed a word about that for
> some strange reason. It didn't APPEAR like it was HS at all, not from
> basic logic or from technical knowledge. So you'll have to forgive me if
> I don't leap into action when you say something is an HS problem in the
> future.

Can we please chill out on this some? Especially since we now have an
actual reproduceable bug?

Simon, it's natural for people to come to you because you are
knowledgeable and responsive. You should take it as a compliment.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 21:40:44
Message-ID:	1273700444.8624.2976.camel@jd-desktop.unknown.charter.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2010-05-12 at 22:34 +0100, Simon Riggs wrote:
> On Wed, 2010-05-12 at 14:43 -0400, Robert Haas wrote:
>
> > I thought that it
> > would be a good idea for Simon to look at it because, on the surface,
> > it APPEARS to have something to do with Hot Standby, since that's what
> > Stefan was testing when he found it.
>
> He was also testing SR, yet you haven't breathed a word about that for
> some strange reason. It didn't APPEAR like it was HS at all, not from
> basic logic or from technical knowledge. So you'll have to forgive me if
> I don't leap into action when you say something is an HS problem in the
> future.

Simon, with respect -- knock it off.

Robert gave a very reasonable response. He is just trying to help. Relax
man.

Joshua Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Greg Stark <gsstark(at)mit(dot)edu>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-12 22:47:53
Message-ID:	AANLkTiltf8sRCiSkrV6KpP7OD0ieLPZ3N-veWeIKW1Om@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 12, 2010 at 5:34 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Wed, 2010-05-12 at 14:43 -0400, Robert Haas wrote:
>
>> I thought that it
>> would be a good idea for Simon to look at it because, on the surface,
>> it APPEARS to have something to do with Hot Standby, since that's what
>> Stefan was testing when he found it.
>
> He was also testing SR, yet you haven't breathed a word about that for
> some strange reason. It didn't APPEAR like it was HS at all, not from
> basic logic or from technical knowledge. So you'll have to forgive me if
> I don't leap into action when you say something is an HS problem in the
> future.

Well, the original subject line of the report had mentioned SR only,
but I had a specific theory about what might be happening that was
related to the operation of HS. You've said that you think my guess
is incorrect, and that's very possible, but until we actually find and
fix the bug we're all just guessing. I wasn't intending to cast
aspersions on your code.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-13 02:46:44
Message-ID:	AANLkTimdw83ZaLdwNDXv1nCcTrOOSun5guXu0yDypGCz@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, May 13, 2010 at 4:55 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I am wondering if we are not correctly handling the case where we get
> a shutdown request while we are still in the PM_STARTUP state. It
> looks like we might go ahead and switch to PM_RECOVERY and then
> PM_RECOVERY_CONSISTENT without noticing the shutdown. There is some
> logic to handle the shutdown when the startup process exits, but if
> the startup process never exits it looks like we might get stuck.

Right. I reported this problem and submitted the patch before.
http://archives.postgresql.org/pgsql-hackers/2010-04/msg00592.php

Stefan,
Could you check whether the patch fixes the problem you encountered?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-13 03:07:29
Message-ID:	AANLkTik_ay5lTJTEhAUqmEAREsKT7mEGxzmeJll3ERXw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, May 12, 2010 at 10:46 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Thu, May 13, 2010 at 4:55 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I am wondering if we are not correctly handling the case where we get
>> a shutdown request while we are still in the PM_STARTUP state. It
>> looks like we might go ahead and switch to PM_RECOVERY and then
>> PM_RECOVERY_CONSISTENT without noticing the shutdown. There is some
>> logic to handle the shutdown when the startup process exits, but if
>> the startup process never exits it looks like we might get stuck.
>
> Right. I reported this problem and submitted the patch before.
> http://archives.postgresql.org/pgsql-hackers/2010-04/msg00592.php

Sorry we missed that.

> Stefan,
> Could you check whether the patch fixes the problem you encountered?

I think that would be a good thing to check (it'll confirm whether
this is the same bug), but I'm not convinced we should actually fix it
that way. Prior to 8.4, we handled a smart shutdown during recovery
at the conclusion of recovery, just prior to entering normal running.
I'm wondering if we shouldn't revert to that behavior in both 8.4 and
HEAD.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-13 17:12:27
Message-ID:	4BEC32FB.90303@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 5/12/10 8:07 PM, Robert Haas wrote:
> I think that would be a good thing to check (it'll confirm whether
> this is the same bug), but I'm not convinced we should actually fix it
> that way. Prior to 8.4, we handled a smart shutdown during recovery
> at the conclusion of recovery, just prior to entering normal running.
> I'm wondering if we shouldn't revert to that behavior in both 8.4 and
> HEAD.

This would be OK as long as we document it well. We patched the
shutdown the way we did specifically because Fujii thought it would be
an easy fix; if it's complicated, we should revert it and document the
issue for DBAs.

Oh, and to confirm: the same issue exists, and has always existed, with
Warm Standby.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-14 21:32:15
Message-ID:	AANLkTilnuYddIZAnD2bVCve3s_i8s6oQwGMuUSQua2RQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, May 13, 2010 at 1:12 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 5/12/10 8:07 PM, Robert Haas wrote:
>> I think that would be a good thing to check (it'll confirm whether
>> this is the same bug), but I'm not convinced we should actually fix it
>> that way. Prior to 8.4, we handled a smart shutdown during recovery
>> at the conclusion of recovery, just prior to entering normal running.
>> I'm wondering if we shouldn't revert to that behavior in both 8.4 and
>> HEAD.
>
> This would be OK as long as we document it well. We patched the
> shutdown the way we did specifically because Fujii thought it would be
> an easy fix; if it's complicated, we should revert it and document the
> issue for DBAs.

I don't understand this comment.

> Oh, and to confirm: the same issue exists, and has always existed, with
> Warm Standby.

That's what I was thinking, but I hadn't gotten around to testing it.
Thanks for the confirmation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-14 21:51:18
Message-ID:	4BEDC5D6.6030800@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

>> This would be OK as long as we document it well. We patched the
>> shutdown the way we did specifically because Fujii thought it would be
>> an easy fix; if it's complicated, we should revert it and document the
>> issue for DBAs.
>
> I don't understand this comment.

In other words, I'm saying that it's not critical that we troubleshoot
this for 9.0. Revering Fujii's patch, if it's not working, is an option.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject:	Re: max_standby_delay considered harmful
Date:	2010-05-14 22:20:32
Message-ID:	AANLkTilUKjn8KhDsoNbs8AmBCDDL78PncnfHt8N6EwGf@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, May 14, 2010 at 5:51 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>
>>> This would be OK as long as we document it well. We patched the
>>> shutdown the way we did specifically because Fujii thought it would be
>>> an easy fix; if it's complicated, we should revert it and document the
>>> issue for DBAs.
>>
>> I don't understand this comment.
>
> In other words, I'm saying that it's not critical that we troubleshoot
> this for 9.0. Revering Fujii's patch, if it's not working, is an option.

There is no patch which we could revert to fix this, by Fujii Masao or
anyone else. The patch he proposed has not been committed. I am
still studying the problem to try to figure out where to go with it.
We could decide to punt the whole thing for 9.1, but I'd like to
understand what the options are before we make that decision.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company