Re: Hot Standby Feedback should default to on in 9.3+

Lists: pgsql-hackers
From: "Kevin Grittner" <kgrittn(at)mail(dot)com>
To: "Claudio Freire" <klaussfreire(at)gmail(dot)com>
Cc: "Heikki Linnakangas" <hlinnakangas(at)vmware(dot)com>,"Andres Freund" <andres(at)2ndquadrant(dot)com>, "PostgreSQL-Dev" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Standby Feedback should default to on in 9.3+
Date: 2012-11-30 21:20:38
Message-ID: 20121130212038.69310@gmx.com

Claudio Freire wrote:

>> With what setting of max_standby_streaming_delay? I would rather
>> default that to -1 than default hot_standby_feedback on. That
>> way what you do on the standby only affects the standby.
>
> 1d

Was there actually a transaction hanging open for an entire day on
the standby? Was it a query which actually ran that long, or an
ill-behaved user or piece of software?

I have most certainly managed databases where holding up vacuuming
on the source would cripple performance to the point that users
would have demanded that any other process causing it must be
immediately canceled. And canceling it wouldn't be enough at that
point -- the bloat would still need to be fixed before they could
work efficiently.

-Kevin


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Standby Feedback should default to on in 9.3+
Date: 2012-11-30 21:40:27
Message-ID: CAGTBQpb=yTM2BiByzzrP+1+hRtHM7jPeSKSZY7WYAmiAYFWcRg@mail.gmail.com

On Fri, Nov 30, 2012 at 6:20 PM, Kevin Grittner <kgrittn(at)mail(dot)com> wrote:
> Claudio Freire wrote:
>
>>> With what setting of max_standby_streaming_delay? I would rather
>>> default that to -1 than default hot_standby_feedback on. That
>>> way what you do on the standby only affects the standby.
>>
>> 1d
>
> Was there actually a transaction hanging open for an entire day on
> the standby? Was it a query which actually ran that long, or an
> ill-behaved user or piece of software?

No, and if there was, I wouldn't care for it to be cancelled.

Queries were being cancelled way before that timeout was reached,
probably something to do with wal_keep_segments on the master side
being unable to keep up for that long.

> I have most certainly managed databases where holding up vacuuming
> on the source would cripple performance to the point that users
> would have demanded that any other process causing it must be
> immediately canceled. And canceling it wouldn't be enough at that
> point -- the bloat would still need to be fixed before they could
> work efficiently.

I wouldn't mind occasional cancels, but these were recurring. When a
query ran long enough, there was no way for it to finish, no matter
how many times you tried. The master never stops being busy, that's
probably a factor.


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Standby Feedback should default to on in 9.3+
Date: 2012-11-30 21:49:41
Message-ID: 50B929F5.5020008@vmware.com

On 30.11.2012 23:40, Claudio Freire wrote:
> On Fri, Nov 30, 2012 at 6:20 PM, Kevin Grittner <kgrittn(at)mail(dot)com> wrote:
>> Claudio Freire wrote:
>>
>>>> With what setting of max_standby_streaming_delay? I would rather
>>>> default that to -1 than default hot_standby_feedback on. That
>>>> way what you do on the standby only affects the standby.
>>>
>>> 1d
>>
>> Was there actually a transaction hanging open for an entire day on
>> the standby? Was it a query which actually ran that long, or an
>> ill-behaved user or piece of software?
>
> No, and if there was, I wouldn't care for it to be cancelled.
>
> Queries were being cancelled way before that timeout was reached,
> probably something to do with wal_keep_segments on the master side
> being unable to keep up for that long.

Running out of wal_keep_segments would produce a different error,
requiring a new base backup.

>> I have most certainly managed databases where holding up vacuuming
>> on the source would cripple performance to the point that users
>> would have demanded that any other process causing it must be
>> immediately canceled. And canceling it wouldn't be enough at that
>> point -- the bloat would still need to be fixed before they could
>> work efficiently.
>
> I wouldn't mind occasional cancels, but these were recurring. When a
> query ran long enough, there was no way for it to finish, no matter
> how many times you tried. The master never stops being busy, that's
> probably a factor.

Hmm, it sounds like max_standby_streaming_delay=1d didn't work as
intended for some reason. It should've given the query one day to run
before canceling it. Unless the standby was running one day behind the
master already, but that seems unlikely. Any chance you could reproduce
that?

- Heikki


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Standby Feedback should default to on in 9.3+
Date: 2012-11-30 21:53:43
Message-ID: CAGTBQpYCdC0uSNCohCn3xMUQ0j76V+LY-s2D3wdiGUivEXbYAA@mail.gmail.com

On Fri, Nov 30, 2012 at 6:49 PM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
>>> I have most certainly managed databases where holding up vacuuming
>>> on the source would cripple performance to the point that users
>>> would have demanded that any other process causing it must be
>>> immediately canceled. And canceling it wouldn't be enough at that
>>> point -- the bloat would still need to be fixed before they could
>>> work efficiently.
>>
>>
>> I wouldn't mind occasional cancels, but these were recurring. When a
>> query ran long enough, there was no way for it to finish, no matter
>> how many times you tried. The master never stops being busy, that's
>> probably a factor.
>
>
> Hmm, it sounds like max_standby_streaming_delay=1d didn't work as intended
> for some reason. It should've given the query one day to run before
> canceling it. Unless the standby was running one day behind the master
> already, but that seems unlikely. Any chance you could reproduce that?

I have a pre-production server with replication for these tests. I
could create a fake stream of writes on it, disable feedback, and see
what happens.
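
A rough sketch of what such a test might look like, assuming a scratch
table called feedback_test (a hypothetical name) and the 1d delay
mentioned earlier in this thread. The two settings and the
pg_stat_database_conflicts view are standard as of 9.1; the table, row
count, and sleep length are illustrative only and would need adjusting
to resemble the real workload.

```sql
-- Standby, postgresql.conf (reload after editing):
--   hot_standby_feedback = off
--   max_standby_streaming_delay = 1d

-- Master: scratch table (hypothetical name) to generate churn on.
CREATE TABLE feedback_test (id int PRIMARY KEY, n int NOT NULL);
INSERT INTO feedback_test SELECT g, 0 FROM generate_series(1, 100000) g;

-- Master: run these two statements repeatedly (e.g. from a shell loop)
-- to produce dead rows and cleanup records in the WAL stream.
UPDATE feedback_test SET n = n + 1;
VACUUM feedback_test;

-- Standby: hold a snapshot open while the master keeps writing.
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM feedback_test;
SELECT pg_sleep(600);   -- stand-in for a long-running query
COMMIT;

-- Standby: check which kind of recovery conflict, if any, was counted.
SELECT datname, confl_snapshot, confl_lock, confl_bufferpin
FROM pg_stat_database_conflicts
WHERE datname = current_database();
```

If the standby session is cancelled well before the delay expires, the
per-type counters in the last query should at least narrow down whether
snapshot, lock, or buffer pin conflicts are involved.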