Re: beta3 & the open items list

Lists: pgsql-hackers
From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <jd(at)commandprompt(dot)com>,<tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <robertmhaas(at)gmail(dot)com>,<gsstark(at)mit(dot)edu>, <fgp(at)phlo(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: beta3 & the open items list
Date: 2010-06-20 20:01:04
Message-ID: 4C1E2D300200002500032666@gw.wicourts.gov

"Joshua D. Drake" wrote:

> Can someone tell me what we are going to do about firewalls that
> impose their own rules outside of the control of the DBA?

Has anyone actually seen a firewall configured for something so
stupid as to allow *almost* all the various packets involved in using
a TCP connection, but which suppressed just keepalive packets? That
seems to be what you're suggesting is the risk; it's an outlandish
enough suggestion that I think the burden of proof is on you to show
that it happens often enough to make this a worthless change.

-Kevin


From: Kenneth Marshall <ktm(at)rice(dot)edu>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: jd(at)commandprompt(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, robertmhaas(at)gmail(dot)com, gsstark(at)mit(dot)edu, fgp(at)phlo(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: beta3 & the open items list
Date: 2010-06-20 20:44:20
Message-ID: 20100620204420.GA19746@aart.is.rice.edu

On Sun, Jun 20, 2010 at 03:01:04PM -0500, Kevin Grittner wrote:
> "Joshua D. Drake" wrote:
>
> > Can someone tell me what we are going to do about firewalls that
> > impose their own rules outside of the control of the DBA?
>
> Has anyone actually seen a firewall configured for something so
> stupid as to allow *almost* all the various packets involved in using
> a TCP connection, but which suppressed just keepalive packets? That
> seems to be what you're suggesting is the risk; it's an outlandish
> enough suggestion that I think the burden of proof is on you to show
> that it happens often enough to make this a worthless change.
>
> -Kevin
>

I have seen this sort of behavior but in every case it has been
the result of a myopic view of firewall/IP tables solutions to
perceived "attacks". While I do agree that having a heartbeat
within the replication process is worthwhile, it should definitely
be 9.1 material at best. For 9.0 such ill-behaved environments
will need much more interaction by the DBA with monitoring and
triage of problems as they arrive.

Regards,
Ken

P.S. My favorite example of odd behavior was preemptively dropping
TCP packets in one direction only at a single port. Many, many
odd things happen when the kernel does not know that the packet
would never make it to its destination. Services would sometimes
run for weeks without a problem, depending on when the port ended
up being used, invariably at night or on the weekend.


From: Florian Pflug <fgp(at)phlo(dot)org>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: <jd(at)commandprompt(dot)com>, <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <robertmhaas(at)gmail(dot)com>, <gsstark(at)mit(dot)edu>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: beta3 & the open items list
Date: 2010-06-20 21:41:48
Message-ID: 2DDFD2FF-60FF-4520-829D-AF1D66D1DE80@phlo.org

On Jun 20, 2010, at 22:01 , Kevin Grittner wrote:
> "Joshua D. Drake" wrote:
>
>> Can someone tell me what we are going to do about firewalls that
>> impose their own rules outside of the control of the DBA?
>
> Has anyone actually seen a firewall configured for something so
> stupid as to allow *almost* all the various packets involved in using
> a TCP connection, but which suppressed just keepalive packets? That
> seems to be what you're suggesting is the risk; it's an outlandish
> enough suggestion that I think the burden of proof is on you to show
> that it happens often enough to make this a worthless change.

Yeah, especially since there is no such thing as a special "keepalive" packet in TCP. Keepalive simply sends packets with zero bytes of payload every once in a while if the connection is otherwise inactive. If those aren't acknowledged (like every other packet would be) by the peer, the connection is assumed to be broken. On a reasonably active connection, keepalive neither causes additional transmissions, nor altered transmissions.

Keepalive is therefore extremely unlikely to break things - in the very worst case, a (really, really stupid) firewall might decide to drop packets with zero bytes of payload, causing inactive connections to abort after a while. AFAIK walreceiver will simply reconnect in this case.

Plus, the postmaster enables keepalive on all incoming connections *already*, so any problems ought to have caused bugreports about dropped client connections.

best regards,
Florian Pflug


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, jd(at)commandprompt(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: beta3 & the open items list
Date: 2010-06-20 22:13:24
Message-ID: AANLkTikLFX7vX-BnKbhhbl3uwUjnngKZgK1nzVMg4CE0@mail.gmail.com

On Sun, Jun 20, 2010 at 10:41 PM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
> Yeah, especially since there is no such thing as a special "keepalive" packet in TCP. Keepalive simply sends packets with zero bytes of payload every once in a while if the connection is otherwise inactive. If those aren't acknowledged (like every other packet would be) by the peer, the connection is assumed to be broken. On a reasonably active connection, keepalive neither causes additional transmissions, nor altered transmissions.

Actually keep-alive packets contain one byte of data which is a
duplicate of the last previously acked byte.

>
> Keepalive is therefore extremely unlikely to break things - in the very worst case, a (really, really stupid) firewall might decide to drop packets with zero bytes of payload, causing inactive connections to abort after a while. AFAIK walreceiver will simply reconnect in this case.

Stateful firewalls' whole raison d'être is to block packets which
aren't consistent with the current TCP state -- such as packets with a
sequence number earlier than the last acked sequence number.
Keepalives do in fact violate the basic TCP spec so they wouldn't be
entirely crazy to block them. Of course a firewall that blocked them
would be pretty criminally stupid given how ubiquitous they are.

> Plus, the postmaster enables keepalive on all incoming connections
> *already*, so any problems ought to have caused bugreports about
> dropped client connections.

Really? Since when? I thought there was some discussion about this
about a year ago and I made it very clear this had to be an optional
feature which defaulted to off.

Keepalives introduce spurious disconnections in working TCP
connections that have transient outages which is basic TCP
functionality that's supposed to work. There are cases where that's
what you want but it isn't the kind of thing that should be on by
default, let alone on unconditionally.

--
greg


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, jd(at)commandprompt(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: beta3 & the open items list
Date: 2010-06-20 23:42:00
Message-ID: 316B4E65-6192-4722-BC6E-0373F090FEAD@phlo.org

On Jun 21, 2010, at 0:13 , Greg Stark wrote:
>> Keepalive is therefore extremely unlikely to break things - in the very worst case, a (really, really stupid) firewall might decide to drop packets with zero bytes of payload, causing inactive connections to abort after a while. AFAIK walreceiver will simply reconnect in this case.
>
> Stateful firewalls' whole raison d'être is to block packets which
> aren't consistent with the current TCP state -- such as packets with a
> sequence number earlier than the last acked sequence number.
> Keepalives do in fact violate the basic TCP spec so they wouldn't be
> entirely crazy to block them.

Keepalives play games with the spec, but they don't outright violate it I'd say. The sender bluffs by retransmitting data it *knows* has been ACK'ed. But since nobody else can prove with certainty that the sender actually saw that ACK (think NIC-internal buffer overflow), nobody is able to call that bluff.

> Of course a firewall that blocked them
> would be pretty criminally stupid given how ubiquitous they are.

Very true, and another reason to stop worrying about possibly brain-dead firewalls.

>> Plus, the postmaster enables keepalive on all incoming connections
>> *already*, so any problems ought to have caused bugreports about
>> dropped client connections.
>
> Really? Since when? I thought there was some discussion about this
> about a year ago and I made it very clear this had to be an optional
> feature which defaulted to off.

Since 'bout 10 years. The setsockopt call is in StreamConnection() in src/backend/libpq/pqcomm.c.

Here's the corresponding commit:

commit 5aa160abba32a1f2d7818b9f49213f38c99b3fd8
Author: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Date: Sat May 20 13:10:54 2000 +0000

Add KEEPALIVE option to the socket of backend. This will automatically
terminate the backend that has no frontend anymore.
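
As an illustration only (this is not the code from pqcomm.c, and the helper names `accept_with_keepalive` and `demo` are made up for the example), a minimal Python sketch of what "enable keepalive on every incoming connection" amounts to at the socket level:

```python
import socket


def accept_with_keepalive(listener: socket.socket) -> socket.socket:
    """Accept one connection and turn on TCP keepalive for it."""
    conn, _addr = listener.accept()
    # SO_KEEPALIVE asks the kernel to probe an otherwise idle connection
    # and to report an error on the socket if the peer stops answering.
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return conn


def demo() -> bool:
    """Loopback demonstration; returns True if keepalive ended up enabled."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    conn = accept_with_keepalive(listener)
    try:
        return conn.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
    finally:
        for s in (conn, client, listener):
            s.close()
```

With only SO_KEEPALIVE set, the probe timing is whatever the operating system defaults to, which is typically on the order of hours.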

> Keepalives introduce spurious disconnections in working TCP
> connections that have transient outages which is basic TCP
> functionality that's supposed to work. There are cases where that's
> what you want but it isn't the kind of thing that should be on by
> default, let alone on unconditionally.

I'd buy that if all timeouts and retry counts would default to +infinity. But they don't, and hence sufficiently long network outages *will* cause connection aborts anyway. That a particular connection might survive due to inactivity proves nothing, since whether the connection is active or inactive during an outage is usually outside of anyone's control.

I really fail to see why anyone would prefer connections (and therefore transactions!) getting stuck forever over a few spurious disconnects. The former always require manual intervention and cause all sorts of performance and disk-space issues, while the latter won't even be an issue for well-written clients who just reconnect and retry.
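
To make "reconnect and retry" concrete, here is a rough sketch of such a client loop. The function name, the exponential backoff, and the default delays are assumptions chosen for the example, not anything libpq or walreceiver is known to do:

```python
import time


def connect_with_retry(connect, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds, backing off between attempts.

    connect is any zero-argument callable that returns a connection or
    raises OSError (ConnectionResetError, TimeoutError, ... are subclasses).
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except OSError:
            if attempt == max_attempts:
                raise  # give up and let the caller see the last error
            sleep(delay)
            delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

Passing `sleep` as a parameter is only there so the loop can be exercised without real waiting.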

best regards,
Florian Pflug


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, jd(at)commandprompt(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: beta3 & the open items list
Date: 2010-06-21 01:31:34
Message-ID: AANLkTimBPLK6S4GKQqYhUHp6KXWFc1rQ44DlkDy9s55y@mail.gmail.com

On Mon, Jun 21, 2010 at 12:42 AM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
> I'd buy that if all timeouts and retry counts would default to +infinity. But they don't, and hence sufficiently long network outages *will* cause connection aborts anyway. That a particular connection might survive due to inactivity proves nothing, since whether the connection is active or inactive during an outage is usually outside of anyone's control.
>
> I really fail to see why anyone would prefer connections (and therefore transactions!) getting stuck forever over a few spurious disconnects. The former always require manual intervention and cause all sorts of performance and disk-space issues, while the latter won't even be an issue for well-written clients who just reconnect and retry.
>

So just as a data point I'm routinely annoyed by reopening my screen
session and finding various sessions have died since the day
before. Usually this is caused by broken firewalls but there are also
a bunch of SSH options which some servers have enabled which cause my
sessions to never survive very long if there are any network outages.
Servers where those options are disabled work fine.

I admit this is a very different use case though and since we have
control over the behaviour when the connection breaks perhaps the
analogy falls apart completely. I'm not sure we can guarantee that
reconnecting is always so simple though. What if the user set up an
SSH gateway or needs some extra authentication to make the connection?
Are users expecting the slave to randomly disconnect and reconnect
willy nilly or are they expecting that once it connects it'll keep
using that connection forever?

--
greg


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Florian Pflug <fgp(at)phlo(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, jd(at)commandprompt(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: beta3 & the open items list
Date: 2010-06-21 03:54:21
Message-ID: AANLkTikhcwlko4QsPonDRiYbgxdscH-9dR2IBqqmgdAT@mail.gmail.com

On Sun, Jun 20, 2010 at 9:31 PM, Greg Stark <gsstark(at)mit(dot)edu> wrote:
> On Mon, Jun 21, 2010 at 12:42 AM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>> I'd buy that if all timeouts and retry counts would default to +infinity. But they don't, and hence sufficiently long network outages *will* cause connection aborts anyway. That a particular connection might survive due to inactivity proves nothing, since whether the connection is active or inactive during an outage is usually outside of anyone's control.
>>
>> I really fail to see why anyone would prefer connections (and therefore transactions!) getting stuck forever over a few spurious disconnects. The former always require manual intervention and cause all sorts of performance and disk-space issues, while the latter won't even be an issue for well-written clients who just reconnect and retry.
>>
>
> So just as a data point I'm routinely annoyed by reopening my screen
> session and finding various sessions have died since the day
> before. Usually this is caused by broken firewalls but there are also
> a bunch of SSH options which some servers have enabled which cause my
> sessions to never survive very long if there are any network outages.
> Servers where those options are disabled work fine.
>
> I admit this is a very different use case though and since we have
> control over the behaviour when the connection breaks perhaps the
> analogy falls apart completely. I'm not sure we can guarantee that
> reconnecting is always so simple though. What if the user set up an
> SSH gateway or needs some extra authentication to make the connection?
> Are users expecting the slave to randomly disconnect and reconnect
> willy nilly or are they expecting that once it connects it'll keep
> using that connection forever?

I feel like we're getting off in the weeds, here. Obviously, the user
would ideally like the connection to the master to last forever, but
equally obviously, if the master unexpectedly reboots, they'd like the
slave to notice - ideally within some reasonable time period - that it
needs to reconnect. There's no perfect way to distinguish "the master
croaked" from "the network administrator unplugged the Ethernet cable
and is planning to plug it back in any hour now", so we'll just need
to pick some reasonable timeout and go with it. To my way of
thinking, if the master hasn't responded in a minute or two, that's a
sign that it's time to declare the connection dead. Retrying the
connection *should* be cheap. If the user has set things up so that a
TCP connection from slave to master is not straightforward, the user
has configured it incorrectly, and no matter what we do it's not going
to be reliable.
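
For a sense of what "a minute or two" means in keepalive terms, a small sketch follows. It assumes the usual three knobs (idle time before the first probe, interval between probes, number of unanswered probes) and approximates detection time as idle + interval * count. The helper names and the 60/10/6 values are illustrative, and the TCP_KEEP* options are platform-specific, so the code only sets the ones this Python build exposes:

```python
import socket


def detection_time(idle: int, interval: int, count: int) -> int:
    """Approximate seconds from last traffic until a dead peer is noticed."""
    return idle + interval * count


def apply_keepalive(sock: socket.socket, idle: int = 60,
                    interval: int = 10, count: int = 6) -> list:
    """Enable keepalive and set whichever timing options exist here."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = []
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)  # not defined on every platform
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            applied.append(name)
    return applied
```

With 60/10/6 the estimate is 60 + 10 * 6 = 120 seconds. With the common Linux defaults of 7200/75/9 it is 7200 + 75 * 9 = 7875 seconds, a little over two hours, which is why the defaults alone do not give a one-or-two-minute timeout.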

I still think there's a decent argument that we might want to have a
protocol-level heartbeat rather than a TCP-level heartbeat. But doing
the latter is, I think, good enough for 9.0. We're pretty much
speculating about what the problems with that approach might be, so
getting too worked up about fixing them at this point seems premature.
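
For comparison, a protocol-level heartbeat could look roughly like the sketch below: the receiver records when it last heard anything from the peer and declares the connection dead after a timeout. The class name, the 120-second default, and the injected clock are all assumptions for the example; nothing like this exists in the replication protocol as discussed here.

```python
import time


class HeartbeatMonitor:
    """Track the last message from a peer and flag it dead after a timeout."""

    def __init__(self, timeout: float = 120.0, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock          # injectable so it can be tested
        self._last_seen = clock()

    def message_received(self) -> None:
        """Call for every message from the peer, heartbeat or data."""
        self._last_seen = self._clock()

    def peer_is_dead(self) -> bool:
        """True once nothing has arrived for longer than the timeout."""
        return self._clock() - self._last_seen > self._timeout
```

Unlike TCP keepalive, a check like this runs in the application, so it also notices a peer whose kernel still answers probes while the process itself is stuck.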

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Florian Pflug <fgp(at)phlo(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, jd(at)commandprompt(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: beta3 & the open items list
Date: 2010-06-21 08:37:18
Message-ID: AANLkTinVLt-MJRqZb3cmz_-38FlDAw0c0mdth3NjjYDK@mail.gmail.com

On Mon, Jun 21, 2010 at 4:54 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I feel like we're getting off in the weeds, here.  Obviously, the user
> would ideally like the connection to the master to last forever, but
> equally obviously, if the master unexpectedly reboots, they'd like the
> slave to notice - ideally within some reasonable time period - that it
> needs to reconnect.

>  There's no perfect way to distinguish "the master
> croaked" from "the network administrator unplugged the Ethernet cable
> and is planning to plug it back in any hour now", so we'll just need
> to pick some reasonable timeout and go with it.

--
greg


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Florian Pflug <fgp(at)phlo(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, jd(at)commandprompt(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: beta3 & the open items list
Date: 2010-06-21 11:11:56
Message-ID: AANLkTimRfJfFLQD6ktPmwVjpWmOy5YbEnXlXZe8MpRed@mail.gmail.com

On Mon, Jun 21, 2010 at 4:37 AM, Greg Stark <gsstark(at)mit(dot)edu> wrote:
> On Mon, Jun 21, 2010 at 4:54 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I feel like we're getting off in the weeds, here.  Obviously, the user
>> would ideally like the connection to the master to last forever, but
>> equally obviously, if the master unexpectedly reboots, they'd like the
>> slave to notice - ideally within some reasonable time period - that it
>> needs to reconnect.
>
>
>
>>  There's no perfect way to distinguish "the master
>> croaked" from "the network administrator unplugged the Ethernet cable
>> and is planning to plug it back in any hour now", so we'll just need
>> to pick some reasonable timeout and go with it.

Eh... was there supposed to be some text here?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company