Re: SO_KEEPALIVE

Lists: pgsql-hackers
From: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: SO_KEEPALIVE
Date: 2005-05-16 14:45:06
Message-ID: Pine.LNX.4.44.0505161641500.7072-100000@zigo.dhs.org

How come we don't set SO_KEEPALIVE in libpq?

Is there any reason why we wouldn't want it on?

--
/Dennis Björklund


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SO_KEEPALIVE
Date: 2005-05-16 16:40:36
Message-ID: 18682.1116261636@sss.pgh.pa.us

Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org> writes:
> How come we don't set SO_KEEPALIVE in libpq?
> Is there any reason why we wouldn't want it on?

Is there any reason we *would* want it on? The server-side keepalive
should be sufficient to get whatever useful impact it might have.

regards, tom lane


From: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SO_KEEPALIVE
Date: 2005-05-16 17:22:47
Message-ID: Pine.LNX.4.44.0505161912520.7072-100000@zigo.dhs.org

On Mon, 16 May 2005, Tom Lane wrote:

> > How come we don't set SO_KEEPALIVE in libpq?
> > Is there any reason why we wouldn't want it on?
>
> Is there any reason we *would* want it on? The server-side keepalive
> should be sufficient to get whatever useful impact it might have.

Wouldn't the client also want to know that the server is not there
anymore? I talked to Gaetano Mendola (I think, but you never know on irc
:-) and he had some clients that had been hanging around for 3 days after
the server had been down and later up again (stuck in recv).

Server-side keepalive is enough for the server to clean up when clients
disappear, but this does nothing to help clients detect that the server is
gone. So I don't see what server-side keepalive has to do with it.
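
For illustration only, here is roughly what turning on keepalive from the client side looks like at the socket level. This is a minimal Python sketch, not libpq code: the helper name enable_keepalive and the idle/interval/count values are hypothetical, and the TCP_KEEP* tuning options are platform-specific, so each one is applied only if the platform exposes it. Without such tuning, many systems wait around two hours of idle time before sending the first probe.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive probes for an existing socket.

    idle, interval and count are illustrative values, not recommendations.
    Returns True if SO_KEEPALIVE reads back as enabled.
    """
    # The portable part: ask the kernel to probe an idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Optional tuning; these option names are not available everywhere,
    # so skip any that this platform's socket module does not define.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idle time before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)

    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("keepalive enabled:", enable_keepalive(s))
    s.close()
```

With this set, a client blocked in recv on a dead connection should eventually get an error from the kernel once the probes go unanswered, instead of waiting indefinitely.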

--
/Dennis Björklund


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: SO_KEEPALIVE
Date: 2005-05-16 22:08:08
Message-ID: 1116281289.4965.10.camel@fuji.krosing.net

On Mon, 2005-05-16 at 19:22 +0200, Dennis Bjorklund wrote:
> On Mon, 16 May 2005, Tom Lane wrote:
>
> > > How come we don't set SO_KEEPALIVE in libpq?
> > > Is there any reason why we wouldn't want it on?
> >
> > Is there any reason we *would* want it on? The server-side keepalive
> > should be sufficient to get whatever useful impact it might have.
>
> Wouldn't the client also want to know that the server is not there
> anymore? I talked to Gaetano Mendola (I think, but you never know on irc
> :-) and he had some clients that had been hanging around for 3 days after
> the server had been down and later up again (stuck in recv).

"stuck in recv" is a symptom of a reconnect bug when libpq first tries to
test for an SSL connection but the connection has already gone away.
(search for "[HACKERS] oldish libpq bug still in RC2" in lists)
Tom fixed it in no time once I showed him where to look and provided a
test case. It should be fixed in 8.0.

I don't know if the fix was backported to older libpq versions as well.

> Server-side keepalive is enough for the server to clean up when clients
> disappear, but this does nothing to help clients detect that the server is
> gone. So I don't see what server-side keepalive has to do with it.

--
Hannu Krosing <hannu(at)tm(dot)ee>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hannu Krosing <hannu(at)tm(dot)ee>
Cc: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: SO_KEEPALIVE
Date: 2005-05-17 01:51:00
Message-ID: 23013.1116294660@sss.pgh.pa.us

Hannu Krosing <hannu(at)tm(dot)ee> writes:
> On Mon, 2005-05-16 at 19:22 +0200, Dennis Bjorklund wrote:
>> Wouldn't the client also want to know that the server is not there
>> anymore? I talked to Gaetano Mendola (I think, but you never know on irc
>> :-) and he had some clients that had been hanging around for 3 days after
>> the server had been down and later up again (stuck in recv).

> "stuck in recv" is a symptom of a reconnect bug when libpq first tries to
> test for an SSL connection but the connection has already gone away.
> (search for "[HACKERS] oldish libpq bug still in RC2" in lists)
> Tom fixed it in no time once I showed him where to look and provided a
> test case. It should be fixed in 8.0.

> I don't know if the fix was backported to older libpq versions as well.

It was not ... but I'm not convinced that that bug explains Gaetano's
problem. If you'll recall, that bug caused libpq to get into a tight
loop chewing CPU. It should be pretty easy to tell the difference
between that and sitting idle because there is nothing happening.

On the other hand, it seems to me a client-side SO_KEEPALIVE would only
be interesting for completely passive clients (perhaps one that sits
waiting for NOTIFY messages?). A normal client will try to issue some
kind of database command once in a while, and as soon as that happens,
there is a reasonably short timeout before connection failure is reported.

regards, tom lane


From: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)tm(dot)ee>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SO_KEEPALIVE
Date: 2005-05-17 04:40:38
Message-ID: Pine.LNX.4.44.0505170638550.7072-100000@zigo.dhs.org

On Mon, 16 May 2005, Tom Lane wrote:

> On the other hand, it seems to me a client-side SO_KEEPALIVE would only
> be interesting for completely passive clients (perhaps one that sits
> waiting for NOTIFY messages?). A normal client will try to issue some
> kind of database command once in a while

At least some of the clients were psql.

--
/Dennis Björklund


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)tm(dot)ee>, Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: SO_KEEPALIVE
Date: 2005-05-18 13:38:25
Message-ID: 428B4551.7080806@opencloud.com

Tom Lane wrote:

> On the other hand, it seems to me a client-side SO_KEEPALIVE would only
> be interesting for completely passive clients (perhaps one that sits
> waiting for NOTIFY messages?). A normal client will try to issue some
> kind of database command once in a while, and as soon as that happens,
> there is a reasonably short timeout before connection failure is reported.

If you're unlucky, the server could go down while you're blocked waiting
for a query response.
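
To make the failure mode concrete: a client that calls a blocking recv after sending its query has no deadline of its own, so if the peer vanishes without closing the connection, the call can sit there until the kernel decides the connection is dead. One common application-level workaround, which is a different technique from SO_KEEPALIVE and is not what libpq does here, is to bound the wait with select. The sketch below is Python, and the helper name wait_for_reply and the timeout values are hypothetical.

```python
import select
import socket


def wait_for_reply(sock, timeout):
    """Wait up to `timeout` seconds for `sock` to become readable.

    Returns True if data (or EOF/an error) is ready to be read,
    False if the timeout expired with nothing arriving.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


if __name__ == "__main__":
    # A connected socket pair stands in for the client/server connection.
    client, server = socket.socketpair()

    # The "server" stays silent: the bounded wait gives up instead of hanging.
    print("reply ready:", wait_for_reply(client, 0.1))

    # Once the "server" answers, the wait returns promptly.
    server.sendall(b"result")
    print("reply ready:", wait_for_reply(client, 1.0))

    client.close()
    server.close()
```

A timeout like this only tells the client that nothing arrived in time; it cannot distinguish a slow query from a dead server, which is the gap keepalive probes are meant to fill.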

-O


From: Gaetano Mendola <mendola(at)bigfoot(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SO_KEEPALIVE
Date: 2005-05-18 13:49:00
Message-ID: d6fh4n$b2t$1@news.hub.org

Oliver Jowett wrote:

> If you're unlucky, the server could go down while you're blocked waiting
> for a query response.
>

That is exactly what happens to us, and you don't have to be very unlucky
for that to happen if the engine is handling ~100 queries at a time.

Regards
Gaetano Mendola

