Re: Feature freeze date for 8.1

From: Hannu Krosing <hannu(at)skype(dot)net>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Neil Conway <neilc(at)samurai(dot)com>, Oliver Jowett <oliver(at)opencloud(dot)com>, adnandursun(at)asrinbilisim(dot)com(dot)tr, Peter Eisentraut <peter_e(at)gmx(dot)net>, Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feature freeze date for 8.1
Date: 2005-05-02 21:00:36
Message-ID: 1115067636.4932.29.camel@fuji.krosing.net
Lists: pgsql-hackers pgsql-patches

On Mon, 2005-05-02 at 18:47 +0300, Heikki Linnakangas wrote:
> On Mon, 2 May 2005, Hannu Krosing wrote:

> > It would be nice if I could set up some timeout using keepalives (like
> > ssh's "ProtocolKeepAlives") and use similar timeouts on client and server.
>
> FWIW, I've been bitten by this problem twice with other applications.
>
> 1. We had a DB2 database with clients running in other computers in the
> network. A faulty switch caused random network outages. If the connection
> timed out and the client was unable to send its request to the server,
> the client would notice that the connection was down, and open a new one.
> But the server never noticed that the connection was dead. Eventually,
> the maximum number of connections was reached, and the administrator had
> to kill all the connections manually.
>
> 2. We had a custom client-server application using TCP across a network.
> There was a stateful firewall between the server and the clients that
> dropped the connection at night when there was no activity. After a
> couple of days, the server reached the maximum number of threads on the
> platform and stopped accepting new connections.
>
> In case 1, the switch was fixed. If another switch fails, the same will
> happen again. In case 2, we added an application-level heartbeat that
> sends a dummy message from server to client every 10 minutes.
>
> TCP keep-alive with a small interval would have saved the day in both
> cases. Unfortunately the default interval must be >= 2 hours, according
> to RFC1122.
>
> On most platforms, including Windows and Linux, the TCP keep-alive
> interval can't be set on a per-connection basis. The ideal solution would
> be to modify the operating system to support it.

Yep. I think this could be done for (our instance of) Linux, but getting
it into the mainstream kernel, and then into all popular distros, is a lot
of effort.

Going the ssh way (protocol level keepalives) might be way simpler.
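
For comparison with the protocol-level route, here is a rough sketch of
the TCP-level knob on a system that does expose keep-alive timers per
socket. The option names TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are
the spellings Python's socket module uses where the platform provides
them; whether a given platform provides them is an assumption, so the
helper probes for each one. The function name enable_keepalive and the
timer values are made up for illustration.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keep-alive and, where supported, shorten its timers.

    Returns the names of the options that were actually applied.
    """
    applied = []
    # SO_KEEPALIVE only switches probing on; the timers stay at the
    # system-wide defaults unless the per-socket options below exist.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied.append("SO_KEEPALIVE")
    # Probe for each per-socket timer option instead of assuming it.
    for name, value in (("TCP_KEEPIDLE", idle),      # idle seconds before first probe
                        ("TCP_KEEPINTVL", interval),  # seconds between probes
                        ("TCP_KEEPCNT", count)):      # failed probes before giving up
        opt = getattr(socket, name, None)
        if opt is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
        except OSError:
            continue
        applied.append(name)
    return applied
```

If only "SO_KEEPALIVE" comes back, the connection is still subject to the
system-wide interval, which is the case the protocol-level approach is
meant to cover.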

> What we can do in PostgreSQL is to introduce an application-level
> heartbeat. A simple "Hello world" message sent from server to client that
> the client would ignore would do the trick.

Actually we would need a round-trip indicator (some there-and-back
message: A: do you copy 42 --> B: yes I copy 42), and not just a one-way
send. The difficult part is what to do when one side happens to send the
keepalive in the middle of an actual data transfer.

Move to packet-oriented connections (UDP) and make the different packet
types independent of each other?
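
A minimal sketch of the "do you copy 42 / yes I copy 42" round trip,
staying on TCP rather than moving to UDP. It assumes every message is
sent as a length-prefixed frame with a one-byte type, so a keepalive can
only ever appear between whole messages and never in the middle of one.
The frame types (DATA, PING, PONG), the header layout and all function
names are invented for illustration; this is not the PostgreSQL wire
protocol.

```python
import socket
import struct

# Hypothetical one-byte frame types, made up for this sketch.
DATA, PING, PONG = b"D", b"P", b"O"
HEADER = struct.Struct("!cI")  # type byte + 4-byte payload length


def send_frame(sock, ftype, payload=b""):
    """Send one whole frame: header followed by the payload."""
    sock.sendall(HEADER.pack(ftype, len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer goes away."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def recv_frame(sock):
    """Read one whole frame and return (type, payload)."""
    ftype, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    return ftype, recv_exact(sock, length)


def serve_one(sock):
    """Handle one incoming frame.

    A PING is answered with a PONG echoing the same token. The payload
    is returned for DATA frames, None for anything else.
    """
    ftype, payload = recv_frame(sock)
    if ftype == PING:
        send_frame(sock, PONG, payload)
        return None
    return payload if ftype == DATA else None


def probe(sock, token, timeout=5.0):
    """Send PING(token) and wait for the matching PONG.

    Returns True if the peer answered in time, False on timeout or any
    socket error. Frames other than the matching PONG are skipped here;
    a real implementation would have to queue them instead.
    """
    wanted = struct.pack("!I", token)
    sock.settimeout(timeout)
    try:
        send_frame(sock, PING, wanted)
        while True:
            ftype, payload = recv_frame(sock)
            if ftype == PONG and payload == wanted:
                return True
    except OSError:  # covers timeouts, resets and closed connections
        return False
```

Echoing the token is what makes this a round trip: a PONG carrying 42
proves the peer saw this particular PING, not an older one. Either side
can call probe() on a timer and drop the connection when it returns
False.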

--
Hannu Krosing <hannu(at)skype(dot)net>
