pgstat: remove delayed destroy / pipe: socket error fix

Lists: pgsql-patches
From: "Peter Brant" <Peter(dot)Brant(at)wicourts(dot)gov>
To: <pgsql-patches(at)postgresql(dot)org>
Subject: pgstat: remove delayed destroy / pipe: socket error fix
Date: 2006-04-06 16:58:22
Message-ID: 4435025E020000BE000029C7@gwmta.wicourts.gov

Hi all,

Attached are two patches which in combination make pg_stat_activity
work reliably for us on Windows.

The mysterious socket error turned out to be WSAEWOULDBLOCK. Per
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winsock/winsock/windows_sockets_error_codes_2.asp
, it seems the thing to do is loop and try again. pipe.patch does
that.
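
As a rough illustration of the "loop and try again" idea (not the contents of pipe.patch, which is not reproduced in this thread), here is a minimal portable sketch in C. The names read_with_retry, read_fn, struct flaky and flaky_read are hypothetical. The sketch checks errno for EWOULDBLOCK/EAGAIN; on Windows the equivalent check would presumably be WSAGetLastError() returning WSAEWOULDBLOCK. The retry cap is an assumption added so the loop cannot spin forever.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical read callback: returns bytes read, or -1 with errno set. */
typedef long (*read_fn)(void *ctx, char *buf, size_t len);

/*
 * Call read_once again while it fails with a "would block" error.
 * max_retries bounds the number of extra attempts (an assumption of
 * this sketch; the original patch may loop differently).
 */
long
read_with_retry(read_fn read_once, void *ctx, char *buf, size_t len,
                int max_retries)
{
    int attempt;

    assert(read_once != NULL);
    for (attempt = 0; attempt <= max_retries; attempt++)
    {
        long n = read_once(ctx, buf, len);

        if (n >= 0)
            return n;           /* success */
        if (errno != EWOULDBLOCK && errno != EAGAIN)
            return -1;          /* a real error: give up immediately */
        /* transient condition: fall through and try again */
    }
    return -1;                  /* still would-block after all retries */
}

/* Stand-in reader for demonstration: fails a few times, then succeeds. */
struct flaky
{
    int failures_left;
    int calls;
};

long
flaky_read(void *ctx, char *buf, size_t len)
{
    struct flaky *f = (struct flaky *) ctx;
    size_t n = len < 3 ? len : 3;

    f->calls++;
    if (f->failures_left > 0)
    {
        f->failures_left--;
        errno = EWOULDBLOCK;
        return -1;
    }
    memcpy(buf, "abc", n);
    return (long) n;
}
```

A real caller would pass a wrapper around recv() as read_once. Whether to wait between attempts (for example with select()) rather than retrying immediately is a design choice this sketch leaves open.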

pgstat.patch removes the delayed destroy code for backends, databases,
and tables. Database and table entries are cleaned up immediately upon
receipt of the appropriate message.

Both patches were necessary to make pg_stat_activity work reliably.
With no changes, with a connection pool size of 31, under load, we'd
typically see < 5 rows in pg_stat_activity. With pgstat.patch applied,
the number of rows would typically be between 15 and 20. With
pipe.patch also applied, the number of rows in pg_stat_activity was
accurate.

The test server withstood an approximately four-hour stress test
which replays captured Web traffic, but at full blast. The machine was
completely swamped, but there were no socket errors over the test run
(compared to a frequency of once every couple of minutes before).

The one remaining problem is that there seems to be a race condition
when installing the temporary stats file on Windows. As we were
monitoring pg_stat_activity during the test run, occasionally we'd get a
response with zero rows. This may not be much of a problem during
normal conditions (the server was completely overloaded and we were
banging away with "Up Arrow", "Enter" watching pg_stat_activity).

What's the best way to do an atomic rename on Windows? Alternatively,
would it make sense to sleep and try again (up to some limit) when
trying to open the stats file on Windows?
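
To make the second alternative concrete, here is a minimal sketch in C of "sleep and try again, up to some limit" when opening a file. It is only an illustration under stated assumptions: open_with_retry and sleep_ms are hypothetical names, the attempt count and delay are arbitrary parameters, and it assumes the zero-row responses come from a failed open, which this thread does not confirm.

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/* Sleep for roughly ms milliseconds (hypothetical helper). */
static void
sleep_ms(int ms)
{
#ifdef _WIN32
    Sleep((DWORD) ms);
#else
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long) (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
#endif
}

/*
 * Try to open path up to max_attempts times, sleeping delay_ms between
 * failed attempts. Returns NULL if every attempt fails.
 */
FILE *
open_with_retry(const char *path, const char *mode,
                int max_attempts, int delay_ms)
{
    int attempt;

    assert(path != NULL && mode != NULL);
    for (attempt = 0; attempt < max_attempts; attempt++)
    {
        FILE *fp = fopen(path, mode);

        if (fp != NULL)
            return fp;
        if (attempt + 1 < max_attempts)
            sleep_ms(delay_ms);     /* give the writer time to finish */
    }
    return NULL;
}
```

A reader might call open_with_retry(statfile, "rb", 5, 10), with values that would need tuning. This only narrows the window; it does not make the rename itself atomic.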

Pete

Attachment Content-Type Size
pipe.patch application/octet-stream 3.9 KB
pgstat.patch application/octet-stream 11.9 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Peter Brant" <Peter(dot)Brant(at)wicourts(dot)gov>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: pgstat: remove delayed destroy / pipe: socket error fix
Date: 2006-04-06 19:44:26
Message-ID: 26914.1144352666@sss.pgh.pa.us

"Peter Brant" <Peter(dot)Brant(at)wicourts(dot)gov> writes:
> Attached are two patches which in combination make pg_stat_activity
> work reliably for us on Windows.
> ...
> pgstat.patch removes the delayed destroy code for backends, databases,
> and tables. Database and table entries are cleaned up immediately upon
> receipt of the appropriate message.

I'll go ahead and apply the delayed-destroy-removal part (just to HEAD
for the time being --- seems a bit risky to apply it to the stable
branches). The Windows-specific change sounds like it may need more
review.

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Peter Brant <Peter(dot)Brant(at)wicourts(dot)gov>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: pgstat: remove delayed destroy / pipe: socket error fix
Date: 2006-05-07 01:44:10
Message-ID: 200605070144.k471iAB16915@candle.pha.pa.us


Now that we know the cause of the Win32 failure (FRONTEND), we don't
need the Win32 part of this patch anymore, right? (The stats display
part was already applied.)

---------------------------------------------------------------------------

Peter Brant wrote:
> Hi all,
>
> Attached are two patches which in combination make pg_stat_activity
> work reliably for us on Windows.
>
> The mysterious socket error turned out to be WSAEWOULDBLOCK. Per
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winsock/winsock/windows_sockets_error_codes_2.asp
> , it seems the thing to do is loop and try again. pipe.patch does
> that.
>
> pgstat.patch removes the delayed destroy code for backends, databases,
> and tables. Database and table entries are cleaned up immediately upon
> receipt of the appropriate message.
>
> Both patches were necessary to make pg_stat_activity work reliably.
> With no changes, with a connection pool size of 31, under load, we'd
> typically see < 5 rows in pg_stat_activity. With pgstat.patch applied,
> the number of rows would typically be between 15 and 20. With
> pipe.patch also applied, the number of rows in pg_stat_activity was
> accurate.
>
> The test server withstood an approximately four-hour stress test
> which replays captured Web traffic, but at full blast. The machine was
> completely swamped, but there were no socket errors over the test run
> (compared to a frequency of once every couple of minutes before).
>
> The one remaining problem is that there seems to be a race condition
> when installing the temporary stats file on Windows. As we were
> monitoring pg_stat_activity during the test run, occasionally we'd get a
> response with zero rows. This may not be much of a problem during
> normal conditions (the server was completely overloaded and we were
> banging away with "Up Arrow", "Enter" watching pg_stat_activity).
>
> What's the best way to do an atomic rename on Windows? Alternatively,
> would it make sense to sleep and try again (up to some limit) when
> trying to open the stats file on Windows?
>
> Pete
>


--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Peter Brant" <Peter(dot)Brant(at)wicourts(dot)gov>
To: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: <pgsql-patches(at)postgresql(dot)org>
Subject: Re: pgstat: remove delayed destroy / pipe: socket
Date: 2006-05-09 08:57:58
Message-ID: 44601346020000BE0000374E@gwmta.wicourts.gov

Yep, the pipe.c patch is unnecessary now.

Pete

>>> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> 05/07/06 3:44 am >>>
Now that we know the cause of the Win32 failure (FRONTEND), we don't
need the Win32 part of this patch anymore right?