Re: pgstat: remove delayed destroy / pipe: socket error fix

Lists: pgsql-patches
From: "Magnus Hagander" <mha(at)sollentuna(dot)net>
To: "Peter Brant" <Peter(dot)Brant(at)wicourts(dot)gov>, <pgsql-patches(at)postgresql(dot)org>
Subject: Re: pgstat: remove delayed destroy / pipe: socket error fix
Date: 2006-04-06 17:26:02
Message-ID: 6BCB9D8A16AC4241919521715F4D8BCEA0F8DF@algol.sollentuna.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

> Hi all,
>
> Attached are two patches which in combination make
> pg_stat_activity work reliably for us on Windows.
>
> The mysterious socket error turned out to be WSAEWOULDBLOCK.
> Per
> http://msdn.microsoft.com/library/default.asp?url=/library/en-
> us/winsock/winsock/windows_sockets_error_codes_2.asp
> , it seems the thing to do is loop and try again. pipe.patch
> does that.

Good catch. I always knew there was something afoot in that code, but
never thought it was that simple ;-) I kinda got snowed in on the idea
that it had to do with inheritance :-(

Also, might we want a CHECK_FOR_INTERRUPTS in the loop? Since it's a
potential stuck-forever place.

Oh, and checking the code go pgwin32_recv, I think I see where this is
coming from: pgwin32_recv calls pgwin32_waitforsinglesocket(). If that
one succeeds *and a signal is delivered while it's waiting*, we'll get
out og pgwin32_waitforsinglesocket() without clearing the WSA code. The
rest of the pg code uses errno which is properly set to EINTR, but the
pipe code uses WSAGetLastError() directly.

The fix for that is probably to add a WSASetLastError(WSAEINTR) right
after the errno=EINTR in pgwin32_waitforsinglesocket().

Does this seem right? Can you try by adding this, and then backing out
your pgstats patch, and see if this alone is enough?

BTW, what's with the change to all the error msgs?

And finally, the error handling looks a bit off? We specifically *don't*
want it to log an error for the WSAECONNRESET state - it's a normal
state. Or am I reading the patch wrong?

> The one remaining problem is that there seems to be a race
> condition when installing the temporary stats file on
> Windows. As we were monitoring pg_stat_activity during the
> test run, occasionally we'd get a response with zero rows.
> This may not be much of a problem during normal conditions
> (the server was completely overloaded and we were banging
> away with "Up Arrow", "Enter" watching pg_stat_activity).
>
> What's the best way to do an atomic rename on Windows?
> Alternatively, would it make sense to sleep and try again (up
> to some limit) when trying to open the stats file on Windows?

rename() should be atomic. We do a special version on win32 (see
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/port/dirmod.c?rev=1
.42), but I don't see how that would change the atomicity of it.

Do you get anything at all in the logs when this happens? Are you sure
the reason is that it picks up an empty file, or could it be something
else?

//Magnus


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-patches(at)postgresql(dot)org>, "Magnus Hagander" <mha(at)sollentuna(dot)net>, "Peter Brant" <Peter(dot)Brant(at)wicourts(dot)gov>
Subject: Re: pgstat: remove delayed destroy / pipe: socket
Date: 2006-04-06 17:46:23
Message-ID: 44350D9F.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

>>> On Thu, Apr 6, 2006 at 12:26 pm, in message
<6BCB9D8A16AC4241919521715F4D8BCEA0F8DF(at)algol(dot)sollentuna(dot)se>, "Magnus
Hagander"
<mha(at)sollentuna(dot)net> wrote:
> And finally, the error handling looks a bit off? We specifically
*don't*
> want it to log an error for the WSAECONNRESET state - it's a normal
> state. Or am I reading the patch wrong?

It looks to me like error is never referenced except when ret ==
SOCKET_ERROR. When ret == SOCKET_ERROR and error == WSAEWOULDBLOCK it
loops back without generating an error message.

-Kevin


From: "Peter Brant" <Peter(dot)Brant(at)wicourts(dot)gov>
To: <pgsql-patches(at)postgresql(dot)org>, "Magnus Hagander" <mha(at)sollentuna(dot)net>
Subject: Re: pgstat: remove delayed destroy / pipe: socket
Date: 2006-04-06 17:58:46
Message-ID: 44351086020000BE000029CB@gwmta.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

>>> "Magnus Hagander" <mha(at)sollentuna(dot)net> 04/06/06 7:26 pm >>>

> BTW, what's with the change to all the error msgs?
Ah, I'd assumed the %ui was a typo in the format string. If the intent
was to print e.g. 10022i, I'll change it back.

> And finally, the error handling looks a bit off? We specifically
*don't*
> want it to log an error for the WSAECONNRESET state - it's a normal
> state. Or am I reading the patch wrong?
If error is WSAECONNRESET, the return value is reset to zero. That
will prevent an error message from being displayed (unless I'm reading
something wrong).

> Do you get anything at all in the logs when this happens? Are you
sure
> the reason is that it picks up an empty file, or could it be
something
> else?
There is nothing in the logs. It could definitely be something else.
A rename race condition was just our best theory.

Pete