Re: Replication to Postgres 10 on Windows is broken

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: "Augustine, Jobin" <jobin(dot)augustine(at)openscg(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Replication to Postgres 10 on Windows is broken
Date: 2017-08-06 15:52:20
Message-ID: 5067.1502034740@sss.pgh.pa.us
Lists: pgsql-bugs pgsql-hackers

I wrote:
> Gut instinct says that the reason this case fails when other tools
> can connect successfully is that libpqwalreceiver is the only tool
> that uses PQconnectStart/PQconnectPoll rather than a plain
> PQconnectdb, and that there is some behavioral difference between
> connectDBComplete's wait loop and libpqrcv_connect's wait loop that
> OpenSSL is sensitive to --- but only on Windows, and maybe only on
> particular OpenSSL versions.

On closer inspection, I take that back. This can't be directly
OpenSSL's fault, because those error messages come out before libpq
has invoked OpenSSL at all; in particular we see

2017-08-03 10:49:41 UTC [2108]: [1-1] user=,db=,app=,client= FATAL: could not connect to the primary server: could not send data to server: Socket is not connected (0x00002749/10057)
could not send SSL negotiation packet: Socket is not connected (0x00002749/10057)

and "could not send SSL negotiation packet" certainly must occur
before we've asked OpenSSL to do anything.

What seems likely to me at this point is that the changes in
PQconnectPoll() to support multiple hosts are somehow responsible.
It must still be connected to libpqwalreceiver's different wait loop,
but the details are unclear.

It would likely be useful to add some debug logging to PQconnectPoll
to find out what set of addresses it's seeing and whether this failure
occurs after having advanced over some of them.
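
As a rough illustration of the kind of logging meant here, the sketch
below resolves a host/port pair and prints every candidate address in
order. It is not libpq code: log_candidate_addrs is a hypothetical
stand-alone helper, and it assumes a POSIX getaddrinfo/getnameinfo
environment (a Windows build would need the Winsock headers and
WSAStartup instead). The real logging would have to go inside
PQconnectPoll itself to show which address was in use when the send
failed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Hypothetical helper, not part of libpq: resolve host/port and print
 * each candidate address to "out", in the order the resolver returns
 * them.  Returns the number of addresses, or -1 if resolution fails.
 */
static int
log_candidate_addrs(const char *host, const char *port, FILE *out)
{
    struct addrinfo hints;
    struct addrinfo *addrs = NULL;
    struct addrinfo *cur;
    int         rc;
    int         n = 0;

    assert(out != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* accept IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;    /* TCP only */

    rc = getaddrinfo(host, port, &hints, &addrs);
    if (rc != 0)
    {
        fprintf(out, "getaddrinfo(%s, %s) failed: %s\n",
                host, port, gai_strerror(rc));
        return -1;
    }

    for (cur = addrs; cur != NULL; cur = cur->ai_next)
    {
        char        text[128];  /* large enough for a numeric address */

        /* NI_NUMERICHOST: print the address itself, no reverse lookup */
        rc = getnameinfo(cur->ai_addr, cur->ai_addrlen,
                         text, sizeof(text), NULL, 0, NI_NUMERICHOST);
        fprintf(out, "candidate %d: family=%d addr=%s\n",
                n, cur->ai_family, rc == 0 ? text : "(unprintable)");
        n++;
    }

    freeaddrinfo(addrs);
    return n;
}
```

Calling log_candidate_addrs() with the host and port taken from the
standby's primary_conninfo, writing to stderr, would show whether the
name resolves to more than one address on the affected machine.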

regards, tom lane
