Re: Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Ashesh Vashi <ashesh(dot)vashi(at)enterprisedb(dot)com>, Mark Llewellyn <mark_llewellyn(at)adp(dot)com>, pgsql-hackers(at)postgresql(dot)org, Sujeet Rajguru <sujeet(dot)rajguru(at)enterprisedb(dot)com>
Subject: Re: Re: [BUGS] BUG #5650: Postgres service showing as stopped when in fact it is running
Date: 2010-11-17 19:47:48
Message-ID: 201011171947.oAHJlmh10273@momjian.us
Lists: pgsql-bugs pgsql-hackers

Tom Lane wrote:
> Magnus Hagander <magnus(at)hagander(dot)net> writes:
> > On Wed, Nov 17, 2010 at 19:57, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >> Is FATAL, in general, enough to conclude the server is running?
>
> > No - specifically, we will send FATAL when "the database system is
> > starting up", which is exactly the one we want to *avoid*.
>
> > I think we should only exclude the password case. I guess we could
> > also do all fatal *except* <list>, but that seems more fragile.
>
> I believe that the above argument is exactly backwards. What we want
> here is to check the result of postmaster.c's canAcceptConnections(),
> and there are only a finite number of error codes that can result from
> rejections there. If we get past that, there are a large number of
> possible failures, but all of them indicate that the postmaster is in
> principle willing to accept connections. Checking for password errors
> only is utterly wrong: any other type of auth failure would be the same
> for this purpose, as would "no such database", "no such user", "too many
> connections", etc etc etc.

Agreed. So how do we pass that information to libpq without the fix
costing more than the problem is worth? Should we parse pg_controldata
output? pg_upgrade could use machine-readable output from that too.

> What we actually want here, and don't have, is the fabled pg_ping
> protocol...

Well, we are basically figuring out how to implement that with this fix,
whether it ends up as part of pg_ctl or as a separate binary.
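
As a rough illustration of the distinction drawn above, here is a minimal
sketch in C. It assumes the client can see the five-character SQLSTATE of a
failed connection attempt, which is not something this thread establishes.
The names ProbeResult and classify_connection_failure are hypothetical and
are not part of libpq. The sketch treats only SQLSTATE 57P03
(cannot_connect_now, the code reported with "the database system is starting
up") as a rejection by the postmaster; which codes belong in that set is an
assumption of the sketch. Any other SQLSTATE (bad password, no such
database, too many connections) is read as "the postmaster is in principle
willing to accept connections".

```c
#include <string.h>

/* Hypothetical result type for this sketch; not a libpq type. */
typedef enum
{
    PROBE_ACCEPTING,    /* failure came from a later stage: server is up */
    PROBE_REJECTING,    /* postmaster is alive but refusing connections */
    PROBE_UNKNOWN       /* no usable SQLSTATE, nothing can be concluded */
} ProbeResult;

/* SQLSTATE cannot_connect_now, e.g. "the database system is starting up". */
#define SQLSTATE_CANNOT_CONNECT_NOW "57P03"

/*
 * Classify a failed connection attempt by its SQLSTATE.
 * Assumption: only 57P03 marks a rejection by the postmaster itself.
 */
ProbeResult
classify_connection_failure(const char *sqlstate)
{
    /* A SQLSTATE is always exactly five characters. */
    if (sqlstate == NULL || strlen(sqlstate) != 5)
        return PROBE_UNKNOWN;

    if (strcmp(sqlstate, SQLSTATE_CANNOT_CONNECT_NOW) == 0)
        return PROBE_REJECTING;

    /* Auth failure, no such database, too many connections, etc. */
    return PROBE_ACCEPTING;
}
```

With this shape, a caller such as pg_ctl would only need to keep waiting
while the result is PROBE_REJECTING or PROBE_UNKNOWN, and could report the
server as running as soon as it sees PROBE_ACCEPTING.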

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
