Re: pg_ctl non-idempotent behavior change

Lists: pgsql-hackers
From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: pg_ctl non-idempotent behavior change
Date: 2013-04-26 22:09:44
Message-ID: CAMkU=1zKGzGoDoO=u4MON8h6Q=biRL59PTZvRmR9J7uX0yKoyA@mail.gmail.com

After 87306184580c9c49717, if the postmaster dies without cleaning up (e.g.
power outage), running "pg_ctl start" just gives this message and then
exits:

pg_ctl: another server might be running

Under the old behavior, it would try to start the server anyway, and
succeed, then go through recovery and give you back a functional system.

From reading the archive, I can't really tell if this change in behavior
was intentional.

Anyway it seems like a bad thing to me. Now the user has a system that
will not start up, and is given no clue that they need to remove
"postmaster.pid" and try again.

The behavior here under the new "-I" flag seems no better in this
situation. It claims the server is running, when it only "might" be
running (and in fact is not running).

Cheers,

Jeff


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_ctl non-idempotent behavior change
Date: 2013-04-27 18:24:17
Message-ID: 437.1367087057@sss.pgh.pa.us

Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> After 87306184580c9c49717, if the postmaster dies without cleaning up (e.g.
> power outage), running "pg_ctl start" just gives this message and then
> exits:

> pg_ctl: another server might be running

> Under the old behavior, it would try to start the server anyway, and
> succeed, then go through recovery and give you back a functional system.

> From reading the archive, I can't really tell if this change in behavior
> was intentional.

Hmm. I rather thought we had agreed not to change the default behavior,
but the commit message fairly clearly says that the default behavior is
being changed. This case shows that that change was inadequately
thought through.

> Anyway it seems like a bad thing to me. Now the user has a system that
> will not start up, and is given no clue that they need to remove
> "postmaster.pid" and try again.

Yeah, this is not tolerable. We could think about improving the logic
to have a stronger check on whether the old server is really there or
not (ie it should be doing something more like pg_ping and less like
just checking if the pidfile is there). But given how close we are to
beta, maybe the best thing is to revert that change for now and put it
back on the to-think-about-for-9.4 list. Peter?

regards, tom lane
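
A minimal sketch of the kind of stronger check described above, written in Python rather than pg_ctl's C and not taken from any actual patch. It assumes a POSIX system and relies on the first line of postmaster.pid holding the postmaster's PID. The function names are hypothetical. A live PID still only means the server "might" be running, since the PID could have been reused by an unrelated process; a connection attempt in the style of pg_ping would be a stronger test.

```python
import os
from pathlib import Path


def read_pidfile_pid(data_dir):
    """Return the PID on the first line of postmaster.pid, or None.

    None covers a missing, empty, or unparsable pidfile.
    """
    try:
        text = (Path(data_dir) / "postmaster.pid").read_text()
        return int(text.splitlines()[0].strip())
    except (OSError, IndexError, ValueError):
        return None


def process_exists(pid):
    """Probe a PID with signal 0 (POSIX only); no signal is delivered."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the pidfile is stale
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def server_might_be_running(data_dir):
    """True only if the pidfile names a process that currently exists."""
    return process_exists(read_pidfile_pid(data_dir))
```

With a check like this, a pidfile left behind by a power outage would not by itself block startup, because the recorded PID would normally no longer exist after a reboot.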


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_ctl non-idempotent behavior change
Date: 2013-04-30 02:01:30
Message-ID: 1367287290.32604.5.camel@vanquo.pezone.net

On Sat, 2013-04-27 at 14:24 -0400, Tom Lane wrote:
> Yeah, this is not tolerable. We could think about improving the logic
> to have a stronger check on whether the old server is really there or
> not (ie it should be doing something more like pg_ping and less like
> just checking if the pidfile is there). But given how close we are to
> beta, maybe the best thing is to revert that change for now and put it
> back on the to-think-about-for-9.4 list. Peter?

Reverted.


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_ctl non-idempotent behavior change
Date: 2014-08-04 21:07:47
Message-ID: 20140804210747.GM5475@eldon.alvh.no-ip.org

Tom Lane wrote:
> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> > After 87306184580c9c49717, if the postmaster dies without cleaning up (e.g.
> > power outage), running "pg_ctl start" just gives this message and then
> > exits:
>
> > pg_ctl: another server might be running
>
> > Under the old behavior, it would try to start the server anyway, and
> > succeed, then go through recovery and give you back a functional system.
>
> > From reading the archive, I can't really tell if this change in behavior
> > was intentional.
>
> Hmm. I rather thought we had agreed not to change the default behavior,
> but the commit message fairly clearly says that the default behavior is
> being changed. This case shows that that change was inadequately
> thought through.
>
> > Anyway it seems like a bad thing to me. Now the user has a system that
> > will not start up, and is given no clue that they need to remove
> > "postmaster.pid" and try again.
>
> Yeah, this is not tolerable. We could think about improving the logic
> to have a stronger check on whether the old server is really there or
> not (ie it should be doing something more like pg_ping and less like
> just checking if the pidfile is there). But given how close we are to
> beta, maybe the best thing is to revert that change for now and put it
> back on the to-think-about-for-9.4 list. Peter?

Are we going to unrevert this patch for 9.5?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_ctl non-idempotent behavior change
Date: 2014-10-11 22:54:32
Message-ID: 20141011225432.GO21267@momjian.us

On Mon, Aug 4, 2014 at 05:07:47PM -0400, Alvaro Herrera wrote:
> Tom Lane wrote:
> > Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> > > After 87306184580c9c49717, if the postmaster dies without cleaning up (e.g.
> > > power outage), running "pg_ctl start" just gives this message and then
> > > exits:
> >
> > > pg_ctl: another server might be running
> >
> > > Under the old behavior, it would try to start the server anyway, and
> > > succeed, then go through recovery and give you back a functional system.
> >
> > > From reading the archive, I can't really tell if this change in behavior
> > > was intentional.
> >
> > Hmm. I rather thought we had agreed not to change the default behavior,
> > but the commit message fairly clearly says that the default behavior is
> > being changed. This case shows that that change was inadequately
> > thought through.
> >
> > > Anyway it seems like a bad thing to me. Now the user has a system that
> > > will not start up, and is given no clue that they need to remove
> > > "postmaster.pid" and try again.
> >
> > Yeah, this is not tolerable. We could think about improving the logic
> > to have a stronger check on whether the old server is really there or
> > not (ie it should be doing something more like pg_ping and less like
> > just checking if the pidfile is there). But given how close we are to
> > beta, maybe the best thing is to revert that change for now and put it
> > back on the to-think-about-for-9.4 list. Peter?
>
> Are we going to unrevert this patch for 9.5?

Seems no one is thinking of restoring this patch and working on the
issue.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_ctl non-idempotent behavior change
Date: 2014-11-01 13:57:33
Message-ID: 5454E6CD.2070909@gmx.net

On 10/11/14 6:54 PM, Bruce Momjian wrote:
>> Are we going to unrevert this patch for 9.5?
> Seems no one is thinking of restoring this patch and working on the
> issue.

I had postponed work on this issue and set out to create a test
infrastructure so that all the subtle behavioral dependencies mentioned
in the thread could be expressed in code rather than prose.
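
As a rough illustration of what "expressed in code rather than prose" could look like, here is a small table-driven sketch in Python. It is not the test infrastructure referred to above; the policy function and the scenario table are hypothetical and only encode the expectations voiced in this thread: a stale pidfile must not prevent startup, while a live server should.

```python
def should_attempt_start(pidfile_present, process_alive):
    """Hypothetical policy: refuse only when a live process backs the pidfile."""
    return not (pidfile_present and process_alive)


# Each row: (pidfile_present, process_alive, expected_attempt, scenario)
CASES = [
    (False, False, True, "clean shutdown, no pidfile"),
    (True, False, True, "crash or power loss left a stale pidfile"),
    (True, True, False, "server is already running"),
]


def failing_scenarios():
    """Return the descriptions of all scenarios the policy gets wrong."""
    return [
        scenario
        for pidfile_present, process_alive, expected, scenario in CASES
        if should_attempt_start(pidfile_present, process_alive) != expected
    ]
```

Adding a row to the table is then enough to pin down another behavioral dependency, and a policy that refuses to start whenever the pidfile exists would fail the second row.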