Re: Log notice that checkpoint is to be written on shutdown

Lists: pgsql-hackers
From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-02 13:21:48
Message-ID: 1412256108.5984.23.camel@hartree.muc.credativ.lan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

we have seen repeatedly that users can be confused about why PostgreSQL
is not shutting down even though they requested it. Usually, this is
because `log_checkpoints' is not enabled and the final checkpoint is
being written, delaying shutdown. As no message besides "shutting down"
is written to the server log in this case, we even had users believing
the server was hanging and pondering killing it manually.

In order to alert those users that a checkpoint is being written, I
propose to add a log message "waiting for checkpoint ..." on shutdown,
even if log_checkpoints is disabled, as this particular checkpoint might
be important information.

I've attached a trivial patch for this, should it be added to the next
commitfest?

Cheers,

Michael

--
Michael Banck
Projektleiter / Berater
Tel.: +49 (2161) 4643-171
Fax: +49 (2161) 4643-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Attachment Content-Type Size
shutdown_checkpoint_notice.patch text/x-patch 827 bytes

From: David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-02 15:17:32
Message-ID: 1412263052328-5821417.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Michael Banck-2 wrote
> Hi,
>
> we have seen repeatedly that users can be confused about why PostgreSQL
> is not shutting down even though they requested it. Usually, this is
> because `log_checkpoints' is not enabled and the final checkpoint is
> being written, delaying shutdown. As no message besides "shutting down"
> is written to the server log in this case, we even had users believing
> the server was hanging and pondering killing it manually.
>
> In order to alert those users that a checkpoint is being written, I
> propose to add a log message "waiting for checkpoint ..." on shutdown,
> even if log_checkpoints is disabled, as this particular checkpoint might
> be important information.
>
> I've attached a trivial patch for this, should it be added to the next
> commitfest?

Peeking at this provokes a couple of novice questions:

While apparently it is impossible to have a checkpoint to log at recovery
end the decision to use the bitwise-OR instead of the local "shutdown" bool
seemed odd at first...

LogCheckpointStart creates the log entry unconditionally - leaving the
caller responsible for evaluating log_checkpoints - while LogCheckpointEnd
has the log_checkpoints condition built into it. I mention this because by
the same argument advocating the logging of the checkpoint start we should
also log its end. In order to do that here, though, we have to change
log_checkpoints before calling LogCheckpointEnd. Now, since we going to
shutdown anyway that seems benign (and forecloses on any need to set it back
to the prior value) but its still ugly.

Just some thoughts...

The rationale makes perfect sense. +1

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Log-notice-that-checkpoint-is-to-be-written-on-shutdown-tp5821394p5821417.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-02 16:00:57
Message-ID: 1412265657.5984.31.camel@hartree.muc.credativ.lan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Am Donnerstag, den 02.10.2014, 08:17 -0700 schrieb David G Johnston:
> Michael Banck-2 wrote
> > I've attached a trivial patch for this, should it be added to the next
> > commitfest?
>
> Peeking at this provokes a couple of novice questions:
>
> While apparently it is impossible to have a checkpoint to log at recovery
> end the decision to use the bitwise-OR instead of the local "shutdown" bool
> seemed odd at first...

Good point, using `shutdown' is probably better. I guess that local
bool had escaped my notice when I first had a look at this a while ago.

> LogCheckpointStart creates the log entry unconditionally - leaving the
> caller responsible for evaluating log_checkpoints - while LogCheckpointEnd
> has the log_checkpoints condition built into it.

I noticed this as well. My guess would be that this is because
LogCheckpointEnd() also updates the BgWriterStats statistics, and should
do that every time, not just when the checkpoint is getting logged.
Whether or not log_checkpoint is set (and logging should happen) is
evaluated directly after that.

Some refactoring of this might be useful (or there might be a very good
reason why this was done like this), but this seems outside the scope of
this patch. AIUI, shutdown will not be further delayed after the
checkpoint is finished, so no further logging is required for the
purpose of this patch.

> The rationale makes perfect sense. +1

Cool!

I have attached an updated patch using the `shutdown' bool.

Michael

--
Michael Banck
Projektleiter / Berater
Tel.: +49 (2161) 4643-171
Fax: +49 (2161) 4643-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Attachment Content-Type Size
shutdown_checkpoint_notice_v2.patch text/x-patch 805 bytes

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-03 15:07:23
Message-ID: 20141003150723.GA7043@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Michael Banck wrote:

> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
> index 5a4dbb9..f2716ae 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -8085,10 +8085,14 @@ CreateCheckPoint(int flags)
>
> /*
> * If enabled, log checkpoint start. We postpone this until now so as not
> - * to log anything if we decided to skip the checkpoint.
> + * to log anything if we decided to skip the checkpoint. If we are during
> + * shutdown and checkpoints are not being logged, add a log message that a
> + * checkpoint is to be written and shutdown is potentially delayed.
> */
> if (log_checkpoints)
> LogCheckpointStart(flags, false);
> + else if (shutdown)
> + ereport(LOG, (errmsg("waiting for checkpoint ...")));
>
> TRACE_POSTGRESQL_CHECKPOINT_START(flags);

I think if we're going to emit log messages for shutdown checkpoints, we
ought to use the same format we already have, i.e. instead of having the
separate "waiting for checkpoint" message, just test "log_checkpoints ||
shutdown", then LogCheckpointStart. And for consistency also make sure
that the checkpoint end log line is also reported on shutdown
checkpoints regardless of log_checkpoints.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-04 18:02:36
Message-ID: 1412445756.5984.43.camel@hartree.muc.credativ.lan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Freitag, den 03.10.2014, 12:07 -0300 schrieb Alvaro Herrera:
> Michael Banck wrote:
> > diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
> > index 5a4dbb9..f2716ae 100644
> > --- a/src/backend/access/transam/xlog.c
> > +++ b/src/backend/access/transam/xlog.c
> > @@ -8085,10 +8085,14 @@ CreateCheckPoint(int flags)
> >
> > /*
> > * If enabled, log checkpoint start. We postpone this until now so as not
> > - * to log anything if we decided to skip the checkpoint.
> > + * to log anything if we decided to skip the checkpoint. If we are during
> > + * shutdown and checkpoints are not being logged, add a log message that a
> > + * checkpoint is to be written and shutdown is potentially delayed.
> > */
> > if (log_checkpoints)
> > LogCheckpointStart(flags, false);
> > + else if (shutdown)
> > + ereport(LOG, (errmsg("waiting for checkpoint ...")));
> >
> > TRACE_POSTGRESQL_CHECKPOINT_START(flags);
>
> I think if we're going to emit log messages for shutdown checkpoints, we
> ought to use the same format we already have, i.e. instead of having the
> separate "waiting for checkpoint" message, just test "log_checkpoints ||
> shutdown", then LogCheckpointStart.

I considered this at first, but the rationale is that if somebody sets
log_checkpoints = off, they probably don't want the actual checkpoint to
be logged, so just a note that a checkpoint is happening was better
(Christoph Berg pointed this out). If there is consensus that forcing
this one-time checkpoint logging is fine, I can change the patch
accordingly.

> And for consistency also make sure
> that the checkpoint end log line is also reported on shutdown
> checkpoints regardless of log_checkpoints.

I did this in an earlier draft of the patch, but AIUI this either
requires some refactoring to not evaluate log_checkpoints inside
LogCheckpointEnd(), and/or to change the function signature of
LogCheckpointEnd() to include either the `flags' bitfield or the
`shutdown' bool in order to force logging of the finished checkpoint
even if log_checkpoints is set to `off'.

Cheers,

Michael

--
Michael Banck
Projektleiter / Berater
Tel.: +49 (2161) 4643-171
Fax: +49 (2161) 4643-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-04 18:21:03
Message-ID: CAMkU=1wW6ekM=BKHRd731E2tfRwm8-fgFh+oBAVH6TL28K3TgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Oct 2, 2014 at 6:21 AM, Michael Banck <michael(dot)banck(at)credativ(dot)de>
wrote:

> Hi,
>
> we have seen repeatedly that users can be confused about why PostgreSQL
> is not shutting down even though they requested it. Usually, this is
> because `log_checkpoints' is not enabled and the final checkpoint is
> being written, delaying shutdown. As no message besides "shutting down"
> is written to the server log in this case, we even had users believing
> the server was hanging and pondering killing it manually.
>

Wouldn't a better place to write this message be the terminal from which
"pg_ctl stop" was invoked, rather than the server log file?

Cheers,

Jeff


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-04 20:05:02
Message-ID: 543052EE.9000506@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/4/14, 1:21 PM, Jeff Janes wrote:
> On Thu, Oct 2, 2014 at 6:21 AM, Michael Banck <michael(dot)banck(at)credativ(dot)de <mailto:michael(dot)banck(at)credativ(dot)de>> wrote:
>
> Hi,
>
> we have seen repeatedly that users can be confused about why PostgreSQL
> is not shutting down even though they requested it. Usually, this is
> because `log_checkpoints' is not enabled and the final checkpoint is
> being written, delaying shutdown. As no message besides "shutting down"
> is written to the server log in this case, we even had users believing
> the server was hanging and pondering killing it manually.
>
>
> Wouldn't a better place to write this message be the terminal from which "pg_ctl stop" was invoked, rather than the server log file?
>
+1

Or do both. I suspect elog( INFO, ... ) might do that.


From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-08 17:52:05
Message-ID: 1412790725.5382.21.camel@hartree.muc.credativ.lan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Am Samstag, den 04.10.2014, 15:05 -0500 schrieb Jim Nasby:
> On 10/4/14, 1:21 PM, Jeff Janes wrote:
> > On Thu, Oct 2, 2014 at 6:21 AM, Michael Banck wrote:
> > we have seen repeatedly that users can be confused about why PostgreSQL
> > is not shutting down even though they requested it. Usually, this is
> > because `log_checkpoints' is not enabled and the final checkpoint is
> > being written, delaying shutdown. As no message besides "shutting down"
> > is written to the server log in this case, we even had users believing
> > the server was hanging and pondering killing it manually.
> >
> >
>> Wouldn't a better place to write this message be the terminal from
>> which "pg_ctl stop" was invoked, rather than the server log file?

Looking at it from a DBA perspective, this would indeed be better, yes.

However, I see a few issues with that:

1. If you are using an init script (or another wrapper around pg_ctl),
you don't get any of its output it seems.

2. Having taken a quick look at pg_ctl, it seems to just kill the
postmaster on shutdown and wait for its PID file to disappear. I don't
see how it should figure out that PostgreSQL is waiting for a checkpoint
to be finished?

> Or do both. I suspect elog( INFO, ... ) might do that.

That would imply that pg_ctl receives and writes out log messages
directed at clients, which I don't think is true? Even if it was,
client_min_messages does not include an INFO level, and LOG is not being
logged to clients by default. So the first common level above the
default of both client_min_messages and log_min_messages would be
WARNING, which seems excessive to me.

As I said, I only took a quick look at pg_ctl though, so I might well be
missing something.

Michael

--
Michael Banck
Projektleiter / Berater
Tel.: +49 (2161) 4643-171
Fax: +49 (2161) 4643-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-09 12:21:18
Message-ID: CA+TgmobXQExH-idyhUtQmLqBhHdqkegRjgf_jcZjXS_Zx9sNRA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Oct 8, 2014 at 1:52 PM, Michael Banck <michael(dot)banck(at)credativ(dot)de> wrote:
> Looking at it from a DBA perspective, this would indeed be better, yes.
>
> However, I see a few issues with that:
>
> 1. If you are using an init script (or another wrapper around pg_ctl),
> you don't get any of its output it seems.
>
> 2. Having taken a quick look at pg_ctl, it seems to just kill the
> postmaster on shutdown and wait for its PID file to disappear. I don't
> see how it should figure out that PostgreSQL is waiting for a checkpoint
> to be finished?
>
>> Or do both. I suspect elog( INFO, ... ) might do that.
>
> That would imply that pg_ctl receives and writes out log messages
> directed at clients, which I don't think is true? Even if it was,
> client_min_messages does not include an INFO level, and LOG is not being
> logged to clients by default. So the first common level above the
> default of both client_min_messages and log_min_messages would be
> WARNING, which seems excessive to me.
>
> As I said, I only took a quick look at pg_ctl though, so I might well be
> missing something.

I think you're spot-on.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-09 12:24:58
Message-ID: 20141009122458.GC6724@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-10-02 15:21:48 +0200, Michael Banck wrote:
> Hi,
>
> we have seen repeatedly that users can be confused about why PostgreSQL
> is not shutting down even though they requested it. Usually, this is
> because `log_checkpoints' is not enabled and the final checkpoint is
> being written, delaying shutdown. As no message besides "shutting down"
> is written to the server log in this case, we even had users believing
> the server was hanging and pondering killing it manually.
>
> In order to alert those users that a checkpoint is being written, I
> propose to add a log message "waiting for checkpoint ..." on shutdown,
> even if log_checkpoints is disabled, as this particular checkpoint might
> be important information.
>
> I've attached a trivial patch for this, should it be added to the next
> commitfest?

How about flipping the default for log_checkpoints instead? There really
isn't a good reason for having it disabled by default.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Michael Banck <michael(dot)banck(at)credativ(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-09 13:23:22
Message-ID: 20141009132322.GM28859@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
> On 2014-10-02 15:21:48 +0200, Michael Banck wrote:
> > I've attached a trivial patch for this, should it be added to the next
> > commitfest?
>
> How about flipping the default for log_checkpoints instead? There really
> isn't a good reason for having it disabled by default.

Yeah, I agree with this- it's extremely useful information and it's
really not that verbose in general..

Thanks,

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-09 13:44:09
Message-ID: 24137.1412862249@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
>> How about flipping the default for log_checkpoints instead? There really
>> isn't a good reason for having it disabled by default.

> Yeah, I agree with this- it's extremely useful information and it's
> really not that verbose in general..

-1. Every time we've turned on default logging of routine events,
there's been pushback and it was eventually turned off again as log spam.
I guarantee you that logging checkpoints will be seen as log spam, except
by people who are actually having trouble with that behavior, which is
a small minority.

I'm not really convinced that there's a problem here that needs fixing
at all, but certainly putting log_checkpoints on by default is not an
acceptable fix.

regards, tom lane


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-09 17:12:52
Message-ID: CAMkU=1z=5qG6swPSL2zubfjArJO3Z1WeWK04i5E9swupQP0U1Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Oct 8, 2014 at 10:52 AM, Michael Banck <michael(dot)banck(at)credativ(dot)de>
wrote:

> Hi,
>
> Am Samstag, den 04.10.2014, 15:05 -0500 schrieb Jim Nasby:
> > On 10/4/14, 1:21 PM, Jeff Janes wrote:
> > > On Thu, Oct 2, 2014 at 6:21 AM, Michael Banck wrote:
> > > we have seen repeatedly that users can be confused about why
> PostgreSQL
> > > is not shutting down even though they requested it. Usually, this
> is
> > > because `log_checkpoints' is not enabled and the final checkpoint
> is
> > > being written, delaying shutdown. As no message besides "shutting
> down"
> > > is written to the server log in this case, we even had users
> believing
> > > the server was hanging and pondering killing it manually.
> > >
> > >
> >> Wouldn't a better place to write this message be the terminal from
> >> which "pg_ctl stop" was invoked, rather than the server log file?
>
> Looking at it from a DBA perspective, this would indeed be better, yes.
>
> However, I see a few issues with that:
>
> 1. If you are using an init script (or another wrapper around pg_ctl),
> you don't get any of its output it seems.
>
> 2. Having taken a quick look at pg_ctl, it seems to just kill the
> postmaster on shutdown and wait for its PID file to disappear. I don't
> see how it should figure out that PostgreSQL is waiting for a checkpoint
> to be finished?
>

It could just print out a reminder that a checkpoint will occur, depending
on what mode of shutdown was requested. I don't think this reminder has be
validated by the server itself, the intention should be enough.

Most people who don't know that a clean shutdown inherently involves a
checkpoint probably don't monitor the server log closely, either. Of
course if they use packager scripts to do the shutdown and those scripts
don't pass along the message, I guess that still doesn't help.

Cheers,

Jeff


From: Marti Raudsepp <marti(at)juffo(dot)org>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-09 17:24:04
Message-ID: CABRT9RCi-jC24gQvKb+h3Q6CvJi67sqtFoesjENQAGmnZPqQoA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Oct 2, 2014 at 4:21 PM, Michael Banck <michael(dot)banck(at)credativ(dot)de> wrote:
> we have seen repeatedly that users can be confused about why PostgreSQL
> is not shutting down even though they requested it. Usually, this is
> because `log_checkpoints' is not enabled and the final checkpoint is
> being written, delaying shutdown.

Are you convinced that that's the *actual* reason? Maybe you're
mis-attributing it.

In my experience shutdown delays are often caused by using the default
"smart" shutdown mode, which is another way of saying "completely
stupid". It waits until all connections to the server are closed --
meaning never in common situations with connection pooling or when an
admin has forgotten their psql shell open. To add insult to injury,
you can't open a new connection to invoke pg_terminate_backend() to
kill them, either.

In the case of a "restart", it can cause much longer downtime than a
fast/immediate restart would.

Sorry for the rant, but am I the only one hating that default?

Regards,
Marti


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-09 17:30:23
Message-ID: 20141009173023.GG29124@awork2.int
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-10-09 09:44:09 -0400, Tom Lane wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > * Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
> >> How about flipping the default for log_checkpoints instead? There really
> >> isn't a good reason for having it disabled by default.
>
> > Yeah, I agree with this- it's extremely useful information and it's
> > really not that verbose in general..
>
> -1. Every time we've turned on default logging of routine events,
> there's been pushback and it was eventually turned off again as log spam.
> I guarantee you that logging checkpoints will be seen as log spam, except
> by people who are actually having trouble with that behavior, which is
> a small minority.

We're talking about 2 log message per checkpoint_timeout interval
here. That's pretty darn far away from log spam. Was there really any
case of such low frequency message causing ire?

And if it's more frequent you can be happy that you see the log message
- because your config isn't appropriate for your load. The number of
times I've seen people being baffled at the bad performance because of a
inadequate checkpoint configuration, whose effect they couldn't see, is
baffling.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-09 17:39:44
Message-ID: 20141009173944.GU28859@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
> On 2014-10-09 09:44:09 -0400, Tom Lane wrote:
> > Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > > Yeah, I agree with this- it's extremely useful information and it's
> > > really not that verbose in general..
> >
> > -1. Every time we've turned on default logging of routine events,
> > there's been pushback and it was eventually turned off again as log spam.
> > I guarantee you that logging checkpoints will be seen as log spam, except
> > by people who are actually having trouble with that behavior, which is
> > a small minority.
>
> We're talking about 2 log message per checkpoint_timeout interval
> here. That's pretty darn far away from log spam. Was there really any
> case of such low frequency message causing ire?

For embedded devices and similar small-scale systems, I can see Tom's
point. At the same time, I would expect those to require sufficient
configuration that also setting log_checkpoints to 'off' wouldn't be a
huge deal.

> And if it's more frequent you can be happy that you see the log message
> - because your config isn't appropriate for your load. The number of
> times I've seen people being baffled at the bad performance because of a
> inadequate checkpoint configuration, whose effect they couldn't see, is
> baffling.

I certainly agree with this, but then, look at our default
log_line_prefix and other default log settings.. In general, our
defaults are horrible and fixing this one is really just the tip of the
iceburg..

Thanks,

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-09 18:00:51
Message-ID: 29594.1412877651@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
>> On 2014-10-09 09:44:09 -0400, Tom Lane wrote:
>>> -1. Every time we've turned on default logging of routine events,
>>> there's been pushback and it was eventually turned off again as log spam.

>> We're talking about 2 log message per checkpoint_timeout interval
>> here. That's pretty darn far away from log spam. Was there really any
>> case of such low frequency message causing ire?

> For embedded devices and similar small-scale systems, I can see Tom's
> point. At the same time, I would expect those to require sufficient
> configuration that also setting log_checkpoints to 'off' wouldn't be a
> huge deal.

Here's the problem as I see it: DBAs will be annoyed by the spam and will
turn it off. Then they'll still be confused when a shutdown takes a long
time. So this is no fix at all for the original complaint.

I'm also not entirely convinced that checkpoints have anything to do with
the complaint. Once we get a shutdown request, we're going to have to
perform a checkpoint, which we do at full speed, no delays (or at least
did so last I checked). Whether a checkpoint was already in progress is
more or less irrelevant. It's always been like that and I can't recall
anybody complaining about it. I suspect Marti is correct that the real
problem is elsewhere.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-09 18:10:14
Message-ID: 20141009181014.GW28859@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > For embedded devices and similar small-scale systems, I can see Tom's
> > point. At the same time, I would expect those to require sufficient
> > configuration that also setting log_checkpoints to 'off' wouldn't be a
> > huge deal.
>
> Here's the problem as I see it: DBAs will be annoyed by the spam and will
> turn it off. Then they'll still be confused when a shutdown takes a long
> time. So this is no fix at all for the original complaint.

I've not run into very many folks working with embedded devices, so take
this with a grain of salt, but I have *never* run into a DBA who is
running a production system who doesn't want log_checkpoints,
log_connections, log_disconnections, and a much more verbose
log_line_prefix (and more, really), so I don't buy into this argument at
all. Our default logging is no where near what logging on a production
system should be and I'd be interested to meet the DBA who disagrees
with that, because they've got some requiremeents that I've not dealt
with before.

Basically, I believe every DBA who is using PG for more than a toy setup
(or strictly development) would be pleasantly surprised to have
checkpoints logged; far too many of them don't even know the option
exists.

> I'm also not entirely convinced that checkpoints have anything to do with
> the complaint. Once we get a shutdown request, we're going to have to
> perform a checkpoint, which we do at full speed, no delays (or at least
> did so last I checked). Whether a checkpoint was already in progress is
> more or less irrelevant. It's always been like that and I can't recall
> anybody complaining about it. I suspect Marti is correct that the real
> problem is elsewhere.

This is certainly an interesting question and was asked about up-thread
also, I believe. I agree that if it wasn't slow to shut down due to a
checkpoint then logging checkpoints isn't going to help. If the issue
is that it's a 'smart' shutdown request with folks logged in, then
perhaps we should consider logging *that* fact.. "waiting to shut down
due to user connections" or some such.

Thanks,

Stephen


From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-10-16 19:31:00
Message-ID: 1413487860.12271.13.camel@hartree.muc.credativ.lan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Am Donnerstag, den 02.10.2014, 15:21 +0200 schrieb Michael Banck:
> we have seen repeatedly that users can be confused about why PostgreSQL
> is not shutting down even though they requested it. Usually, this is
> because `log_checkpoints' is not enabled and the final checkpoint is
> being written, delaying shutdown. As no message besides "shutting down"
> is written to the server log in this case, we even had users believing
> the server was hanging and pondering killing it manually.

There were some comments that this might not actually be the case and/or
that the postmaster was simply waiting for clients to disconnect due to
smart shutdown being invoked.

About the former, it might be possible to hook into the checkpoint code
and try to estimate how much I/O is to be written, and decide on some
threshold above which a warning is issued, but this looks rather fragile
so I won't try to tackle it now.

About the latter, this is a valid concern, and I decided to add a
WARNING in this case (if normal clients are connected), thus moving into
the "additional logging on shutdown" territory.

The other point was changing the default shutdown method and/or the
default for the log_checkpoints parameter. The former has not been
discussed much and the latter was contentious - I won't touch those
either. And even if the defaults are changed, old installations might
still carry the old defaults or DBAs might change them for whatever
reasons, so ISTM additional logging is rather orthogonal to that.

Finally, I am not convinced changing the pg_ctl logging is worthwhile.

I have updated the initial patch with the following: (i) the message got
changed to mention that this is a shutdown checkpoint, (ii) the end of
the shutdown checkpoint is logged as well and (iii) if normal clients
are connected during smart shutdown, a warning is logged.

I'll attach it to the next commitfest and see whether anybody likes it.

Michael

--
Michael Banck
Projektleiter / Berater
Tel.: +49 (2161) 4643-171
Fax: +49 (2161) 4643-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Attachment Content-Type Size
shutdown_logging_v3.patch text/x-patch 2.0 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Log notice that checkpoint is to be written on shutdown
Date: 2014-11-15 22:11:29
Message-ID: CA+U5nMLDXExnbj4k58NH=Fd-CxP+Hitu+-sLMH7zs3qgOBm7dw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 16 October 2014 20:31, Michael Banck <michael(dot)banck(at)credativ(dot)de> wrote:

> I'll attach it to the next commitfest and see whether anybody likes it.

Not much...

We may decide we wanted to always-log shutdown checkpoints. I'm
neutral about that, but I can see the logic. But if we did, we would
use exactly the same log message as if log_checkpoints = on. Having
two different message texts is just confusing.

I don't see the point of logging "waiting for clients to disconnect"
since we might not wait very long. We do already log the type of
shutdown received.

A few things seem worth pursuing...

* Logging a message that shows how many buffers need to be written for
a shutdown checkpoint. (We might even store some info that allows us
to estimate a time to shutdown, later). At present the "starting:
checkpoint shutdown" isn't much use. I can see we could split
CheckpointGuts() into two, so we mark the buffers first, then report
how many there are to write, then go back to do the writing and fsync
in part two.

* Introducing a new shutdown mode that is genuinely "smart". We send
all backends a signal that will disconnect them *only* if they are
idle and not in a transaction. I've never thought the current smart
mode deserved its name.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services