Lists: | pgsql-hackerspgsql-patches |
---|
From: | Magnus Hagander <magnus(at)hagander(dot)net> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Stats processor not restarting |
Date: | 2007-03-20 11:23:50 |
Message-ID: | 20070320112350.GB24280@svr2.hagander.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers pgsql-patches |
I've noticed that if for example the autovacuum process dies (such as
with a kill -9 when testing my new shared mem implementation), only
autovac and bgwriter are restarted. The stats collector is terminated,
but not restarted. (Same goes for a regular backend, and not just
autovac)
Is there a reason for this, or is it a bug?
//Magnus
From: | Alvaro Herrera <alvherre(at)commandprompt(dot)com> |
---|---|
To: | Magnus Hagander <magnus(at)hagander(dot)net> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Stats processor not restarting |
Date: | 2007-03-20 12:48:30 |
Message-ID: | 20070320124830.GN24234@alvh.no-ip.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers pgsql-patches |
Magnus Hagander wrote:
> I've noticed that if for example the autovacuum process dies (such as
> with a kill -9 when testing my new shared mem implementation), only
> autovac and bgwriter are restarted. The stats collector is terminated,
> but not restarted. (Same goes for a regular backend, and not just
> autovac)
>
> Is there a reason for this, or is it a bug?
I would say it is a bug, because the comments and code in ServerLoop()
and reaper() say different.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
From: | Magnus Hagander <magnus(at)hagander(dot)net> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Stats processor not restarting |
Date: | 2007-03-20 14:50:33 |
Message-ID: | 20070320145033.GE24280@svr2.hagander.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers pgsql-patches |
On Tue, Mar 20, 2007 at 08:48:30AM -0400, Alvaro Herrera wrote:
> Magnus Hagander wrote:
> > I've noticed that if for example the autovacuum process dies (such as
> > with a kill -9 when testing my new shared mem implementation), only
> > autovac and bgwriter are restarted. The stats collector is terminated,
> > but not restarted. (Same goes for a regular backend, and not just
> > autovac)
> >
> > Is there a reason for this, or is it a bug?
>
> I would say it is a bug, because the comments and code in ServerLoop()
> and reaper() say different.
Bah, sorry about the noise. It was the effect of
PGSTAT_RESTART_INTERVAL.
Do we want to add some logging when we don't restart it due to repeated
failures?
//Magnus
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alvaro Herrera <alvherre(at)commandprompt(dot)com> |
Cc: | Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Stats processor not restarting |
Date: | 2007-03-20 15:23:11 |
Message-ID: | 13948.1174404191@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers pgsql-patches |
Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Magnus Hagander wrote:
>> I've noticed that if for example the autovacuum process dies (such as
>> with a kill -9 when testing my new shared mem implementation), only
>> autovac and bgwriter are restarted. The stats collector is terminated,
>> but not restarted. (Same goes for a regular backend, and not just
>> autovac)
>>
>> Is there a reason for this, or is it a bug?
> I would say it is a bug, because the comments and code in ServerLoop()
> and reaper() say different.
There is code in pgstat_start that limits the frequency with which new
stats collectors can be spawned --- maybe your test case is hitting that?
regards, tom lane
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Magnus Hagander <magnus(at)hagander(dot)net> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Stats processor not restarting |
Date: | 2007-03-20 15:51:04 |
Message-ID: | 14392.1174405864@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers pgsql-patches |
Magnus Hagander <magnus(at)hagander(dot)net> writes:
> Bah, sorry about the noise. It was the effect of
> PGSTAT_RESTART_INTERVAL.
> Do we want to add some logging when we don't restart it due to repeated
> failures?
Not really, but maybe it would be sensible to reset last_pgstat_start_time
when doing a database-wide restart? The motivation for the timeout was
to reduce cycle wastage if pgstat crashed by itself, but when you've
deliberately SIGQUITed it, that hardly seems to apply ...
regards, tom lane
From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] Stats processor not restarting |
Date: | 2007-03-22 02:04:09 |
Message-ID: | 200703220204.l2M249127340@momjian.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers pgsql-patches |
Tom Lane wrote:
> Magnus Hagander <magnus(at)hagander(dot)net> writes:
> > Bah, sorry about the noise. It was the effect of
> > PGSTAT_RESTART_INTERVAL.
> > Do we want to add some logging when we don't restart it due to repeated
> > failures?
>
> Not really, but maybe it would be sensible to reset last_pgstat_start_time
> when doing a database-wide restart? The motivation for the timeout was
> to reduce cycle wastage if pgstat crashed by itself, but when you've
> deliberately SIGQUITed it, that hardly seems to apply ...
You mean like this, attached?
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Attachment | Content-Type | Size |
---|---|---|
unknown_filename | text/plain | 571 bytes |
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] Stats processor not restarting |
Date: | 2007-03-22 04:33:43 |
Message-ID: | 25538.1174538023@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers pgsql-patches |
Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Tom Lane wrote:
>> Not really, but maybe it would be sensible to reset last_pgstat_start_time
>> when doing a database-wide restart?
> You mean like this, attached?
I'd be fairly surprised if resetting the variable in the child process
had any effect on the postmaster...
I think a correct fix would involve exposing either the variable or a
routine to zero it from pgstat.c, and having postmaster.c do that at
the points where it intentionally SIGQUITs the stats collector.
regards, tom lane
From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] Stats processor not restarting |
Date: | 2007-03-22 14:41:46 |
Message-ID: | 200703221441.l2MEfkf05255@momjian.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers pgsql-patches |
Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Tom Lane wrote:
> >> Not really, but maybe it would be sensible to reset last_pgstat_start_time
> >> when doing a database-wide restart?
>
> > You mean like this, attached?
>
> I'd be fairly surprised if resetting the variable in the child process
> had any effect on the postmaster...
Yep, me too. ;-)
> I think a correct fix would involve exposing either the variable or a
> routine to zero it from pgstat.c, and having postmaster.c do that at
> the points where it intentionally SIGQUITs the stats collector.
OK, patch attached. I just reset the value in all places where we were
killing the pgstat process and not immediately exiting ourselves.
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Attachment | Content-Type | Size |
---|---|---|
unknown_filename | text/plain | 3.1 KB |
From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] Stats processor not restarting |
Date: | 2007-03-22 19:53:44 |
Message-ID: | 200703221953.l2MJri200875@momjian.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers pgsql-patches |
Applied. I did the case where we exit right away too, just for
consistency.
---------------------------------------------------------------------------
Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > > Tom Lane wrote:
> > >> Not really, but maybe it would be sensible to reset last_pgstat_start_time
> > >> when doing a database-wide restart?
> >
> > > You mean like this, attached?
> >
> > I'd be fairly surprised if resetting the variable in the child process
> > had any effect on the postmaster...
>
> Yep, me too. ;-)
>
> > I think a correct fix would involve exposing either the variable or a
> > routine to zero it from pgstat.c, and having postmaster.c do that at
> > the points where it intentionally SIGQUITs the stats collector.
>
> OK, patch attached. I just reset the value in all places where we were
> killing the pgstat process and not immediately exiting ourselves.
>
> --
> Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
> EnterpriseDB http://www.enterprisedb.com
>
> + If your life is a hard drive, Christ can be your backup. +
> Index: src/backend/postmaster/pgstat.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
> retrieving revision 1.149
> diff -c -c -r1.149 pgstat.c
> *** src/backend/postmaster/pgstat.c 16 Mar 2007 17:57:36 -0000 1.149
> --- src/backend/postmaster/pgstat.c 22 Mar 2007 14:40:36 -0000
> ***************
> *** 572,577 ****
> --- 572,581 ----
> return 0;
> }
>
> + void allow_immediate_pgstat_restart(void)
> + {
> + last_pgstat_start_time = 0;
> + }
>
> /* ------------------------------------------------------------
> * Public functions used by backends follow
> Index: src/backend/postmaster/postmaster.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/backend/postmaster/postmaster.c,v
> retrieving revision 1.526
> diff -c -c -r1.526 postmaster.c
> *** src/backend/postmaster/postmaster.c 7 Mar 2007 13:35:02 -0000 1.526
> --- src/backend/postmaster/postmaster.c 22 Mar 2007 14:40:37 -0000
> ***************
> *** 1896,1902 ****
> --- 1896,1905 ----
> signal_child(PgArchPID, SIGQUIT);
> /* Tell pgstat to shut down too; nothing left for it to do */
> if (PgStatPID != 0)
> + {
> signal_child(PgStatPID, SIGQUIT);
> + allow_immediate_pgstat_restart();
> + }
> /* Tell autovac launcher to shut down too */
> if (AutoVacPID != 0)
> signal_child(AutoVacPID, SIGTERM);
> ***************
> *** 1952,1958 ****
> --- 1955,1964 ----
> signal_child(PgArchPID, SIGQUIT);
> /* Tell pgstat to shut down too; nothing left for it to do */
> if (PgStatPID != 0)
> + {
> signal_child(PgStatPID, SIGQUIT);
> + allow_immediate_pgstat_restart();
> + }
> /* Tell autovac launcher to shut down too */
> if (AutoVacPID != 0)
> signal_child(AutoVacPID, SIGTERM);
> ***************
> *** 2241,2247 ****
> --- 2247,2256 ----
> signal_child(PgArchPID, SIGQUIT);
> /* Tell pgstat to shut down too; nothing left for it to do */
> if (PgStatPID != 0)
> + {
> signal_child(PgStatPID, SIGQUIT);
> + allow_immediate_pgstat_restart();
> + }
> /* Tell autovac launcher to shut down too */
> if (AutoVacPID != 0)
> signal_child(AutoVacPID, SIGTERM);
> ***************
> *** 2404,2409 ****
> --- 2413,2419 ----
> "SIGQUIT",
> (int) PgStatPID)));
> signal_child(PgStatPID, SIGQUIT);
> + allow_immediate_pgstat_restart();
> }
>
> /* We do NOT restart the syslogger */
> Index: src/include/pgstat.h
> ===================================================================
> RCS file: /cvsroot/pgsql/src/include/pgstat.h,v
> retrieving revision 1.55
> diff -c -c -r1.55 pgstat.h
> *** src/include/pgstat.h 16 Mar 2007 17:57:36 -0000 1.55
> --- src/include/pgstat.h 22 Mar 2007 14:40:38 -0000
> ***************
> *** 369,375 ****
> extern void pgstat_init(void);
> extern int pgstat_start(void);
> extern void pgstat_reset_all(void);
> !
> #ifdef EXEC_BACKEND
> extern void PgstatCollectorMain(int argc, char *argv[]);
> #endif
> --- 369,375 ----
> extern void pgstat_init(void);
> extern int pgstat_start(void);
> extern void pgstat_reset_all(void);
> ! extern void allow_immediate_pgstat_restart(void);
> #ifdef EXEC_BACKEND
> extern void PgstatCollectorMain(int argc, char *argv[]);
> #endif
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +