BUG #4522: autovacuum working send SIGUSR1 to the wrong pid

Lists: pgsql-bugs
From: "Zou Yong" <springwell(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #4522: autovacuum working send SIGUSR1 to the wrong pid
Date: 2008-11-12 08:34:34
Message-ID: 200811120834.mAC8YYE8010751@wwwmaster.postgresql.org


The following bug has been logged online:

Bug reference: 4522
Logged by: Zou Yong
Email address: springwell(at)gmail(dot)com
PostgreSQL version: 8.3.4
Operating system: Linux 2.6.24
Description: autovacuum working send SIGUSR1 to the wrong pid
Details:

I was running postgres on a Linux system with busybox. The autovacuum feature
is turned on. I noticed that the autovacuum worker sent SIGUSR1 to pid 1,
which is the init process, and this caused the system to halt.

I did some debugging and found that the root cause is that the constant
AutoVacNumSignals is not defined correctly. It should be (AutoVacRebalance +
1).
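
The thread does not quote the affected source, so the following C sketch is only an illustration of the kind of off-by-one the report describes. The `Buggy*` and `Fixed*` names and the struct layout are hypothetical, not PostgreSQL's actual definitions. It shows that when the count constant aliases the last real entry, a flag array sized by it has no slot for that entry. If a pid field happened to follow the array in memory, setting the missing flag to 1 could overwrite the pid with 1, which would be consistent with the reported signal to pid 1; that placement is an assumption.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>

/* Count as the report describes it: it aliases the last real entry,
 * so it is one too small. (Hypothetical names.) */
typedef enum
{
    BuggyForkFailed,
    BuggyRebalance,
    BuggyNumSignals = BuggyRebalance        /* evaluates to 1, not 2 */
} BuggySignal;

/* Corrected form: the count is one past the last real entry. */
typedef enum
{
    FixedForkFailed,
    FixedRebalance,
    FixedNumSignals = FixedRebalance + 1    /* evaluates to 2 */
} FixedSignal;

/* Assumed layout: a flag array sized by the count, followed directly
 * by the launcher's pid. */
typedef struct
{
    sig_atomic_t flags[BuggyNumSignals];
    pid_t        launcher_pid;
} BuggyShmem;

typedef struct
{
    sig_atomic_t flags[FixedNumSignals];
    pid_t        launcher_pid;
} FixedShmem;

/* Returns 1 if index `sig` is a valid slot in an array of `nslots`. */
static int slot_in_bounds(size_t nslots, int sig)
{
    return sig >= 0 && (size_t) sig < nslots;
}

/* Checks that hold on any conforming compiler. */
static void check_counts(void)
{
    size_t buggy_slots = sizeof(((BuggyShmem *) 0)->flags) / sizeof(sig_atomic_t);
    size_t fixed_slots = sizeof(((FixedShmem *) 0)->flags) / sizeof(sig_atomic_t);

    assert(buggy_slots == 1);
    assert(fixed_slots == 2);
    /* The buggy array has no room for the rebalance flag. */
    assert(!slot_in_bounds(buggy_slots, BuggyRebalance));
    assert(slot_in_bounds(fixed_slots, FixedRebalance));
}
```

Calling `check_counts()` from a test program exercises both variants without ever performing the out-of-bounds write itself, which would be undefined behavior.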


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Zou Yong <springwell(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4522: autovacuum working send SIGUSR1 to the wrong pid
Date: 2008-11-12 10:15:03
Message-ID: 491AACA7.9010704@enterprisedb.com
Lists: pgsql-bugs

Zou Yong wrote:
> Bug reference: 4522
> Logged by: Zou Yong
> Email address: springwell(at)gmail(dot)com
> PostgreSQL version: 8.3.4
> Operating system: Linux 2.6.24
> Description: autovacuum working send SIGUSR1 to the wrong pid
> Details:
>
> I was running postgres on a Linux with busybox. The autovacuum feature is
> turned on. I noticed that the autovacuum worker sent SIGUSR1 to pid 1 which
> is the init process and caused the system halt.

Hmm. The postgres user shouldn't have permission to halt the system,
methinks.

> I did some debugging and found that the root cause is the constant
> AutoVacNumSignals is not defined correctly. It should be (AutoVacRebalance +
> 1).

Yeah, that's clearly a bug. Fixed, thanks.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Zou Yong <springwell(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4522: autovacuum working send SIGUSR1 to the wrong pid
Date: 2008-11-12 13:27:20
Message-ID: 20081112132720.GA4535@alvh.no-ip.org
Lists: pgsql-bugs

Heikki Linnakangas escribió:
> Zou Yong wrote:

>> I was running postgres on a Linux with busybox. The autovacuum feature is
>> turned on. I noticed that the autovacuum worker sent SIGUSR1 to pid 1 which
>> is the init process and caused the system halt.
>
> Hmm. The postgres user shouldn't have permission to halt the system,
> methinks.

Yeah, the reason this hasn't ever been seen is that normally we don't
have permissions to signal init. My guess is that Zou Yong is running a
system without users where everything runs as root. This fits the fact
that it's running busybox: I guess it's an embedded system of some sort.

>> I did some debugging and found that the root cause is the constant
>> AutoVacNumSignals is not defined correctly. It should be (AutoVacRebalance +
>> 1).
>
> Yeah, that's clearly a bug. Fixed, thanks.

My fault. Thanks for the patch.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Zou Yong <springwell(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4522: autovacuum working send SIGUSR1 to the wrong pid
Date: 2008-11-12 13:48:53
Message-ID: 29698.1226497733@sss.pgh.pa.us
Lists: pgsql-bugs

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Yeah, the reason this hasn't ever been seen is that normally we don't
> have permissions to signal init.

... and that the only consequence of the failed kill() would be that the
launcher doesn't get the signal to rebalance costs after a worker
launch, which I guess doesn't have any obvious bad effects.

regards, tom lane
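
The permission point raised above can be checked without disturbing anything. The sketch below is an illustration, not code from PostgreSQL; `probe_signal_permission` and `check_self` are hypothetical helper names. It relies on the POSIX rule that `kill()` with signal 0 performs the existence and permission checks but delivers no signal. An unprivileged process probing pid 1 would normally get `EPERM`, while a process running as root would be allowed through.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Probe whether this process may signal `pid`. Signal 0 runs the usual
 * checks without delivering anything. Returns 0 if signalling is
 * permitted, otherwise the errno value set by kill() (for example
 * EPERM or ESRCH). */
static int probe_signal_permission(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 0;
    return errno;
}

/* A process is always allowed to signal itself. */
static void check_self(void)
{
    assert(probe_signal_permission(getpid()) == 0);
}
```

Calling `probe_signal_permission(1)` as the usual unprivileged postgres user would be expected to report `EPERM`, which matches the explanation for why the bug went unnoticed on ordinary installations.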