Re: BUG #5284: Postgres CPU 100% and worker took too long to start; cancelled... Systemdown

Lists: pgsql-bugs
From: "azuneko" <azuneko(at)hotmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #5284: Postgres CPU 100% and worker took too long to start; cancelled... Systemdown
Date: 2010-01-18 05:43:39
Message-ID: 201001180543.o0I5hdYN062790@wwwmaster.postgresql.org


The following bug has been logged online:

Bug reference: 5284
Logged by: azuneko
Email address: azuneko(at)hotmail(dot)com
PostgreSQL version: 8.3.3
Operating system: FreeBSD7.0.2
Description: Postgres CPU 100% and worker took too long to start;
cancelled... Systemdown
Details:

Hello,

I have the following 2 problems.
I would appreciate any information, such as a way to avoid them. (Or, if
these problems are already known and fixed, could you please tell me which
version I should upgrade to?)

1. CPU utilization of postgres reaches 100%.

I executed the "top" command and sometimes found that the CPU utilization of
a postgres process reached 100% or almost 100%. (This is similar to the
problem that was posted at
http://archives.free.net.ph/message/20081104.074244.6e0dbcde.ja.html.)
What might be the cause?

2. The following warning can be seen in the postgres log.
WARNING: worker took too long to start; cancelled

After this warning first appears in the log, the same warning message
seems to be repeated. If this state is left as it is, the OS freezes
before long. I guess this happens because the daemons related to
postgres (such as vacuum and autovacuum) won't release the shared memory and
keep using it exclusively. Am I correct?

These 2 problems are confirmed to happen at least under the following
conditions:
Software
-OS : FreeBSD 7.0.2
-Postgres version : 8.3.3
Hardware
-Disk configuration : RAID5 (MegaCLI)
-CPU : Xeon2.4

Thank you.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: azuneko <azuneko(at)hotmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5284: Postgres CPU 100% and worker took too long to start; cancelled... Systemdown
Date: 2010-01-18 20:33:02
Message-ID: 603c8f071001181233g6ba38c9w52e3e8d27a08726c@mail.gmail.com

You haven't really provided us with much detail here, but it kind of
sounds like your system is overloaded.

...Robert


From: yua ゅぁ <azuneko(at)hotmail(dot)com>
To: <robertmhaas(at)gmail(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #5284: Postgres CPU 100% and worker took too long to start; cancelled... Systemdown
Date: 2010-01-19 10:26:10
Message-ID: KAW101-W3357249F21DE723E6DA1CAAE650@phx.gbl


Hello.

>You haven't really provided us with much detail here, but it kind of
>sounds like your system is overloaded.
>...Robert

Thank you for your kindness.
The following packages are in use:
 mod_auth_pgsql-2.0.3_1
 postgresql-client-8.3.9_1,1
 postgresql-server-8.3.9_1
 p5-DBD-Pg-2.16.0
 php5-pdo_pgsql-5.2.12

What kind of information should I give you in addition to the list
above?



From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <robertmhaas(at)gmail(dot)com>, yua ゅぁ <azuneko(at)hotmail(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #5284: Postgres CPU 100% and worker took too long to start; cancelled... Systemdown
Date: 2010-01-19 15:37:14
Message-ID: 4B557D4C020000250002E758@gw.wicourts.gov

yua ゅぁ <azuneko(at)hotmail(dot)com> wrote:

> What kind of information should I give you

There are some good guidelines here:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

-Kevin


From: yua ゅぁ <azuneko(at)hotmail(dot)com>
To: <kevin(dot)grittner(at)wicourts(dot)gov>, <robertmhaas(at)gmail(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #5284: Postgres CPU 100% and worker took too long to start; cancelled... Systemdown
Date: 2010-01-26 00:58:19
Message-ID: KAW101-W22B60468F76175CB5A9DDBAE5E0@phx.gbl


Hello Kevin,

Thank you for the good guidelines.

Here are my answers to the guideline questions.

Q1 postgres Version

A1 PostgreSQL 8.3.3 on i386-portbld-freebsd7.0, compiled by GCC cc (GCC)
4.2.1 20070719 [FreeBSD]

Installed from ports.

Q2
A2 x

Q3 Query
A3 x

Q4 Query
A4 x

Q5 Error Message
A5
postgres[681]: [506-1] WARNING: worker took too long to start; cancelled
Nov 12 11:15:12 kddi-nwmgr01 postgres[681]: [507-1] WARNING: worker took too long to start; cancelled

ps -auxeww
---------+---------+---------+---------+---------+---------+--------
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
pgsql 682 99.0 0.1 9336 2740 ?? Rs 27Nov08 343573:19.27 USER=pgsql

Q6 PostgreSQL Programs
A6 php5-pdo_pgsql-5.2.12
p5-DBD-Pg-2.16.0

Q8 OS Version
A8 FreeBSD 7.0-RELEASE-p2

Q9 Hardware Info
A9
CPU : Intel, Xeon2.4
RAM : 2GB
Storage
RAID Card : LSI MegaRAID
Battery Cache : YES
write-back Cache : NO
Software RAID : NO ( Hardware RAID)
SAN : NO
Disk : 7,200rpm SATA 3slot
Disk : RAID5 3slot
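
To put the ps output in A5 in perspective, the TIME column can be converted to days. This is only a minimal sketch: it assumes FreeBSD ps prints accumulated CPU time as minutes:seconds.hundredths, and the helper name cpu_time_to_seconds is made up for illustration.

```python
def cpu_time_to_seconds(field: str) -> float:
    """Convert a ps TIME field to seconds.

    Assumption: the field has the form 'MINUTES:SECONDS.hundredths',
    as in the value '343573:19.27' reported above.
    """
    minutes, seconds = field.split(":")
    return int(minutes) * 60 + float(seconds)


# TIME value of PID 682 from the ps output in A5.
total_seconds = cpu_time_to_seconds("343573:19.27")
days = total_seconds / 86400  # 86400 seconds per day

print(f"{days:.1f} days of accumulated CPU time")
```

Under that assumption, PID 682 has accumulated roughly 238.6 days of CPU time since it was started on 27Nov08, so the process has been busy for most of its lifetime rather than only occasionally.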



From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <robertmhaas(at)gmail(dot)com>, yua ゅぁ <azuneko(at)hotmail(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #5284: Postgres CPU 100% and worker took too long to start; cancelled... Systemdown
Date: 2010-01-26 14:55:10
Message-ID: 4B5EADEE020000250002EC07@gw.wicourts.gov

yua ゅぁ <azuneko(at)hotmail(dot)com> wrote:

> PostgreSQL 8.3.3 on i386-portbld-freebsd7.0, compiled by GCC cc (GCC)
> 4.2.1 20070719 [FreeBSD]

> postgres[681]: [506-1] WARNING: worker took too long to start; cancelled
> Nov 12 11:15:12 kddi-nwmgr01 postgres[681]: [507-1] WARNING: worker took too long to start; cancelled

> ps -auxeww
> ---------+---------+---------+---------+---------+---------+--------
> USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
> pgsql 682 99.0 0.1 9336 2740 ?? Rs 27Nov08 343573:19.27 USER=pgsql

> Q6 PostgreSQL Programs
> A6 php5-pdo_pgsql-5.2.12
> p5-DBD-Pg-2.16.0

> Q8 OS Version
> A8 FreeBSD 7.0-RELEASE-p2

> CPU : Intel, Xeon2.4
> RAM : 2GB
> Storage
> RAID Card : LSI MegaRAID
> Battery Cache : YES
> write-back Cache : NO
> Software RAID : NO ( Hardware RAID)
> SAN : NO
> Disk : 7,200rpm SATA 3slot
> Disk : RAID5 3slot

This is starting to sound like some other reports from FreeBSD.

http://archives.postgresql.org/pgsql-general/2008-06/msg00934.php

http://archives.postgresql.org/pgsql-general/2010-01/msg01076.php

Unfortunately, the other posters didn't post back with information
on resolution of the issue. Could you read Tom's advice and report
back?:

http://archives.postgresql.org/pgsql-general/2010-01/msg01079.php

-Kevin


From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: yua ゅぁ <azuneko(at)hotmail(dot)com>
Cc: kevin(dot)grittner(at)wicourts(dot)gov, robertmhaas(at)gmail(dot)com, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5284: Postgres CPU 100% and worker took too long to start; cancelled... Systemdown
Date: 2010-01-26 23:38:05
Message-ID: 4B5F7CDD.8050008@postnewspapers.com.au

On 26/01/2010 8:58 AM, yua ゅぁ wrote:

> RAID Card : LSI MegaRAID
> Battery Cache : YES
> write-back Cache : NO
> Software RAID : NO ( Hardware RAID)

The LSI MegaRAID series are, AFAIK, software raid implementations. The
hardware has some BIOS hooks to enable boot-loading, then the OS loads a
driver that does all the real work for the RAID implementation.

... though a quick Google search suggests they may be using that brand
for real RAID hardware too, so without specifying the model number it's
hard to know what your RAID hardware is. The fact that your card has a
BBU tends to confirm they're making real RAID hardware under that name.

It probably doesn't make much difference in this particular case,
though, as disk I/O is unlikely to be part of your issue.

> SAN : NO
> Disk : 7,200rpm SATA 3slot
> Disk : RAID5 3slot

Again it's not the cause of the problem you report, but: Most databases,
and certainly PostgreSQL, perform poorly on RAID 5. In particular,
PostgreSQL really doesn't like having the WAL stored on RAID 5, but
really you're much better off using RAID 10 for all your
database-related storage if you can.
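
A rough way to see why is the conventional small-write penalty: a small random write on RAID 5 typically costs four disk I/Os (read old data, read old parity, write new data, write new parity), while RAID 10 costs two (one write per mirror side). The sketch below is only back-of-the-envelope arithmetic. The disk count and the per-disk IOPS figure are hypothetical, the function name is made up, and the model ignores controller caching and full-stripe writes.

```python
# Disk I/Os needed per small random logical write (conventional rule of thumb).
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def random_write_iops(level: str, disks: int, iops_per_disk: float) -> float:
    """Estimate sustained random-write IOPS for an array.

    Simplified model: total raw IOPS divided by the write penalty.
    """
    return disks * iops_per_disk / WRITE_PENALTY[level]


# Hypothetical comparison: 4 disks at 80 IOPS each (an assumed figure
# for a 7,200rpm SATA drive, not a measurement from this system).
raid5_iops = random_write_iops("raid5", disks=4, iops_per_disk=80)
raid10_iops = random_write_iops("raid10", disks=4, iops_per_disk=80)

print(f"RAID 5:  {raid5_iops:.0f} random write IOPS")
print(f"RAID 10: {raid10_iops:.0f} random write IOPS")
```

With these assumed numbers the RAID 10 array sustains about twice the random-write rate of the RAID 5 array. Note that RAID 10 needs an even number of disks (at least four), so the three-disk setup described above could not be converted as-is.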

--
Craig Ringer