Quick Links

Re: Are indexes blown?

Lists:	pgsql-general

From:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
To:	"Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject:	Are indexes blown?
Date:	2008-02-15 10:55:18
Message-ID:	e373d31e0802150255m1dc208d2o219876e91c11de33@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

Hi,

I have an index on the user_id field in the query below:

myuser=# delete from clients where user_id like '64.22.91.%';
DELETE 22
Time: 220324.975 ms

Is there any reason why it's taking 220 seconds to run this simple
query? There are about 3 million rows in this table.

How can I debug this? How can I check if the index is bloated or
blown? From the VACUUM ANALYZE output, nothing like this is apparent.

Thanks.

From:	"Shoaib Mir" <shoaibmir(at)gmail(dot)com>
To:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	"Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 11:05:13
Message-ID:	bf54be870802150305o111cfd9en9eb9bb1dea33e131@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On Fri, Feb 15, 2008 at 3:55 PM, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
wrote:

> Hi,
>
> I have an index on the user_id field in the query below:
>
> myuser=# delete from clients where user_id like '64.22.91.%';
> DELETE 22
> Time: 220324.975 ms
>
> Is there any reason why it's taking 220 seconds to run this simple
> query? There are about 3 million rows in this table.
>
>
Use the *'pgstattuple'* contrib module -->
http://www.postgresql.org/docs/current/static/pgstattuple.html

*pgstatindex* function from the contrib module should be able to help you
there.

--
Shoaib Mir
Fujitsu Australia Software Technology
shoaibm[(at)]fast(dot)fujitsu(dot)com(dot)au

From:	Richard Huxton <dev(at)archonet(dot)com>
To:	Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	Postgres General <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 11:20:10
Message-ID:	47B5756A.50105@archonet.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

Phoenix Kiula wrote:
> Hi,
>
> I have an index on the user_id field in the query below:
>
> myuser=# delete from clients where user_id like '64.22.91.%';
> DELETE 22
> Time: 220324.975 ms
>
> Is there any reason why it's taking 220 seconds to run this simple
> query? There are about 3 million rows in this table.

First guess is that it's not using the index. What does
EXPLAIN ANALYSE SELECT user_id FROM clients WHERE user_id LIKE '...'
show?

Check the list archives for locale and like and text_pattern_ops too -
that's a good place to check.

--
Richard Huxton
Archonet Ltd

From:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
To:	"Shoaib Mir" <shoaibmir(at)gmail(dot)com>
Cc:	"Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 12:18:30
Message-ID:	e373d31e0802150418k59d422d7re0f82cc5eec92f9c@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On 15/02/2008, Shoaib Mir <shoaibmir(at)gmail(dot)com> wrote:
>
>
> On Fri, Feb 15, 2008 at 3:55 PM, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
> wrote:
> > Hi,
> >
> > I have an index on the user_id field in the query below:
> >
> > myuser=# delete from clients where user_id like '64.22.91.%';
> > DELETE 22
> > Time: 220324.975 ms
> >
> > Is there any reason why it's taking 220 seconds to run this simple
> > query? There are about 3 million rows in this table.
> >
> >
>
> Use the 'pgstattuple' contrib module -->
> http://www.postgresql.org/docs/current/static/pgstattuple.html
>
> pgstatindex function from the contrib module should be able to help you
> there.

How should I install a "contrib" without bringing down my database, or
stopping it, or doing ANYTHING to it? It's in production. I can't
touch it. Will it be installed on the side and then I simply start
using it?

From:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
To:	"Richard Huxton" <dev(at)archonet(dot)com>
Cc:	"Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 12:26:23
Message-ID:	e373d31e0802150426o7b2dce6dp70dd541783275b87@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On 15/02/2008, Richard Huxton <dev(at)archonet(dot)com> wrote:

> First guess is that it's not using the index. What does
> EXPLAIN ANALYSE SELECT user_id FROM clients WHERE user_id LIKE '...'
> show?
>
> Check the list archives for locale and like and text_pattern_ops too -
> that's a good place to check.

There is nothing to do with locale. The same database has been working
just fine for 2 years. Why should this be an issue now?

When I ran the EXPLAIN SELECT, the database was hanging. Or taking too
much time (waiting for 5 minutes), or whatever. I cancelled it.

That's the problem. It works, then it doesn't. Then it works again. I
am guessing it could be the load, but there's nothing new in terms of
load that should be causing this!

From:	Richard Huxton <dev(at)archonet(dot)com>
To:	Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	Postgres General <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 12:40:38
Message-ID:	47B58846.5050102@archonet.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

Phoenix Kiula wrote:
> On 15/02/2008, Richard Huxton <dev(at)archonet(dot)com> wrote:
>
>> First guess is that it's not using the index. What does
>> EXPLAIN ANALYSE SELECT user_id FROM clients WHERE user_id LIKE '...'
>> show?
>>
>> Check the list archives for locale and like and text_pattern_ops too -
>> that's a good place to check.
>
>
> There is nothing to do with locale. The same database has been working
> just fine for 2 years. Why should this be an issue now?

No reason, but you hadn't said this was a change in behaviour, just that
it seemed slow.

> When I ran the EXPLAIN SELECT, the database was hanging. Or taking too
> much time (waiting for 5 minutes), or whatever. I cancelled it.
>
> That's the problem. It works, then it doesn't. Then it works again. I
> am guessing it could be the load, but there's nothing new in terms of
> load that should be causing this!

Ah, more new information! This does seem to point to the load,
particularly if it's exactly the same query each time. So what do
top/vmstat etc show for these "go-slow" periods?

--
Richard Huxton
Archonet Ltd

From:	"Shoaib Mir" <shoaibmir(at)gmail(dot)com>
To:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	"Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 12:59:28
Message-ID:	bf54be870802150459l1717f48fnc57f63f7937c9f81@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On Fri, Feb 15, 2008 at 5:18 PM, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
wrote:

>
>
> How should I install a "contrib" without bringing down my database, or
> stopping it, or doing ANYTHING to it? It's in production. I can't
> touch it. Will it be installed on the side and then I simply start
> using it?
>

You do not need to restart the database server for that purpose as all you
need is the pgstattuple.so file copied to <PG-HOME>/lib folder.

Do the following (in case you have installed server from source):

- Go to the <PostgreSQL-source>/contrib/pgstattuple folder
- run make and make install (this will copy pgstattuple.so file to the lib
folder of your PostgreSQL installation)
- Now from psql execute the pgstattuple.sql file for that specific database
which can be found in <PG-HOME>/share/contrib folder
- Once the sql file is executed now you can use the pgstattuple function

--
Shoaib Mir
Fujitsu Australia Software Technology
shoaibm[(at)]fast(dot)fujitsu(dot)com(dot)au

From:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
To:	"Richard Huxton" <dev(at)archonet(dot)com>
Cc:	"Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 13:09:32
Message-ID:	e373d31e0802150509h47e7c6a2x60299af6679cde03@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

Actually my host has just told me that I have a number of "hung
semaphores" in my server. And he is relating them to postgresql. I am
not surprised, because this is the only utility that has issues. All
the rest is working (apache, mysql, exim, etc). Any thoughts on where
I should start looking for hung semaphores?

From:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
To:	"Shoaib Mir" <shoaibmir(at)gmail(dot)com>
Cc:	"Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 13:13:45
Message-ID:	e373d31e0802150513k47095d27h63adefa8a5cab955@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On 15/02/2008, Shoaib Mir <shoaibmir(at)gmail(dot)com> wrote:
>
> You do not need to restart the database server for that purpose as all you
> need is the pgstattuple.so file copied to <PG-HOME>/lib folder.
>
> Do the following (in case you have installed server from source):
>
> - Go to the <PostgreSQL-source>/contrib/pgstattuple folder
> - run make and make install (this will copy pgstattuple.so file to the lib
> folder of your PostgreSQL installation)
> - Now from psql execute the pgstattuple.sql file for that specific database
> which can be found in <PG-HOME>/share/contrib folder
> - Once the sql file is executed now you can use the pgstattuple function

Thanks. But I had installed from rpm. Can I just download that .so
file and put in the lib folder for pgsql and then start using it?

From:	"Shoaib Mir" <shoaibmir(at)gmail(dot)com>
To:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	"Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 13:17:19
Message-ID:	bf54be870802150517q2dce6219kd5633ffb99e49d8b@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On Fri, Feb 15, 2008 at 6:13 PM, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
wrote:

>
>
> Thanks. But I had installed from rpm. Can I just download that .so
> file and put in the lib folder for pgsql and then start using it?
>

Well I would say download the source for the same version you have, copy it
to your desktop machine, build it and then build the .so file for contrib
module using 'make' and 'make install'.... once that is done copy the .so
from lib folder of PG to your production PG box's lib folder.

--
Shoaib Mir
Fujitsu Australia Software Technology
shoaibm[(at)]fast(dot)fujitsu(dot)com(dot)au

From:	Richard Huxton <dev(at)archonet(dot)com>
To:	Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 13:41:18
Message-ID:	47B5967E.7030408@archonet.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

Phoenix Kiula wrote:
> On 15/02/2008, Richard Huxton <dev(at)archonet(dot)com> wrote:
>
>> Ah, more new information! This does seem to point to the load,
>> particularly if it's exactly the same query each time. So what do
>> top/vmstat etc show for these "go-slow" periods?
>
> In included top and vmstat info in my other post yesterday, but here
> it is again:

Ah, you had a post yesterday!

(goes away, searches for previous post)
http://archives.postgresql.org/pgsql-general/2008-02/msg00689.php
PG quitting sporadically!!

Right, OK. Firstly, stop worrying about index usage and/or bloat. You
have unexplained process crashes to deal with first. There's no point in
looking at indexes until you figure out what is killing your processes.

Secondly, a single line from vmstat isn't useful, you want to compare
what is happening when things are fine with when they aren't. Leave
vmstat 10 logging to a file so you can catch it.

Thirdly, have you upgraded to the latest 8.2 (8.2.6) yet?

I see you've reduced work_mem, that's good.

Oh, you might as well lower max_connections from 150 too, there's no way
you can support that many concurrent queries anyway.

The fact that you're seeing various strange socket-related problems is
odd. As is the fact that logging doesn't seem to work for you.

Are you sure the two sets of vmstat/top figures are from when PG was
crashing/running queries slow? Everything seems idle to me in those figures.

--
Richard Huxton
Archonet Ltd

From:	Harald Fuchs <hari(dot)fuchs(at)googlemail(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Are indexes blown?
Date:	2008-02-15 13:56:50
Message-ID:	puy79me319.fsf@srv.protecting.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

In article <bf54be870802150517q2dce6219kd5633ffb99e49d8b(at)mail(dot)gmail(dot)com>,
"Shoaib Mir" <shoaibmir(at)gmail(dot)com> writes:

> On Fri, Feb 15, 2008 at 6:13 PM, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> wrote:

> Thanks. But I had installed from rpm. Can I just download that .so
> file and put in the lib folder for pgsql and then start using it?

> Well I would say download the source for the same version you have, copy it to
> your desktop machine, build it and then build the .so file for contrib module
> using 'make' and 'make install'.... once that is done copy the .so from lib
> folder of PG to your production PG box's lib folder.

But you have to ensure that you build PostgreSQL on your desktop
machine in exactly the same way as the RPM got built
(integer_datetimes etc).

From:	"Douglas McNaught" <doug(at)mcnaught(dot)org>
To:	"Harald Fuchs" <hari(dot)fuchs(at)googlemail(dot)com>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Are indexes blown?
Date:	2008-02-15 14:07:37
Message-ID:	5ded07e00802150607s1088657ai9a952d61553db9be@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On 2/15/08, Harald Fuchs <hari(dot)fuchs(at)googlemail(dot)com> wrote:

> But you have to ensure that you build PostgreSQL on your desktop
> machine in exactly the same way as the RPM got built
> (integer_datetimes etc).

It'd probably be much easier to just install the -contrib RPM. :)
--
-Doug

From:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
To:	"Richard Huxton" <dev(at)archonet(dot)com>
Cc:	"PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 14:36:20
Message-ID:	e373d31e0802150636r7226b0aah57c02fc060c66601@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On 15/02/2008, Richard Huxton <dev(at)archonet(dot)com> wrote:
> Phoenix Kiula wrote:
> > On 15/02/2008, Richard Huxton <dev(at)archonet(dot)com> wrote:
> >
> >> Ah, more new information! This does seem to point to the load,
> >> particularly if it's exactly the same query each time. So what do
> >> top/vmstat etc show for these "go-slow" periods?
> >
> > In included top and vmstat info in my other post yesterday, but here
> > it is again:
>
>
> Ah, you had a post yesterday!
>
> (goes away, searches for previous post)
> http://archives.postgresql.org/pgsql-general/2008-02/msg00689.php
> PG quitting sporadically!!
>
> Right, OK. Firstly, stop worrying about index usage and/or bloat. You
> have unexplained process crashes to deal with first. There's no point in
> looking at indexes until you figure out what is killing your processes.
>
> Secondly, a single line from vmstat isn't useful, you want to compare
> what is happening when things are fine with when they aren't. Leave
> vmstat 10 logging to a file so you can catch it.
>
> Thirdly, have you upgraded to the latest 8.2 (8.2.6) yet?
>
> I see you've reduced work_mem, that's good.
>
> Oh, you might as well lower max_connections from 150 too, there's no way
> you can support that many concurrent queries anyway.
>
>
> The fact that you're seeing various strange socket-related problems is
> odd. As is the fact that logging doesn't seem to work for you.
>
> Are you sure the two sets of vmstat/top figures are from when PG was
> crashing/running queries slow? Everything seems idle to me in those figures.

No. They are the vmstat figures from when I was replying to your
email. What will vmstat tell me and how should I set it up to do
"vmstat 10 logging"?

Btw, postgresql logging is working. But here're the kind of things I
have in there:

LOG: test message did not get through on socket for statistics collector
LOG: disabling statistics collector for lack of working socket
LOG: database system was shut down at 2008-02-15 06:12:10 CST
LOG: checkpoint record is at 8/E785304C
LOG: redo record is at 8/E785304C; undo record is at 0/0; shutdown TRUE
LOG: next transaction ID: 0/296892698; next OID: 97929
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system is ready

LOG: could not receive data from client: Connection reset by peer
LOG: unexpected EOF on client connection
LOG: could not receive data from client: Connection reset by peer
LOG: unexpected EOF on client connection

Now I don't know what is wrong or even where I should look. Postgresql
is often taking quite a bit of memory and CPU resources.

I've reduced work_mem to 10MB and Max_connections to 100. (Anyway, the
old values were working just fine until recently!)

The biggest problem: when I su into postgres user and do a psql to get
into the PG console in my SSH, it takes a whole lot of time to come
up! It used to come up in a jiffy earlier!!! It now shows me this
error:

~ >
psql: could not connect to server: Connection timed out
Is the server running on host "localhost" and accepting
TCP/IP connections on port 5432?

Then, five minutes later, I can connect again! In less than a second!
What gives?

Finally, very simple queries like this one:

select url, disable_in_statistics, id, user_known from links where
alias = '1yqw7' and status = 'Y' limit 1

Which used to be server in "5 ms" (0.005 seconds) are now taking
upwards of 200 seconds! Your suggestion to "Explain Analyze" --

=# explain analyze select url, disable_in_statistics, id, user_known
from links where alias = '1yqw7' and status = 'Y' limit 1 ;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..8.74 rows=1 width=113) (actual time=9.639..9.643
rows=1 loops=1)
-> Index Scan using links2_alias_key on links (cost=0.00..8.74
rows=1 width=113) (actual time=9.630..9.630 rows=1 loops=1)
Index Cond: ((alias)::text = '1yqw7'::text)
Filter: (status = 'Y'::bpchar)
Total runtime: 16.425 ms
(5 rows)

Now this is only when I have connected to the psql console, of course.
Still, these queries are intermittently very slow!

From:	"Douglas McNaught" <doug(at)mcnaught(dot)org>
To:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	"Richard Huxton" <dev(at)archonet(dot)com>, "PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 14:41:10
Message-ID:	5ded07e00802150641q3f9243a1gbc90e6d4bc207330@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On 2/15/08, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> wrote:

This means your client processes are dying or getting killed (possibly
due to memory shortages?). Are these running on the same machine as
Postgres? Are there any logs you can look at to see what might be
going wrong? If this is Linux, are there any OOM-killer messages in
the syslogs?

-Doug

From:	"Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	"Richard Huxton" <dev(at)archonet(dot)com>, "PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 15:45:20
Message-ID:	dcc563d10802150745g70e3ca11i55b894af5d42200d@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On Fri, Feb 15, 2008 at 8:36 AM, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> wrote:

> No. They are the vmstat figures from when I was replying to your
> email. What will vmstat tell me and how should I set it up to do
> "vmstat 10 logging"?

Something like

vmstat 10 > vmstat.log

> LOG: could not receive data from client: Connection reset by peer
> LOG: unexpected EOF on client connection
> LOG: could not receive data from client: Connection reset by peer
> LOG: unexpected EOF on client connection
>
> Now I don't know what is wrong or even where I should look. Postgresql
> is often taking quite a bit of memory and CPU resources.
>
> I've reduced work_mem to 10MB and Max_connections to 100. (Anyway, the
> old values were working just fine until recently!)
>
> The biggest problem: when I su into postgres user and do a psql to get
> into the PG console in my SSH, it takes a whole lot of time to come
> up! It used to come up in a jiffy earlier!!! It now shows me this
> error:

How many pgsql processes are there when this happens? Try something like

ps axu|grep postgres

to see. use

ps axu|grep postgres|wc -l

to get a rough count. I'm guessing that your web service layer is
keeping old connections open. could be something as ugly as php's
pg_pconnect or a buggy jdbc driver, etc...

From:	Richard Huxton <dev(at)archonet(dot)com>
To:	Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-15 15:54:20
Message-ID:	47B5B5AC.8050608@archonet.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

Phoenix Kiula wrote:
> On 15/02/2008, Richard Huxton <dev(at)archonet(dot)com> wrote:
>>
>> Are you sure the two sets of vmstat/top figures are from when PG was
>> crashing/running queries slow? Everything seems idle to me in those figures.
>
> No. They are the vmstat figures from when I was replying to your
> email. What will vmstat tell me and how should I set it up to do
> "vmstat 10 logging"?

I'd write a small script and call it e.g. "trackusage.sh" and save it in
/tmp/

#!/bin/sh
while (/bin/true)
do
date >> /tmp/vmstat_figures.txt
vmstat 10 60 >> /tmp/vmstat_figures.txt
done

Then, set the execute flag on it and do something like:
nohup /tmp/trackusage.sh &

That should run even when you disconnect (don't forget to kill it once
this is fixed). It will log a timestamp every 10 minutes and vmstat
activity between.

[snip logging fragment]
> Now I don't know what is wrong or even where I should look. Postgresql
> is often taking quite a bit of memory and CPU resources.

Just checking - this is a real machine and not a virtual one, isn't it?

> I've reduced work_mem to 10MB and Max_connections to 100. (Anyway, the
> old values were working just fine until recently!)
>
> The biggest problem: when I su into postgres user and do a psql to get
> into the PG console in my SSH, it takes a whole lot of time to come
> up! It used to come up in a jiffy earlier!!! It now shows me this
> error:
>
> ~ >
> psql: could not connect to server: Connection timed out
> Is the server running on host "localhost" and accepting
> TCP/IP connections on port 5432?
>
> Then, five minutes later, I can connect again! In less than a second!
> What gives?

Hopefully vmstat will show us.

> Finally, very simple queries like this one:
>
> select url, disable_in_statistics, id, user_known from links where
> alias = '1yqw7' and status = 'Y' limit 1
>
> Which used to be server in "5 ms" (0.005 seconds) are now taking
> upwards of 200 seconds!

Same symptom. I'd have guessed the machine is running out of memory and
swapping, but the vmstat/top stuff all look fine.

> Your suggestion to "Explain Analyze" --
>
> =# explain analyze select url, disable_in_statistics, id, user_known
> from links where alias = '1yqw7' and status = 'Y' limit 1 ;
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.00..8.74 rows=1 width=113) (actual time=9.639..9.643
> rows=1 loops=1)
> -> Index Scan using links2_alias_key on links (cost=0.00..8.74
> rows=1 width=113) (actual time=9.630..9.630 rows=1 loops=1)
> Index Cond: ((alias)::text = '1yqw7'::text)
> Filter: (status = 'Y'::bpchar)
> Total runtime: 16.425 ms

Fine - it's nothing to do with the planner, indexes or anything else.
This is system-related, and vmstat should point us in the right direction.
--
Richard Huxton
Archonet Ltd

From:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
To:	"Richard Huxton" <dev(at)archonet(dot)com>
Cc:	"PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-16 02:38:45
Message-ID:	e373d31e0802151838s797bcd02o159fdf4307062ece@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On 15/02/2008, Richard Huxton <dev(at)archonet(dot)com> wrote:
> Phoenix Kiula wrote:
> > On 15/02/2008, Richard Huxton <dev(at)archonet(dot)com> wrote:
> >>
>
> >> Are you sure the two sets of vmstat/top figures are from when PG was
> >> crashing/running queries slow? Everything seems idle to me in those figures.
> >
> > No. They are the vmstat figures from when I was replying to your
> > email. What will vmstat tell me and how should I set it up to do
> > "vmstat 10 logging"?
>
>
> I'd write a small script and call it e.g. "trackusage.sh" and save it in
> /tmp/
>
> #!/bin/sh
> while (/bin/true)
> do
> date >> /tmp/vmstat_figures.txt
> vmstat 10 60 >> /tmp/vmstat_figures.txt
> done
>
> Then, set the execute flag on it and do something like:
> nohup /tmp/trackusage.sh &
>

Thanks Richard!

The script you suggested doesn't work:

tmp > ./trackusage.sh
-bash: ./trackusage.sh: /bin/sh: bad interpreter: Permission denied

Anyway, I did the vmstat command. I was running it while the system
was ok, then not ok, then ok...and so on. So I hope these numbers have
captured what the issue is:

tmp > vmstat 10 60
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 3380 331140 114344 2992304 0 0 31 34 20 73 2 1 93 3
0 0 3380 414412 114352 2992296 0 0 0 100 1105 286 1 1 96 2
0 0 3380 430356 114380 2992268 0 0 0 133 1103 280 1 1 95 3
0 0 3380 418988 114392 2992256 0 0 2 120 1098 277 1 2 93 4
0 0 3380 347996 114408 2992240 0 0 0 85 1081 134 1 0 97 2
0 0 3380 293236 114444 2992204 0 0 0 40 1076 138 0 0 97 2
0 0 3380 252860 114456 2992192 0 0 0 80 1086 141 0 0 97 2
0 0 3380 141340 114480 2992168 0 0 2 54 1078 145 1 0 97 2
0 0 3380 119940 114504 2992144 0 0 0 31 1079 143 1 1 97 1
0 0 3380 104252 114524 2992124 0 0 0 64 1087 182 1 1 96 2
0 0 3380 90556 114560 2992088 0 0 0 59 1087 144 1 0 97 2
0 0 3380 132476 115088 2995460 0 0 52 174 1130 447 2 1 92 4
0 1 3380 280628 115124 2995684 0 0 31 220 1144 479 4 2 91 4
0 0 3380 361340 115152 2995656 0 0 0 147 1135 338 2 1 94 3
0 0 3380 382028 115180 2995628 0 0 2 113 1109 253 1 1 96 2
0 0 3380 369740 115220 2995588 0 0 3 200 1107 260 1 1 93 4
0 0 3380 323140 115248 2995560 0 0 0 60 1097 153 1 0 97 2
0 0 3380 280260 115272 2995536 0 0 0 77 1087 133 1 0 98 1
0 0 3380 200580 115296 2995512 0 0 2 65 1089 140 1 0 97 2
0 0 3380 81916 115392 2995676 0 0 17 82 1089 188 2 1 94 2
0 0 3380 16980 98072 2974256 0 0 48 122 1102 190 2 1 95 3
1 0 3380 21588 73160 2954708 0 0 86 274 1128 276 2 2 88 8
0 0 3380 52692 57860 2932048 0 0 1 128 1106 211 2 1 95 3
0 0 3380 184748 57960 2931948 0 0 6 219 1128 451 2 1 92 5
0 0 3380 342996 58016 2931892 0 0 0 140 1122 465 2 1 94 3
0 0 3380 452020 58068 2932100 0 0 1 122 1114 268 1 1 95 2
0 0 3380 478044 58132 2932036 0 0 0 106 1099 294 1 1 95 3
0 0 3380 447540 58224 2931944 0 0 1 238 1098 319 2 2 91 5
0 0 3380 392524 58284 2931884 0 0 0 71 1078 134 0 1 97 2
0 0 3380 299684 58340 2931828 0 0 1 88 1079 150 1 0 97 2
0 0 3380 231652 58388 2931780 0 0 0 40 1076 135 1 1 97 1
0 0 3380 139012 58432 2931736 0 0 0 42 1076 145 1 0 97 2
0 0 3380 117884 58472 2931696 0 0 1 67 1092 151 1 0 96 2
0 0 3380 129460 58528 2931640 0 0 0 59 1097 190 1 1 96 2
0 0 3380 179892 58584 2931584 0 0 0 42 1100 158 1 1 97 2
0 0 3380 272900 58648 2931520 0 0 0 111 1114 308 1 1 95 3
0 0 3380 399100 58704 2931724 0 0 0 132 1128 352 1 1 95 2
0 0 3380 484556 58748 2931680 0 0 0 76 1111 269 1 1 96 2
0 0 3380 501180 58804 2931884 0 0 0 93 1103 249 1 1 96 2
0 0 3380 492636 58864 2931824 0 0 0 138 1094 259 1 1 95 3
1 1 3380 428380 58912 2932036 0 0 0 44 1088 142 1 0 98 1
0 0 3380 362340 58996 2931952 0 0 1 45 1085 138 1 0 97 2
0 0 3380 292708 59072 2931876 0 0 0 71 1082 138 1 1 97 2
0 0 3380 179292 59172 2931776 0 0 0 65 1089 149 1 0 97 2
0 0 3380 127292 59236 2931712 0 0 0 38 1090 149 1 0 97 1
0 0 3380 101940 59304 2931904 0 0 22 72 1097 186 1 1 96 2
0 0 3380 134068 59340 2931868 0 0 0 74 1100 148 1 0 97 1
0 1 3380 257908 59400 2932068 0 0 0 112 1114 424 2 1 95 3
0 1 3380 399484 59460 2932008 0 0 0 96 1127 336 1 1 96 2
1 0 3380 480548 59524 2932464 0 0 1 100 1118 286 1 1 96 2
1 0 3380 497092 59580 2932408 0 0 0 120 1110 282 1 1 96 3
0 1 3380 481684 59652 2932336 0 0 0 174 1099 310 2 2 92 4
0 0 3380 416772 59692 2932296 0 0 0 49 1085 136 1 0 97 2
0 0 3380 372108 59740 2932248 0 0 1 68 1089 144 1 0 97 2
1 1 3380 307676 59808 2932180 0 0 1 75 1083 140 1 0 97 2
0 0 3380 232620 59904 2932084 0 0 4 78 1077 154 1 0 97 2
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 3380 169996 59948 2932300 0 0 0 41 1092 157 1 1 96 2
0 0 3380 171660 59996 2932252 0 0 0 135 1105 190 1 1 96 3
2 0 3380 166140 60052 2932456 0 0 0 47 1101 158 1 0 97 2
4 0 3380 269860 60104 2932404 0 0 0 103 1114 326 1 2 94 2

Please advise??

From:	Greg Smith <gsmith(at)gregsmith(dot)com>
To:	Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-16 04:13:40
Message-ID:	Pine.GSO.4.64.0802152256500.8438@westnet.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On Sat, 16 Feb 2008, Phoenix Kiula wrote:

> The script you suggested doesn't work:
> tmp > ./trackusage.sh
> -bash: ./trackusage.sh: /bin/sh: bad interpreter: Permission denied

Try changing the first line to

#!/bin/bash

> Anyway, I did the vmstat command. I was running it while the system
> was ok, then not ok, then ok...and so on. So I hope these numbers have
> captured what the issue is:
>
> tmp > vmstat 10 60
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 0 3380 323140 115248 2995560 0 0 0 60 1097 153 1 0 97 2
> 0 0 3380 280260 115272 2995536 0 0 0 77 1087 133 1 0 98 1
> 0 0 3380 200580 115296 2995512 0 0 2 65 1089 140 1 0 97 2
> 0 0 3380 81916 115392 2995676 0 0 17 82 1089 188 2 1 94 2
> 0 0 3380 16980 98072 2974256 0 0 48 122 1102 190 2 1 95 3
> 1 0 3380 21588 73160 2954708 0 0 86 274 1128 276 2 2 88 8
> 0 0 3380 52692 57860 2932048 0 0 1 128 1106 211 2 1 95 3
> 0 0 3380 184748 57960 2931948 0 0 6 219 1128 451 2 1 92 5
> 0 0 3380 342996 58016 2931892 0 0 0 140 1122 465 2 1 94 3

Looks like the worst spot was in the middle here. Something gobbled up
over 300MB of memory in 40 seconds, enough to force the OS to blow away
almost half its disk buffers just to keep working memory free. Not so bad
that it went to swap or invoked the OOM killer but enough to push the I/O
block out (bo) up. I would guess the other ugly spots were the later
portions where the bo spiked >100.

But without knowing more about what the processing using this memory and
generating the output I/O are doing it's hard to say why. That's why I
suggested you watch top with the command lines turned on for a bit, to see
what process(es) are jumping around during the bad periods.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

From:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
To:	"Greg Smith" <gsmith(at)gregsmith(dot)com>
Cc:	"PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-16 04:38:04
Message-ID:	e373d31e0802152038h2c68e8a0pc4a4be6790aaf969@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On 16/02/2008, Greg Smith <gsmith(at)gregsmith(dot)com> wrote:
> On Sat, 16 Feb 2008, Phoenix Kiula wrote:
>
> > The script you suggested doesn't work:
> > tmp > ./trackusage.sh
> > -bash: ./trackusage.sh: /bin/sh: bad interpreter: Permission denied
>
>
> Try changing the first line to
>
> #!/bin/bash

Thanks Greg. Same problem with that too. I guess my tmp folder is
secured and doesn't allow for executables? I put it in another folder
and it's working.

...snip....
> Looks like the worst spot was in the middle here. Something gobbled up
> over 300MB of memory in 40 seconds, enough to force the OS to blow away
> almost half its disk buffers just to keep working memory free. Not so bad
> that it went to swap or invoked the OOM killer but enough to push the I/O
> block out (bo) up. I would guess the other ugly spots were the later
> portions where the bo spiked >100.
>
> But without knowing more about what the processing using this memory and
> generating the output I/O are doing it's hard to say why. That's why I
> suggested you watch top with the command lines turned on for a bit, to see
> what process(es) are jumping around during the bad periods.

Happy to do that, but top keeps changing before I can copy text from
it. I think most of the connections seem to be "httpd" which is Apache
2.2.6. I checked the netstat commands and the server is not under DDOS
or anything.

My hosting provider tells me that the Postgresql server is taking up a
lot of memory but I've been running the same db with the same config
for over 2 years. Yes we have been growing but what happened in the
last 3 days to warrant a sudden spike in memory consumption??!!

Anyway, I want to go back to them with some hard data that postgresql
is NOT the one that is causing my server to load. The indexes are all
in place (and I've REINDEXed my big tables anyway) so the performance
of pg itself is not an issue.

I just don't know where to get this hard data. The top output shows
httpd on top, and sometimes postmaster, but I don't know how to
repeatedly capture it. Any suggestions?

From:	"Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	"Greg Smith" <gsmith(at)gregsmith(dot)com>, "PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-16 05:19:15
Message-ID:	dcc563d10802152119h561f87e6qe6dcd3217cc173ce@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On Feb 15, 2008 10:38 PM, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> wrote:

> My hosting provider tells me that the Postgresql server is taking up a
> lot of memory but I've been running the same db with the same config
> for over 2 years. Yes we have been growing but what happened in the
> last 3 days to warrant a sudden spike in memory consumption??!!

OK, you've repeated this a few times. Unless your load has been the
same the whole time, this statement means little. If the same config
works for 2 years at load x, but fails in 1 day at load 3x then the
problem might have been there all along, and you just weren't running
the system hard enough to find the problem.

Just because PostgreSQL is exhibiting problems doesn't mean it's all
postgresql's fault.

150 or 100 connections is a LOT for a postgresql server, but
especially so if you went from actually using 5 or 10 to using 98.
The setting's the same, but the number is use is vastly different and
will have vastly different results on how postgresql runs.

Hanging connections could EASILY cause the problem you're seeing. If
the network loses your connection from your app tier to your database,
your database might have 100 connections open doing nothing but
sitting idle in transaction holding data in memory until the
tcp_keepalive kicks in and kills them.

The earlier host connection errors point to that problem as well.

So, do you have mysteriously crashing or disappearing apache child
processes? What do the error logs for apache have to say?

Can you tell what your load was when the system worked and what it is
now by trawling through the logs or something? (apache or pgsql as
long as their equivalent for both time periods.)

If you start leaving hanging connections to the database then you are
in fact DOSing the database server. Not all DOS attacks are
intentional, and a crashing apache - php can do it even without
persistent connections.

I'd say you haven't proven where the problem is yet, and should look
at the app tier.

From:	Greg Smith <gsmith(at)gregsmith(dot)com>
To:	Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-16 05:41:55
Message-ID:	Pine.GSO.4.64.0802160033370.20075@westnet.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On Sat, 16 Feb 2008, Phoenix Kiula wrote:

> The top output shows httpd on top, and sometimes postmaster, but I don't
> know how to repeatedly capture it. Any suggestions?

Try this:

top -bc | tee topdata

That will save everything to a file called topdata while also letting you
watch it scroll by. Not as easy to catch the bad periods that way, the
output is going to be a huge data file, but you'll have a log to sort
through of everything. Control-C to get out of there when you're bored.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

From:	"Andrej Ricnik-Bay" <andrej(dot)groups(at)gmail(dot)com>
To:	"PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Cc:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>, "Greg Smith" <gsmith(at)gregsmith(dot)com>
Subject:	Re: Are indexes blown?
Date:	2008-02-16 19:47:55
Message-ID:	b35603930802161147x60e0d7bevd4b68a7b92227c67@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On 16/02/2008, Greg Smith <gsmith(at)gregsmith(dot)com> wrote:

> top -bc | tee topdata
>
> That will save everything to a file called topdata while also letting you
> watch it scroll by. Not as easy to catch the bad periods that way, the
> output is going to be a huge data file, but you'll have a log to sort
> through of everything. Control-C to get out of there when you're bored.

Or pipe through the following awk-script to make the output
less overwhelming ... save it as top.awk

BEGIN{
# the sort numerically sorts by memory usage percentage
# head filters out the top 20 contenders
command = "LC_ALL=C sort -k 10,10gr|head -20"
}
{
# read all of top into an array
line[NR]=$0
last=NR
}
END{
# print the header well-formed
for(i=1;i<8;i++){
print line[i]
}
# and do the sort & strip of the processes
for(i=8;i<last;i++){
print line[i]|& command
}
close(command, "to")
while ((command |& getline out) > 0)
print out
close(command)
}

Invoke like so
top -b -d 1 | awk -f top.awk | tee topdata

Instead of the "getting bored Ctrl-C" maybe a "-n 3600" as
extra parameter to top to get roughly one hours worth of data...

Cheers,
Andrej

From:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
To:	"Andrej Ricnik-Bay" <andrej(dot)groups(at)gmail(dot)com>
Cc:	"PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>, "Greg Smith" <gsmith(at)gregsmith(dot)com>
Subject:	Re: Are indexes blown?
Date:	2008-02-16 19:55:17
Message-ID:	e373d31e0802161155s44f35737lfabf081bab8faf1d@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On 17/02/2008, Andrej Ricnik-Bay <andrej(dot)groups(at)gmail(dot)com> wrote:
> On 16/02/2008, Greg Smith <gsmith(at)gregsmith(dot)com> wrote:
>
> > top -bc | tee topdata
> >
> > That will save everything to a file called topdata while also letting you
> > watch it scroll by. Not as easy to catch the bad periods that way, the
> > output is going to be a huge data file, but you'll have a log to sort
> > through of everything. Control-C to get out of there when you're bored.
>
> Or pipe through the following awk-script to make the output
> less overwhelming ... save it as top.awk
>
> BEGIN{
> # the sort numerically sorts by memory usage percentage
> # head filters out the top 20 contenders
> command = "LC_ALL=C sort -k 10,10gr|head -20"
> }
> {
> # read all of top into an array
> line[NR]=$0
> last=NR
> }
> END{
> # print the header well-formed
> for(i=1;i<8;i++){
> print line[i]
> }
> # and do the sort & strip of the processes
> for(i=8;i<last;i++){
> print line[i]|& command
> }
> close(command, "to")
> while ((command |& getline out) > 0)
> print out
> close(command)
> }
>
> Invoke like so
> top -b -d 1 | awk -f top.awk | tee topdata
>
> Instead of the "getting bored Ctrl-C" maybe a "-n 3600" as
> extra parameter to top to get roughly one hours worth of data...

Thanks, but it gives me syntax errors:

~> top -b -d 1 | awk -f top.awk | tee topdata

awk: top.awk:24: for(i=8;i<last;i++
awk: top.awk:24: ^ syntax error
awk: top.awk:28:
awk: top.awk:28: ^ syntax error
awk: top.awk:29:
awk: top.awk:29: ^ syntax error
awk: top.awk:31:
awk: top.awk:31: ^ syntax error
awk: top.awk:31: }
awk: top.awk:31: ^ invalid char '' in expression

Any ideas?

From:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
To:	"Shashank Tripathi" <shanx(at)shanx(dot)com>
Cc:	"Andrej Ricnik-Bay" <andrej(dot)groups(at)gmail(dot)com>, "PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-16 20:20:01
Message-ID:	e373d31e0802161220w52dc8c11w5b01f9e498eca6e5@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On 17/02/2008, Shashank Tripathi <shanx(at)shanx(dot)com> wrote:
> On 17/02/2008, Andrej Ricnik-Bay <andrej(dot)groups(at)gmail(dot)com> wrote:
> > On 17/02/2008, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> wrote:
> > > ~> top -b -d 1 | awk -f top.awk | tee topdata
> > >
> > > awk: top.awk:24: for(i=8;i<last;i++
> > > awk: top.awk:24: ^ syntax error
> > > awk: top.awk:28:
> > > awk: top.awk:28: ^ syntax error
> > > awk: top.awk:29:
> > > awk: top.awk:29: ^ syntax error
> > > awk: top.awk:31:
> > > awk: top.awk:31: ^ syntax error
> > > awk: top.awk:31: }
> > > awk: top.awk:31: ^ invalid char ' ' in expression
> >
> >
> > > Any ideas?
> > What OS are you on, which version of awk? Mine
> > works on most current Linux variants with a gawk > 3.x
> > awk -W version
> > GNU Awk 3.1.5
> > Copyright (C) 1989, 1991-2005 Free Software Foundation.
> >
> > Another thought is that maybe when you copy & pasted you
> > got some special characters into the script that awk doesn't
> > like ....
> >
>
>
>
> Thanks. I am on CentOS 4 (Linux) and the awk bit is
>
> GNU Awk 3.1.3
> Copyright (C) 1989, 1991-2003 Free Software Foundatio
>
> Do I need to update awk?
>

My question exactly.

Anyway I downloaded the text file that was attached in this thread,
and then it works. Must have been some copy/paste problem as
suggested.

I ran it with this command:

top -b -d 1 -n 3600 | awk -f top.awk | tee topdata

But this is kind of sitting there, hogging the command prompt. Is
there any way I can let it go on in the background?

Thanks for the awk tip. Looks like a thing I need to learn!

From:	"Andrej Ricnik-Bay" <andrej(dot)groups(at)gmail(dot)com>
To:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	"PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Are indexes blown?
Date:	2008-02-16 20:31:29
Message-ID:	b35603930802161231q5805f3c0r7033824e8ebc8795@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On 17/02/2008, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> wrote:
> But this is kind of sitting there, hogging the command prompt. Is
> there any way I can let it go on in the background?

Ouch ... no, that's entirely my fault, wasn't quite awake I
guess, and hadn't thought it through completely ... that's
not going to give us the desired result...

Try this:

for z in `seq 1 3600`; do top -b -d 1 -n 1| awk -f top.awk; done | tee topoutput

Not sure whether it's going to give us the desire granularity of time...

Cheers,
Andrej

--
Please don't top post, and don't use HTML e-Mail :} Make your quotes concise.

http://www.american.edu/econ/notes/htmlmail.htm

From:	Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Are indexes blown?
Date:	2008-02-18 16:20:36
Message-ID:	20080218162036.GD977@crankycanuck.ca
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On Sat, Feb 16, 2008 at 12:38:04PM +0800, Phoenix Kiula wrote:

Maybe your database has just grown big enough that it no longer all fits in
memory, and your disk hardware is flakey? Or maybe the disk is flakey? Or
maybe the data itself is now different enough that the planner is changing
what it's doing? There are lots of explanations.

But I have to agree with someone upthread: you have to stop the crashes and
clean the system up before you try to debug this stuff. Your problem is
likely there.

From:	Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Are indexes blown?
Date:	2008-02-18 16:22:29
Message-ID:	20080218162229.GE977@crankycanuck.ca
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general

On Fri, Feb 15, 2008 at 09:09:32PM +0800, Phoenix Kiula wrote:
> Actually my host has just told me that I have a number of "hung
> semaphores" in my server. And he is relating them to postgresql. I am
> not surprised, because this is the only utility that has issues. All
> the rest is working (apache, mysql, exim, etc). Any thoughts on where
> I should start looking for hung semaphores?

If this means what I think it means, you have bigger problems than you've
been posting. Shut down postgres, and clean up any shared memory segments
that are hanging around. See the manual on ipcclean.