Re: very slow after a while...

From: "Costin Manda" <siderite(at)madnet(dot)ro>
To: pgsql-general(at)postgresql(dot)org
Subject: very slow after a while...
Date: 2005-04-06 08:27:23
Message-ID: 1132.193.226.119.24.1112776043.squirrel@193.226.119.24
Lists: pgsql-general

Hello,
I have a machine that uses pgsql version 8.0.1. I don't think the version
is relevant, because I had 7.4.1 before and had the same problem. I have
a PHP script that runs regularly and does this:
- select a bunch of rows from a mssql database
- insert the values taken into postgres
- if the insert fails with a duplicate id (I have a unique index there),
  perform an update with the same values instead.

Pretty straightforward.
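
In SQL terms, the per-row logic is roughly this (only a couple of the
columns are shown, and the values are just examples):

  INSERT INTO pgconnectionlog (logid, username, usedcredit)
  VALUES (12345, 'someuser', 0.5);
  -- if that fails with a unique violation on logid, I run instead:
  UPDATE pgconnectionlog
  SET username = 'someuser', usedcredit = 0.5
  WHERE logid = 12345;
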
The thing is that after I updated to 8.0.1, and also (on a separate occasion)
after I recreated the database one day, the script runs instantly, with
hundreds or thousands of rows inserted and updated per second. However,
after a while the whole process slows down significantly, reaching the
point of one insert or update per second, or even one every four(!) seconds.

I have tried vacuuming full, reindexing, deleting the table and recreating
it, and changing values in postgresql.conf and in the Linux kernel.
Nothing works except re-initialising the whole database directory.

PLEASE HELP!

-------------------------
E-Mail powered by MadNet.
http://www.madnet.ro/


From: Richard Huxton <dev(at)archonet(dot)com>
To: Costin Manda <siderite(at)madnet(dot)ro>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: very slow after a while...
Date: 2005-04-06 09:17:59
Message-ID: 4253A947.9090300@archonet.com
Lists: pgsql-general

Costin Manda wrote:

> The thing is that after I updated to 8.0.1, and also (on a separate occasion)
> after I recreated the database one day, the script runs instantly, with
> hundreds or thousands of rows inserted and updated per second. However,
> after a while the whole process slows down significantly, reaching the
> point of one insert or update per second, or even one every four(!) seconds.
>
> I have tried vacuuming full, reindexing, deleting the table and recreating
> it, and changing values in postgresql.conf and in the Linux kernel.
> Nothing works except re-initialising the whole database directory.

Some more info please:
1. This is one INSERT statement per transaction, yes? If it fails, you
do an UPDATE?
2. Are there any foreign-keys the insert will be checking?
3. What indexes are there on the main table/foreign-key-related tables?

Whatever the answers to these questions, perhaps look into loading your
data into a temporary table, inserting any rows without matching primary
keys, then deleting those from the temporary table and updating the main
table with what's left.
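
Something along these lines, as a rough sketch only - main_table, staging
and the column names below are placeholders for whatever you actually have
(doing the UPDATE before the INSERT keeps it from touching the rows you
have just inserted):

  -- load everything from mssql into a scratch table first
  CREATE TEMP TABLE staging (id integer, col1 text, col2 text);
  -- ... COPY / INSERT the mssql rows into staging here ...

  -- update the rows that already exist in the main table
  UPDATE main_table
  SET    col1 = s.col1, col2 = s.col2
  FROM   staging s
  WHERE  main_table.id = s.id;

  -- then insert the rows that are genuinely new
  INSERT INTO main_table (id, col1, col2)
  SELECT s.id, s.col1, s.col2
  FROM   staging s
  WHERE  NOT EXISTS (SELECT 1 FROM main_table m WHERE m.id = s.id);
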

--
Richard Huxton
Archonet Ltd


From: Richard Huxton <dev(at)archonet(dot)com>
To: Costin Manda <siderite(at)madnet(dot)ro>, List <pgsql-general(at)postgresql(dot)org>
Subject: Re: very slow after a while...
Date: 2005-04-06 10:32:39
Message-ID: 4253BAC7.4060107@archonet.com
Lists: pgsql-general

Please CC the list as well as replying directly - it means more people
can help.

Costin Manda wrote:
>>Some more info please:
>>1. This is one INSERT statement per transaction, yes? If it fails, you
>>do an UPDATE?
>
> correct.
>
>>2. Are there any foreign-keys the insert will be checking?
>>3. What indexes are there on the main table/foreign-key-related tables?
>
>
> This is the table; the only constraint on the insert is logid, which
> must be unique.
>
> Table "public.pgconnectionlog"
> Column | Type | Modifiers
> ----------------+-----------------------+-----------
> logid | integer | not null
> username | character varying(20) |
> logtime | integer |
> connecttime | integer |
> disconnecttime | integer |
> usedcredit | double precision |
> usedtime | integer |
> phonenum | character varying(30) |
> prephonenum | character varying(20) |
> pricelistname | character varying(30) |
> precode | character varying(20) |
> effectivetime | integer |
> callerid | character varying(30) |
> serialnumber | character varying(30) |
> prefix | character varying(20) |
> tara | character varying |
> Indexes:
> "pgconnectionlog_pkey" PRIMARY KEY, btree (logid)
> "connecttime_index" btree (connecttime)
> "disconnecttime_index" btree (disconnecttime)
> "logtime_index" btree (logtime)
> "prefix_index" btree (prefix)
> "tara_index" btree (tara)
> "username_index" btree (username)

Hmm - nothing unusual there. I'd be suspicious of a problem with the
indexes, except you say reindexing has no effect.

>>Whatever the answers to these questions, perhaps look into loading your
>>data into a temporary table, inserting any rows without matching primary
>>keys, then deleting those from the temporary table and updating the main
>>table with what's left.
>
> You think this will be faster? It does make sense. Anyway, the problem
> is not optimising the script; it is the dramatic change in speed, I
> would say.

Could you monitor what's happening while this slows down? In particular,
could you:
1. Run "vmstat 10" so we can see memory/cpu/disk usage while this is
happening.
2. See what's happening in pg_xlog - are you creating/cycling through a
lot of transaction-log files?
3. Keep an eye on the logs - are there any warnings there?

If you vacuum full, it's worth adding "verbose" to that too, to see
what it's doing.
--
Richard Huxton
Archonet Ltd


From: "Costin Manda" <siderite(at)madnet(dot)ro>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: very slow after a while...
Date: 2005-04-06 11:33:58
Message-ID: 1469.193.226.119.24.1112787238.squirrel@193.226.119.24
Lists: pgsql-general


> Some more info please:
> 1. This is one INSERT statement per transaction, yes? If it fails, you
> do an UPDATE?

correct.

> 2. Are there any foreign-keys the insert will be checking?
> 3. What indexes are there on the main table/foreign-key-related tables?

This is the table; the only constraint on the insert is logid, which
must be unique.

Table "public.pgconnectionlog"
Column | Type | Modifiers
----------------+-----------------------+-----------
logid | integer | not null
username | character varying(20) |
logtime | integer |
connecttime | integer |
disconnecttime | integer |
usedcredit | double precision |
usedtime | integer |
phonenum | character varying(30) |
prephonenum | character varying(20) |
pricelistname | character varying(30) |
precode | character varying(20) |
effectivetime | integer |
callerid | character varying(30) |
serialnumber | character varying(30) |
prefix | character varying(20) |
tara | character varying |
Indexes:
"pgconnectionlog_pkey" PRIMARY KEY, btree (logid)
"connecttime_index" btree (connecttime)
"disconnecttime_index" btree (disconnecttime)
"logtime_index" btree (logtime)
"prefix_index" btree (prefix)
"tara_index" btree (tara)
"username_index" btree (username)

> Whatever the answers to these questions, perhaps look into loading your
> data into a temporary table, inserting any rows without matching primary
> keys and then deleting those and updating what's left.

You think this will be faster? It does make sense. Anyway, the problem
is not optimising the script; it is the dramatic change in speed, I
would say.

It is possible the problem doesn't come from this script but from
others. The question is: why does the database slow down to such a
degree? I repeat: dumping all the data to a file, recreating the data
directory and reloading the data results in almost instantaneous inserts
and updates.

-------------------------
E-Mail powered by MadNet.
http://www.madnet.ro/


From: "Costin Manda" <siderite(at)madnet(dot)ro>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: very slow after a while...
Date: 2005-04-06 12:24:31
Message-ID: 1504.193.226.119.24.1112790271.squirrel@193.226.119.24
Lists: pgsql-general


I think I found the problem. I was wrongly comparing some values and,
based on that, every time the script was run (that means once every 5
minutes) it deleted two tables and populated them with about 70
thousand records.

I still don't know why that affected the speed of the database (even
when the script was not running) and how to fix it. I have a script that
vacuums full and reindexes the database every day. Is there something
else I must do?
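
Just to put a rough number on it (an estimate, not something I measured,
assuming the bad comparison really made it reload on every run):

  SELECT (24 * 60 / 5) * 70000 AS deleted_rows_per_day;
  -- 288 runs a day * ~70000 rows = roughly 20 million dead row versions
  -- piling up between one nightly vacuum and the next
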

-------------------------
E-Mail powered by MadNet.
http://www.madnet.ro/


From: Richard Huxton <dev(at)archonet(dot)com>
To: Costin Manda <siderite(at)madnet(dot)ro>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: very slow after a while...
Date: 2005-04-06 13:07:36
Message-ID: 4253DF18.1000704@archonet.com
Lists: pgsql-general

Costin Manda wrote:
> I think I found the problem. I was wrongly comparing some values and,
> based on that, every time the script was run (that means once every 5
> minutes) it deleted two tables and populated them with about 70
> thousand records.
>
> I still don't know why that affected the speed of the database (even
> when the script was not running) and how to fix it. I have a script that
> vacuums full and reindexes the database every day. Is there something
> else I must do?

I'm not sure I understand what you're saying, but if you vacuum at the
wrong time that can cause problems. I've shot myself in the foot before
now doing something like:

DELETE FROM big_table
VACUUM ANALYSE big_table
COPY lots of rows into big_table

Of course, the planner now thinks there are zero rows in big_table.
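
The same steps with the statistics refreshed after the reload rather than
before it would look roughly like this (just a sketch, big_table as above):

  DELETE FROM big_table;
  -- ... COPY / INSERT the new rows here ...
  ANALYZE big_table;   -- now the planner sees the real row count again
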
--
Richard Huxton
Archonet Ltd


From: Costin Manda <siderite(at)madnet(dot)ro>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: very slow after a while...
Date: 2005-04-06 14:02:18
Message-ID: 20050406170028.1B79.SIDERITE@madnet.ro
Lists: pgsql-general


On Wed, 06 Apr 2005 14:07:36 +0100
Richard Huxton <dev(at)archonet(dot)com> wrote:

> Costin Manda wrote:
> > I think I found the problem. I was wrongly comparing some values and,
> > based on that, every time the script was run (that means once every 5
> > minutes) it deleted two tables and populated them with about 70
> > thousand records.

> I'm not sure I understand what you're saying, but if you vacuum at the
> wrong time that can cause problems. I've shot myself in the foot before
> now doing something like:
>
> DELETE FROM big_table
> VACUUM ANALYSE big_table
> COPY lots of rows into big_table
>
> Of course, the planner now thinks there are zero rows in big_table.

I mean that every 5 minutes the script does:
DROP TABLE
CREATE TABLE
INSERT 70000 rows into the table

and the vacuuming is done at 2 in the morning; there should be no
activity then...


From: Richard Huxton <dev(at)archonet(dot)com>
To: Costin Manda <siderite(at)madnet(dot)ro>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: very slow after a while...
Date: 2005-04-06 14:54:29
Message-ID: 4253F825.4070301@archonet.com
Lists: pgsql-general

Costin Manda wrote:
> On Wed, 06 Apr 2005 14:07:36 +0100
> Richard Huxton <dev(at)archonet(dot)com> wrote:
>
>
>>Costin Manda wrote:
>>
>>> I think I found the problem. I was wrongly comparing some values and,
>>> based on that, every time the script was run (that means once every 5
>>> minutes) it deleted two tables and populated them with about 70
>>> thousand records.
>
>
>>I'm not sure I understand what you're saying, but if you vacuum at the
>>wrong time that can cause problems. I've shot myself in the foot before
>>now doing something like:
>>
>>DELETE FROM big_table
>>VACUUM ANALYSE big_table
>>COPY lots of rows into big_table
>>
>>Of course, the planner now thinks there are zero rows in big_table.
>
>
> I mean that every 5 minutes the script does:
> DROP TABLE
> CREATE TABLE
> INSERT 70000 rows into the table

I thought you were doing an insert, then an update if it failed? You
shouldn't have any duplicates if the table was already empty. Or have I
misunderstood?

--
Richard Huxton
Archonet Ltd


From: Costin Manda <siderite(at)madnet(dot)ro>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: very slow after a while...
Date: 2005-04-06 15:18:47
Message-ID: 20050406181254.FF1A.SIDERITE@madnet.ro
Lists: pgsql-general


On Wed, 06 Apr 2005 15:54:29 +0100
Richard Huxton <dev(at)archonet(dot)com> wrote:

> > I mean that every 5 minutes the script does:
> > DROP TABLE
> > CREATE TABLE
> > INSERT 70000 rows into the table
>
> I thought you were doing an insert, then an update if it failed? You
> shouldn't have any duplicates if the table was already empty. Or have I
> misunderstood?

Ok, let's start over :)

The script does the following:
1. read the count of rows in two tables from the mssql database
2. read the count of rows of the 'mirror' tables in postgres (these are
tables that get updated rarely and have a maximum of 100000 records
between them)
3. if the counts differ, delete everything from the mirror tables and
reinsert everything.
4. THEN do the inserts that fall back to updates on error

I thought the problem lay with step 4, but now I see that step 3 was
the culprit and that, indeed, I did not do DROP TABLE / CREATE TABLE but
DELETE FROM and INSERTs. I think that recreating these two tables should
solve the problem, shouldn't it?
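
I suppose one way to check whether those mirror tables really were the
bloated ones is something like this (the table name is just a placeholder):

  VACUUM VERBOSE mirror_table;  -- reports how many dead row versions it finds
  SELECT relname, relpages, reltuples
  FROM   pg_class
  WHERE  relname = 'mirror_table';
  -- relpages staying huge while reltuples stays around 70000 would confirm it
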

A classic case of figuring out what the problem is while asking for
help, then feeling silly about it :)


From: Richard Huxton <dev(at)archonet(dot)com>
To: Costin Manda <siderite(at)madnet(dot)ro>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: very slow after a while...
Date: 2005-04-06 16:28:43
Message-ID: 42540E3B.8000603@archonet.com
Lists: pgsql-general

Costin Manda wrote:
> On Wed, 06 Apr 2005 15:54:29 +0100
> Richard Huxton <dev(at)archonet(dot)com> wrote:
>
>
>>> I mean that every 5 minutes the script does:
>>> DROP TABLE
>>> CREATE TABLE
>>> INSERT 70000 rows into the table
>>
>>I thought you were doing an insert, then an update if it failed? You
>>shouldn't have any duplicates if the table was already empty. Or have I
>>misunderstood?
>
>
>
> Ok, let's start over :)
>
> The script does the following:
> 1. read the count of rows in two tables from the mssql database
> 2. read the count of rows of the 'mirror' tables in postgres (these are
> tables that get updated rarely and have a maximum of 100000 records
> between them)
> 3. if the counts differ, delete everything from the mirror tables and
> reinsert everything.
> 4. THEN do the inserts that fall back to updates on error
>
> I thought the problem lay with step 4, but now I see that step 3 was
> the culprit and that, indeed, I did not do DROP TABLE / CREATE TABLE but
> DELETE FROM and INSERTs. I think that recreating these two tables should
> solve the problem, shouldn't it?

Hmm - try TRUNCATE rather than DELETE. Also, you might drop the indexes,
re-insert the data then recreate the indexes - that can be faster for
bulk loading.
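
Roughly like this (a sketch only - the table and index names are made up,
and the real list of indexes is whatever you have on your mirror tables):

  TRUNCATE TABLE mirror_table;       -- instant, and leaves no dead rows behind
  DROP INDEX mirror_name_idx;
  -- ... COPY / INSERT the ~70000 rows here ...
  CREATE INDEX mirror_name_idx ON mirror_table (name);
  ANALYZE mirror_table;              -- so the planner knows about the new contents
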

--
Richard Huxton
Archonet Ltd


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: Costin Manda <siderite(at)madnet(dot)ro>, pgsql-general(at)postgresql(dot)org
Subject: Re: very slow after a while...
Date: 2005-04-06 16:59:28
Message-ID: 28287.1112806768@sss.pgh.pa.us
Lists: pgsql-general

Richard Huxton <dev(at)archonet(dot)com> writes:
> Costin Manda wrote:
>> I thought the problem lay with step 4, but now I see that step 3 was
>> the culprit and that, indeed, I did not do DROP TABLE / CREATE TABLE but
>> DELETE FROM and INSERTs. I think that recreating these two tables should
>> solve the problem, shouldn't it?

> Hmm - try TRUNCATE rather than DELETE.

Or VACUUM between deleting and inserting. But TRUNCATE would be better
if you can do it.
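
So if TRUNCATE is not an option, the pattern would be roughly this
(placeholder table name again):

  DELETE FROM mirror_table;
  VACUUM mirror_table;     -- reclaim the deleted row versions before reloading
  -- ... reinsert the rows here ...
  ANALYZE mirror_table;
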

regards, tom lane


From: Ragnar Hafstað <gnari(at)simnet(dot)is>
To: Costin Manda <siderite(at)madnet(dot)ro>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: very slow after a while...
Date: 2005-04-06 17:01:07
Message-ID: 1112806867.31874.4.camel@localhost.localdomain
Lists: pgsql-general

On Wed, 2005-04-06 at 18:18 +0300, Costin Manda wrote:

> The script does the following:
> 1. read the count of rows in two tables from the mssql database
> 2. read the count of rows of the 'mirror' tables in postgres (these are
> tables that get updated rarely and have a maximum of 100000 records
> between them)
> 3. if the counts differ, delete everything from the mirror tables and
> reinsert everything.
> 4. THEN do the inserts that fall back to updates on error

if you empty the table in step 3, why do you have to test for
the duplicate id error? are there duplicates in the mssql table?

gnari