Re: db growing out of proportion

Lists: pgsql-bugspgsql-performance
From: Tomas Szepe <szepe(at)pinerecords(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: db growing out of proportion
Date: 2003-05-29 16:32:39
Message-ID: 20030529163239.GA11101@louise.pinerecords.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

Hello everybody,

I'm facing a simple yet gravely problem with postgresql 7.3.2 on x86 Linux.
My db is used to store IP accounting statistics for about 30 C's. There are
a couple truly trivial tables such as the one below:

CREATE TABLE stats_min
(
ip inet NOT NULL,
start timestamp NOT NULL default CURRENT_TIMESTAMP(0),
intlen int4 NOT NULL default 60,
d_in int8 NOT NULL,
d_out int8 NOT NULL,

constraint "stats_min_pkey" PRIMARY KEY ("ip", "start")
);
CREATE INDEX stats_min_start ON stats_min (start);

A typical transaction committed on these tables looks like this:

BEGIN WORK
DELETE ...
UPDATE/INSERT ...
COMMIT WORK

Trouble is, as the rows in the tables get deleted/inserted/updated
(the frequency being a couple thousand rows per minute), the database
is growing out of proportion in size. After about a week, I have
to redump the db by hand so as to get query times back to sensible
figures. A transaction that takes ~50 seconds before the redump will
then complete in under 5 seconds (the corresponding data/base/ dir having
shrunk from ~2 GB to ~0.6GB).

A nightly VACCUM ANALYZE is no use.

A VACUUM FULL is no use.

A VACUUM FULL followed by REINDEX is no use.

It seems that only a full redump involving "pg_dump olddb | \
psql newdb" is capable of restoring the system to its working
glory.

Please accept my apologies if I've overlooked a relevant piece of
information in the docs. I'm in an urgent need of getting this
problem resolved.

--
Tomas Szepe <szepe(at)pinerecords(dot)com>


From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Tomas Szepe <szepe(at)pinerecords(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: db growing out of proportion
Date: 2003-05-29 17:37:38
Message-ID: 20030529103316.K60582-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

On Thu, 29 May 2003, Tomas Szepe wrote:

> Hello everybody,
>
> I'm facing a simple yet gravely problem with postgresql 7.3.2 on x86 Linux.
> My db is used to store IP accounting statistics for about 30 C's. There are
> a couple truly trivial tables such as the one below:
>
> CREATE TABLE stats_min
> (
> ip inet NOT NULL,
> start timestamp NOT NULL default CURRENT_TIMESTAMP(0),
> intlen int4 NOT NULL default 60,
> d_in int8 NOT NULL,
> d_out int8 NOT NULL,
>
> constraint "stats_min_pkey" PRIMARY KEY ("ip", "start")
> );
> CREATE INDEX stats_min_start ON stats_min (start);
>
> A typical transaction committed on these tables looks like this:
>
> BEGIN WORK
> DELETE ...
> UPDATE/INSERT ...
> COMMIT WORK
>
> Trouble is, as the rows in the tables get deleted/inserted/updated
> (the frequency being a couple thousand rows per minute), the database
> is growing out of proportion in size. After about a week, I have
> to redump the db by hand so as to get query times back to sensible
> figures. A transaction that takes ~50 seconds before the redump will
> then complete in under 5 seconds (the corresponding data/base/ dir having
> shrunk from ~2 GB to ~0.6GB).
>
> A nightly VACCUM ANALYZE is no use.
>
> A VACUUM FULL is no use.
>
> A VACUUM FULL followed by REINDEX is no use.

Is the space being taken up by stats_min, this index, some other object?
I'm not 100% sure, but after vacuums maybe
select * from pg_class order by relpages desc limit 10;
will give a good idea.

What does VACUUM FULL VERBOSE stats_min; give you?


From: Tomas Szepe <szepe(at)pinerecords(dot)com>
To: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: db growing out of proportion
Date: 2003-05-30 07:24:42
Message-ID: 20030530072442.GD14159@louise.pinerecords.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

> [sszabo(at)megazone23(dot)bigpanda(dot)com]
>
> > Trouble is, as the rows in the tables get deleted/inserted/updated
> > (the frequency being a couple thousand rows per minute), the database
> > is growing out of proportion in size. After about a week, I have
> > to redump the db by hand so as to get query times back to sensible
> > figures. A transaction that takes ~50 seconds before the redump will
> > then complete in under 5 seconds (the corresponding data/base/ dir having
> > shrunk from ~2 GB to ~0.6GB).
> >
> > A nightly VACCUM ANALYZE is no use.
> >
> > A VACUUM FULL is no use.
> >
> > A VACUUM FULL followed by REINDEX is no use.
>
> Is the space being taken up by stats_min, this index, some other object?

relname | relkind | relpages | reltuples
---------------------------------+---------+----------+-------------
stats_hr | r | 61221 | 3.01881e+06
stats_hr_pkey | i | 26414 | 3.02239e+06
stats_min_pkey | i | 20849 | 953635
stats_hr_start | i | 17218 | 3.02142e+06
stats_min_start | i | 15284 | 949788
stats_min | r | 10885 | 948792
authinfo_pkey | i | 1630 | 1342
authinfo | r | 1004 | 1342
contract_ips | r | 865 | 565
contract_ips_pkey | i | 605 | 565

> What does VACUUM FULL VERBOSE stats_min; give you?

Sorry, I can't run a VACUUM FULL at this time.
We're in production use.

--
Tomas Szepe <szepe(at)pinerecords(dot)com>


From: Peter Childs <blue(dot)dragon(at)blueyonder(dot)co(dot)uk>
To:
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: db growing out of proportion
Date: 2003-05-30 09:21:51
Message-ID: Pine.LNX.4.44.0305301012040.22127-100000@RedDragon.Childs
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

On Fri, 30 May 2003, Tomas Szepe wrote:

> > [sszabo(at)megazone23(dot)bigpanda(dot)com]
> >
> > > Trouble is, as the rows in the tables get deleted/inserted/updated
> > > (the frequency being a couple thousand rows per minute), the database
> > > is growing out of proportion in size. After about a week, I have
> > > to redump the db by hand so as to get query times back to sensible
> > > figures. A transaction that takes ~50 seconds before the redump will
> > > then complete in under 5 seconds (the corresponding data/base/ dir having
> > > shrunk from ~2 GB to ~0.6GB).
> > >
> > > A nightly VACCUM ANALYZE is no use.
> > >
> > > A VACUUM FULL is no use.
> > >
> > > A VACUUM FULL followed by REINDEX is no use.
> >
> > Is the space being taken up by stats_min, this index, some other object?
>
> relname | relkind | relpages | reltuples
> ---------------------------------+---------+----------+-------------
> stats_hr | r | 61221 | 3.01881e+06
> stats_hr_pkey | i | 26414 | 3.02239e+06
> stats_min_pkey | i | 20849 | 953635
> stats_hr_start | i | 17218 | 3.02142e+06
> stats_min_start | i | 15284 | 949788
> stats_min | r | 10885 | 948792
> authinfo_pkey | i | 1630 | 1342
> authinfo | r | 1004 | 1342
> contract_ips | r | 865 | 565
> contract_ips_pkey | i | 605 | 565
>
> > What does VACUUM FULL VERBOSE stats_min; give you?
>
> Sorry, I can't run a VACUUM FULL at this time.
> We're in production use.
>
>

Would more regular vacuum help. I think a vaccum every hour may do
the job. perhaps with an analyse every day. (I presume the statistics
don't change too much)
While I don't surgest doing a vacuum more than twice an hour as
this would slow down the system with little gain more than once a day may
improve the speed and space usage.
Just an idea.

Peter


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: db growing out of proportion
Date: 2003-05-30 13:11:39
Message-ID: 8938.1054300299@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

Peter Childs <blue(dot)dragon(at)blueyonder(dot)co(dot)uk> writes:
> On Fri, 30 May 2003, Tomas Szepe wrote:
>> Trouble is, as the rows in the tables get deleted/inserted/updated
>> (the frequency being a couple thousand rows per minute), the database
>> is growing out of proportion in size.

> Would more regular vacuum help. I think a vaccum every hour may do
> the job.

Also note that no amount of vacuuming will save you if the FSM is not
large enough to keep track of all the free space. The default FSM
settings, like all the other default settings in Postgres, are set up
for a small installation. You'd probably need to raise them by at least
a factor of 10 for this installation.

regards, tom lane


From: Todd Nemanich <todd(at)twopunks(dot)org>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: db growing out of proportion
Date: 2003-05-30 14:25:42
Message-ID: 3ED769E6.3050300@twopunks.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

I have a database with similar performance constraints. Our best
estimates put the turnover on our most active table at 350k tuples/day.
The hardware is a 4x1.4GHz Xeon w/ a RAID 1 disk setup, and the DB
floats around 500MB of disk space taken. Here is what we do to maintain
operations:

1) Cron job @ 4:00AM that runs a full vacuum analyze on the DB, and
reindex on the major tables. (Reindex is to maintain index files in SHM)
An alerting feature pages the administrator if the job does not complete
within a reasonable amount of time.

2) Every 15 minutes, a cron job runs a vacuum analyze on our five
largest tables. An alert is emailed to the administrator if a second
vacuum attempts to start before the previous completes.

3) Every week, we review the disk usage numbers from daily peaks. This
determines if we need to increase our shmmax & shared buffers.

Additionally, you may want to take a look at your query performance. Are
most of your queries doing sequential scans? In my system, the crucial
columns of the primary tables are int8 and float8 fields. I have those
indexed, and I get a serious performance boost by making sure all
SELECT/UPDATE/DELETE queries that use those columns in the WHERE have an
explicit ::int8 or ::float8 (Explain analyze is your friend). During
peak usage, there is an order of magnitude difference (usually 10 to
15x) between queries doing sequential scans on the table, and queries
doing index scans. Might be worth investigating if your queries are
taking 5 seconds when your DB is fresh. HTH.

Tomas Szepe wrote:
> Hello everybody,
>
> I'm facing a simple yet gravely problem with postgresql 7.3.2 on x86 Linux.
> My db is used to store IP accounting statistics for about 30 C's. There are
> a couple truly trivial tables such as the one below:
>
> CREATE TABLE stats_min
> (
> ip inet NOT NULL,
> start timestamp NOT NULL default CURRENT_TIMESTAMP(0),
> intlen int4 NOT NULL default 60,
> d_in int8 NOT NULL,
> d_out int8 NOT NULL,
>
> constraint "stats_min_pkey" PRIMARY KEY ("ip", "start")
> );
> CREATE INDEX stats_min_start ON stats_min (start);
>
> A typical transaction committed on these tables looks like this:
>
> BEGIN WORK
> DELETE ...
> UPDATE/INSERT ...
> COMMIT WORK
>
> Trouble is, as the rows in the tables get deleted/inserted/updated
> (the frequency being a couple thousand rows per minute), the database
> is growing out of proportion in size. After about a week, I have
> to redump the db by hand so as to get query times back to sensible
> figures. A transaction that takes ~50 seconds before the redump will
> then complete in under 5 seconds (the corresponding data/base/ dir having
> shrunk from ~2 GB to ~0.6GB).
>
> A nightly VACCUM ANALYZE is no use.
>
> A VACUUM FULL is no use.
>
> A VACUUM FULL followed by REINDEX is no use.
>
> It seems that only a full redump involving "pg_dump olddb | \
> psql newdb" is capable of restoring the system to its working
> glory.
>
> Please accept my apologies if I've overlooked a relevant piece of
> information in the docs. I'm in an urgent need of getting this
> problem resolved.
>


From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Tomas Szepe <szepe(at)pinerecords(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: db growing out of proportion
Date: 2003-05-30 15:40:43
Message-ID: 20030530083944.N91707-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance


On Fri, 30 May 2003, Tomas Szepe wrote:

> > [sszabo(at)megazone23(dot)bigpanda(dot)com]
> >
> > > Trouble is, as the rows in the tables get deleted/inserted/updated
> > > (the frequency being a couple thousand rows per minute), the database
> > > is growing out of proportion in size. After about a week, I have
> > > to redump the db by hand so as to get query times back to sensible
> > > figures. A transaction that takes ~50 seconds before the redump will
> > > then complete in under 5 seconds (the corresponding data/base/ dir having
> > > shrunk from ~2 GB to ~0.6GB).
> > >
> > > A nightly VACCUM ANALYZE is no use.
> > >
> > > A VACUUM FULL is no use.
> > >
> > > A VACUUM FULL followed by REINDEX is no use.
> >
> > Is the space being taken up by stats_min, this index, some other object?
>
> relname | relkind | relpages | reltuples
> ---------------------------------+---------+----------+-------------
> stats_hr | r | 61221 | 3.01881e+06
> stats_hr_pkey | i | 26414 | 3.02239e+06
> stats_min_pkey | i | 20849 | 953635
> stats_hr_start | i | 17218 | 3.02142e+06
> stats_min_start | i | 15284 | 949788
> stats_min | r | 10885 | 948792
> authinfo_pkey | i | 1630 | 1342
> authinfo | r | 1004 | 1342
> contract_ips | r | 865 | 565
> contract_ips_pkey | i | 605 | 565
>
> > What does VACUUM FULL VERBOSE stats_min; give you?
>
> Sorry, I can't run a VACUUM FULL at this time.
> We're in production use.

As Tom said, you probably need higher FSM settings, but also, do you have
any long lived transactions (say from some kind of persistent connection
system) that might be preventing vacuum from removing rows?


From: Tomas Szepe <szepe(at)pinerecords(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: db growing out of proportion
Date: 2003-05-30 22:59:39
Message-ID: 20030530225939.GE21944@louise.pinerecords.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

> [tgl(at)sss(dot)pgh(dot)pa(dot)us]
>
> Peter Childs <blue(dot)dragon(at)blueyonder(dot)co(dot)uk> writes:
> > On Fri, 30 May 2003, Tomas Szepe wrote:
> >> Trouble is, as the rows in the tables get deleted/inserted/updated
> >> (the frequency being a couple thousand rows per minute), the database
> >> is growing out of proportion in size.
>
> > Would more regular vacuum help. I think a vaccum every hour may do
> > the job.
>
> Also note that no amount of vacuuming will save you if the FSM is not
> large enough to keep track of all the free space. The default FSM
> settings, like all the other default settings in Postgres, are set up
> for a small installation. You'd probably need to raise them by at least
> a factor of 10 for this installation.

Thanks, I'll try to tweak those settings and will let the list know how
things went.

--
Tomas Szepe <szepe(at)pinerecords(dot)com>


From: Tomas Szepe <szepe(at)pinerecords(dot)com>
To: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: db growing out of proportion
Date: 2003-05-30 23:00:50
Message-ID: 20030530230050.GF21944@louise.pinerecords.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

> As Tom said, you probably need higher FSM settings, but also, do you have
> any long lived transactions (say from some kind of persistent connection
> system) that might be preventing vacuum from removing rows?

No, not at all.

--
Tomas Szepe <szepe(at)pinerecords(dot)com>


From: Tomas Szepe <szepe(at)pinerecords(dot)com>
To: Todd Nemanich <todd(at)twopunks(dot)org>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: db growing out of proportion
Date: 2003-05-30 23:08:21
Message-ID: 20030530230821.GG21944@louise.pinerecords.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

> [todd(at)twopunks(dot)org]
>
> Additionally, you may want to take a look at your query performance. Are
> most of your queries doing sequential scans? In my system, the crucial
> columns of the primary tables are int8 and float8 fields. I have those
> indexed, and I get a serious performance boost by making sure all
> SELECT/UPDATE/DELETE queries that use those columns in the WHERE have an
> explicit ::int8 or ::float8 (Explain analyze is your friend). During
> peak usage, there is an order of magnitude difference (usually 10 to
> 15x) between queries doing sequential scans on the table, and queries
> doing index scans. Might be worth investigating if your queries are
> taking 5 seconds when your DB is fresh. HTH.

Yes, I have taken special care to fine-tune all queries on authentic
data. The db setup works as expected in whatever respect with the
exception of query times deterioration that apparently corelates to
the db's on-disk size growth.

Thanks for your suggestions,

--
Tomas Szepe <szepe(at)pinerecords(dot)com>


From: Robert Creager <Robert_Creager(at)LogicalChaos(dot)org>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: db growing out of proportion
Date: 2003-05-31 03:21:01
Message-ID: 20030530212101.37cef86f.Robert_Creager@LogicalChaos.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

On Fri, 30 May 2003 09:11:39 -0400
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> said something like:

>
> Also note that no amount of vacuuming will save you if the FSM is not
> large enough to keep track of all the free space. The default FSM
> settings, like all the other default settings in Postgres, are set up
> for a small installation. You'd probably need to raise them by at least
> a factor of 10 for this installation.
>

Tom,

Thanks for the hint. I just upped my shared_buffers to 8192, fsm_relations to 10000, fsm_pages to 100000, sort_mem to 64000, and an UPDATE which was taking over 2 hours dropped down to 1 to 2 minutes!

Nice...

Thanks,
Rob


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Creager <Robert_Creager(at)LogicalChaos(dot)org>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: db growing out of proportion
Date: 2003-05-31 04:11:26
Message-ID: 17783.1054354286@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

Robert Creager <Robert_Creager(at)LogicalChaos(dot)org> writes:
> Thanks for the hint. I just upped my shared_buffers to 8192, fsm_relations to 10000, fsm_pages to 100000, sort_mem to 64000, and an UPDATE which was taking over 2 hours dropped down to 1 to 2 minutes!

Cool ... but it's not immediately obvious which of these changes did the
trick for you. What settings were you at before? And what's the
details of the problem query?

The first three settings you mention all seem like reasonable choices,
but I'd be hesitant to recommend 64M sort_mem for general use (it won't
take very many concurrent sorts to drive you into the ground...). So
I'm interested to narrow down exactly what was the issue here.

regards, tom lane


From: Robert Creager <Robert_Creager(at)LogicalChaos(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: db growing out of proportion
Date: 2003-05-31 04:50:02
Message-ID: 20030530225002.5750376f.Robert_Creager@LogicalChaos.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

On Sat, 31 May 2003 00:11:26 -0400
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> said something like:
>
> Cool ... but it's not immediately obvious which of these changes did the
> trick for you. What settings were you at before? And what's the
> details of the problem query?
>
> The first three settings you mention all seem like reasonable choices,
> but I'd be hesitant to recommend 64M sort_mem for general use (it won't
> take very many concurrent sorts to drive you into the ground...). So
> I'm interested to narrow down exactly what was the issue here.
>
> regards, tom lane

shared_buffers was 1024, now 8192
max_fsm_relations was 1000, now 10000
max_fsm_pages was 20000, now 100000
wal_buffers was 8, now 16
sort_mem was 1024, now 64000
vacuum_mem was 1024, now 64000
effective_cache_size was 1000, now 100000

I am in the process of reloading the dB, but obs_v and obs_i contain ~750000 records each. I'd be happy to play around with the settings if you would like to see the timing results. I'll also be able to get some explain analyze results tomorrow when finished reloading. Suggestions as to what values to change first?

There is a 'C' language trigger on the obs_v and obs_i tables which essentially combines the data from the the obs_? tables and updates the catalog table when the obs_? records are updated.

The query is:

UPDATE obs_v
SET mag = obs_v.imag + zp.zero_v + cg.color_v * (obs_v.imag - i.imag),
use = true
FROM color_group AS cg, zero_pair AS zp, obs_i AS i, files AS f
WHERE obs_v.star_id = i.star_id
AND obs_v.file_id = f.file_id
AND cg.group_id = f.group_id
AND f.group_id = $group_id
AND zp.pair_id = f.pair_id

which is called from a perl script (DBD::Pg - which sets $group_id), and the relevant tables are:

Table "public.obs_v"
Column | Type | Modifiers
---------+---------+------------------------------------------------
x | real | not null
y | real | not null
imag | real | not null
smag | real | not null
ra | real | not null
dec | real | not null
obs_id | integer | not null default nextval('"obs_id_seq"'::text)
file_id | integer |
use | boolean | default false
solve | boolean | default false
star_id | integer |
mag | real |
Indexes: obs_v_file_id_index btree (file_id),
obs_v_loc_index btree (ra, "dec"),
obs_v_obs_id_index btree (obs_id),
obs_v_star_id_index btree (star_id),
obs_v_use_index btree (use)
Foreign Key constraints: obs_v_files_constraint FOREIGN KEY (file_id) REFERENCES files(file_id) ON UPDATE NO ACTION ON DELETE CASCADE
Triggers: obs_v_trig

with obs_i being identical (inherited from same root table)

Table "public.color_group"
Column | Type | Modifiers
----------+---------+-----------
group_id | integer |
color_u | real | default 0
color_b | real | default 0
color_v | real | default 0
color_r | real | default 0
color_i | real | default 0
Indexes: color_group_group_id_index btree (group_id)
Foreign Key constraints: $1 FOREIGN KEY (group_id) REFERENCES groups(group_id) ON UPDATE NO ACTION ON DELETE CASCADE

Table "public.zero_pair"
Column | Type | Modifiers
---------+---------+-----------
pair_id | integer | not null
zero_u | real | default 0
zero_b | real | default 0
zero_v | real | default 0
zero_r | real | default 0
zero_i | real | default 0
Indexes: zero_pair_pkey primary key btree (pair_id),
zero_pair_pair_id_index btree (pair_id)
Foreign Key constraints: $1 FOREIGN KEY (pair_id) REFERENCES pairs(pair_id) ON UPDATE NO ACTION ON DELETE CASCADE

Table "public.files"
Column | Type | Modifiers
----------+--------------------------+-------------------------------------------------------
file_id | integer | not null default nextval('"files_file_id_seq"'::text)
group_id | integer |
pair_id | integer |
date | timestamp with time zone | not null
name | character varying | not null
ra_min | real | default 0
ra_max | real | default 0
dec_min | real | default 0
dec_max | real | default 0
Indexes: files_pkey primary key btree (file_id),
files_name_key unique btree (name),
files_id_index btree (file_id, group_id, pair_id),
files_range_index btree (ra_min, ra_max, dec_min, dec_max),
imported__file_id_idex btree (file_id)
Foreign Key constraints: $1 FOREIGN KEY (group_id) REFERENCES groups(group_id) ON UPDATE NO ACTION ON DELETE CASCADE,
$2 FOREIGN KEY (pair_id) REFERENCES pairs(pair_id) ON UPDATE NO ACTION ON DELETE CASCADE

Table "public.catalog"
Column | Type | Modifiers
------------------+------------------+-------------------------------------------------
star_id | integer | not null default nextval('"star_id_seq"'::text)
loc_count | integer | default 0
ra | real | not null
ra_sum | double precision | default 0
ra_sigma | real | default 0
ra_sum_square | double precision | default 0
dec | real | not null
dec_sum | double precision | default 0
dec_sigma | real | default 0
dec_sum_square | double precision | default 0
mag_u_count | integer | default 0
mag_u | real | default 99
mag_u_sum | double precision | default 0
mag_u_sigma | real | default 0
mag_u_sum_square | double precision | default 0
mag_b_count | integer | default 0
mag_b | real | default 99
mag_b_sum | double precision | default 0
mag_b_sigma | real | default 0
mag_b_sum_square | double precision | default 0
mag_v_count | integer | default 0
mag_v | real | default 99
mag_v_sum | double precision | default 0
mag_v_sigma | real | default 0
mag_v_sum_square | double precision | default 0
mag_r_count | integer | default 0
mag_r | real | default 99
mag_r_sum | double precision | default 0
mag_r_sigma | real | default 0
mag_r_sum_square | double precision | default 0
mag_i_count | integer | default 0
mag_i | real | default 99
mag_i_sum | double precision | default 0
mag_i_sigma | real | default 0
mag_i_sum_square | double precision | default 0
Indexes: catalog_pkey primary key btree (star_id),
catalog_ra_decl_index btree (ra, "dec"),
catalog_star_id_index btree (star_id)

--
O_


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Creager <Robert_Creager(at)LogicalChaos(dot)org>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: db growing out of proportion
Date: 2003-05-31 16:13:12
Message-ID: 20748.1054397592@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

Robert Creager <Robert_Creager(at)LogicalChaos(dot)org> writes:
>> I'm interested to narrow down exactly what was the issue here.

> shared_buffers was 1024, now 8192
> max_fsm_relations was 1000, now 10000
> max_fsm_pages was 20000, now 100000
> wal_buffers was 8, now 16
> sort_mem was 1024, now 64000
> vacuum_mem was 1024, now 64000
> effective_cache_size was 1000, now 100000

> The query is:

> UPDATE obs_v
> SET mag = obs_v.imag + zp.zero_v + cg.color_v * (obs_v.imag - i.imag),
> use = true
> FROM color_group AS cg, zero_pair AS zp, obs_i AS i, files AS f
> WHERE obs_v.star_id = i.star_id
> AND obs_v.file_id = f.file_id
> AND cg.group_id = f.group_id
> AND f.group_id = $group_id
> AND zp.pair_id = f.pair_id

Hm. My best guess is that the increase in sort_mem allowed this query
to use a more efficient join plan. Perhaps the planner switched from
merge to hash join once it thought the hash table would fit in sort_mem;
or maybe the plan didn't change but the executor was able to keep
everything in memory instead of using temp files. The other changes you
mention seem good as general housekeeping, but I doubt they'd have much
direct effect on this query's speed. It'd be interesting to look at
EXPLAIN ANALYZE results for the same query at several different sort_mem
values.

regards, tom lane


From: Robert Creager <Robert_Creager(at)LogicalChaos(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Adjusting fsm values was Re: [BUGS] db growing out of proportion
Date: 2003-06-13 03:49:34
Message-ID: 20030612214934.67533e95.Robert_Creager@LogicalChaos.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance


Hey Tom,

Sorry for the long delay. I'd been having mail trouble, and your and
postgresql mail servers were bouncing me because Starband (my ISP)
doesn't setup a full DNS entry for their clients. I'm now relaying
through another host.

I'm posting to the performance list, as it seems more appropriate there.

The results were not as clear cut as I would of thought. If either
fsm_relations, fsm_pages or sort_mem were dropped to their original
values, the queries went from 3 hours to not completing 9/15 sets after
13 hours. When the shared buffers were reverted, the set completed in 12
hours.

I didn't capture any explains for the problem settings, but will be
happy to do so if you would like to see some of the results (if they are
different). I'm almost caught up with importing new data (too much rain
around here to take new data), and can explain away this weekend.

Cheers,
Rob

On Fri, 30 May 2003 22:50:02 -0600
Robert Creager <Robert_Creager(at)LogicalChaos(dot)org> said something like:

> On Sat, 31 May 2003 00:11:26 -0400
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> said something like:
> >
> > Cool ... but it's not immediately obvious which of these changes did
> > the trick for you. What settings were you at before? And what's
> > the details of the problem query?
> >
> > The first three settings you mention all seem like reasonable
> > choices, but I'd be hesitant to recommend 64M sort_mem for general
> > use (it won't take very many concurrent sorts to drive you into the
> > ground...). So I'm interested to narrow down exactly what was the
> > issue here.
> >
> > regards, tom lane
>
> shared_buffers was 1024, now 8192
> max_fsm_relations was 1000, now 10000
> max_fsm_pages was 20000, now 100000
> wal_buffers was 8, now 16
> sort_mem was 1024, now 64000
> vacuum_mem was 1024, now 64000
> effective_cache_size was 1000, now 100000
>
> I am in the process of reloading the dB, but obs_v and obs_i contain
> ~750000 records each. I'd be happy to play around with the settings
> if you would like to see the timing results. I'll also be able to get
> some explain analyze results tomorrow when finished reloading.
> Suggestions as to what values to change first?
>
> There is a 'C' language trigger on the obs_v and obs_i tables which
> essentially combines the data from the the obs_? tables and updates
> the catalog table when the obs_? records are updated.
>
> The query is:
>
> UPDATE obs_v
> SET mag = obs_v.imag + zp.zero_v + cg.color_v * (obs_v.imag - i.imag),
> use = true
> FROM color_group AS cg, zero_pair AS zp, obs_i AS i, files AS f
> WHERE obs_v.star_id = i.star_id
> AND obs_v.file_id = f.file_id
> AND cg.group_id = f.group_id
> AND f.group_id = $group_id
> AND zp.pair_id = f.pair_id
>
> which is called from a perl script (DBD::Pg - which sets $group_id),
> and the relevant tables are:
>
> Table "public.obs_v"
> Column | Type | Modifiers
> ---------+---------+------------------------------------------------
> x | real | not null
> y | real | not null
> imag | real | not null
> smag | real | not null
> ra | real | not null
> dec | real | not null
> obs_id | integer | not null default nextval('"obs_id_seq"'::text)
> file_id | integer |
> use | boolean | default false
> solve | boolean | default false
> star_id | integer |
> mag | real |
> Indexes: obs_v_file_id_index btree (file_id),
> obs_v_loc_index btree (ra, "dec"),
> obs_v_obs_id_index btree (obs_id),
> obs_v_star_id_index btree (star_id),
> obs_v_use_index btree (use)
> Foreign Key constraints: obs_v_files_constraint FOREIGN KEY (file_id)
> REFERENCES files(file_id) ON UPDATE NO ACTION ON DELETE CASCADE
> Triggers: obs_v_trig
>
> with obs_i being identical (inherited from same root table)
>
> Table "public.color_group"
> Column | Type | Modifiers
> ----------+---------+-----------
> group_id | integer |
> color_u | real | default 0
> color_b | real | default 0
> color_v | real | default 0
> color_r | real | default 0
> color_i | real | default 0
> Indexes: color_group_group_id_index btree (group_id)
> Foreign Key constraints: $1 FOREIGN KEY (group_id) REFERENCES
> groups(group_id) ON UPDATE NO ACTION ON DELETE CASCADE
>
> Table "public.zero_pair"
> Column | Type | Modifiers
> ---------+---------+-----------
> pair_id | integer | not null
> zero_u | real | default 0
> zero_b | real | default 0
> zero_v | real | default 0
> zero_r | real | default 0
> zero_i | real | default 0
> Indexes: zero_pair_pkey primary key btree (pair_id),
> zero_pair_pair_id_index btree (pair_id)
> Foreign Key constraints: $1 FOREIGN KEY (pair_id) REFERENCES
> pairs(pair_id) ON UPDATE NO ACTION ON DELETE CASCADE
>
> Table "public.files"
> Column | Type | Modifiers
>
> ----------+--------------------------+-------------------------------
> ------------------------
> file_id | integer | not null default
> nextval('"files_file_id_seq"'::text) group_id | integer
> |
> pair_id | integer |
> date | timestamp with time zone | not null
> name | character varying | not null
> ra_min | real | default 0
> ra_max | real | default 0
> dec_min | real | default 0
> dec_max | real | default 0
> Indexes: files_pkey primary key btree (file_id),
> files_name_key unique btree (name),
> files_id_index btree (file_id, group_id, pair_id),
> files_range_index btree (ra_min, ra_max, dec_min, dec_max),
> imported__file_id_idex btree (file_id)
> Foreign Key constraints: $1 FOREIGN KEY (group_id) REFERENCES
> groups(group_id) ON UPDATE NO ACTION ON DELETE CASCADE,
> $2 FOREIGN KEY (pair_id) REFERENCES
> pairs(pair_id) ON UPDATE NO ACTION ON DELETE
> CASCADE
>
> Table "public.catalog"
> Column | Type | Modifiers
>
> ------------------+------------------+-------------------------------
> ------------------
> star_id | integer | not null default
> nextval('"star_id_seq"'::text) loc_count | integer |
> default 0 ra | real | not null
> ra_sum | double precision | default 0
> ra_sigma | real | default 0
> ra_sum_square | double precision | default 0
> dec | real | not null
> dec_sum | double precision | default 0
> dec_sigma | real | default 0
> dec_sum_square | double precision | default 0
> mag_u_count | integer | default 0
> mag_u | real | default 99
> mag_u_sum | double precision | default 0
> mag_u_sigma | real | default 0
> mag_u_sum_square | double precision | default 0
> mag_b_count | integer | default 0
> mag_b | real | default 99
> mag_b_sum | double precision | default 0
> mag_b_sigma | real | default 0
> mag_b_sum_square | double precision | default 0
> mag_v_count | integer | default 0
> mag_v | real | default 99
> mag_v_sum | double precision | default 0
> mag_v_sigma | real | default 0
> mag_v_sum_square | double precision | default 0
> mag_r_count | integer | default 0
> mag_r | real | default 99
> mag_r_sum | double precision | default 0
> mag_r_sigma | real | default 0
> mag_r_sum_square | double precision | default 0
> mag_i_count | integer | default 0
> mag_i | real | default 99
> mag_i_sum | double precision | default 0
> mag_i_sigma | real | default 0
> mag_i_sum_square | double precision | default 0
> Indexes: catalog_pkey primary key btree (star_id),
> catalog_ra_decl_index btree (ra, "dec"),
> catalog_star_id_index btree (star_id)
>
>
>
> --
> O_
>

--
O_


From: Tomas Szepe <szepe(at)pinerecords(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: db growing out of proportion
Date: 2003-06-13 05:34:59
Message-ID: 20030613053459.GB23789@louise.pinerecords.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-performance

> [szepe(at)pinerecords(dot)com]
>
> > Peter Childs <blue(dot)dragon(at)blueyonder(dot)co(dot)uk> writes:
> > > On Fri, 30 May 2003, Tomas Szepe wrote:
> > >> Trouble is, as the rows in the tables get deleted/inserted/updated
> > >> (the frequency being a couple thousand rows per minute), the database
> > >> is growing out of proportion in size.
> >
> > > Would more regular vacuum help. I think a vaccum every hour may do
> > > the job.
> >
> > Also note that no amount of vacuuming will save you if the FSM is not
> > large enough to keep track of all the free space. The default FSM
> > settings, like all the other default settings in Postgres, are set up
> > for a small installation. You'd probably need to raise them by at least
> > a factor of 10 for this installation.
>
> Thanks, I'll try to tweak those settings and will let the list know how
> things went.

Well, raising max_fsm_pages to 500000 seems to have solved the problem
entirely. My thanks go to everyone who've offered their help.

--
Tomas Szepe <szepe(at)pinerecords(dot)com>