Re: CLUSTER and MVCC

From: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CLUSTER and MVCC
Date: 2007-03-09 12:25:29
Message-ID: 1173443129.9058.66.camel@coppola.muc.ecircle.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 2007-03-09 at 12:29, Heikki Linnakangas wrote:
> Csaba, you mentioned recently
> (http://archives.postgresql.org/pgsql-hackers/2007-03/msg00027.php) that
> you're actually using the MVCC-violation to clean up tables during a
> backup. Can you tell us a bit more about that? Would you be upset if we
> shut that backdoor?

My use case: a queue-like table (in fact a 'task' table) which is very
frequently inserted/updated/deleted. This table tends to be bloated in
the presence of any long running transaction... the only transactional
behavior we need from this table is to make sure that when we insert
something in this table in a transaction (possibly together with other
actions) and then commit/rollback, it commits/rolls back the insert.
CLUSTER's violation of MVCC does not affect this, as CLUSTER will not be
able to lock the table if another transaction inserted something in it
(the inserting transaction will have a lock on the table). Selections on
this table are not critical for us, it just doesn't matter which job
processor is getting which task and in what order... (actually it does
matter, but CLUSTER won't affect that either).

So what I do is execute CLUSTER once in 5 minutes on this table. This
works just fine, and keeps the table size small even if I have long
running transactions in progress. The DB backup is one of such
unavoidable long running transactions, and I use the table exclusion
switch to exclude this task table from the backup so it won't get locked
by it and let CLUSTER still do it's job (I had a rudimentary patch to do
this even before the feature was introduced to pg_dump). The table can
be dumped separately which is a brief operation, but I would have anyway
to clear it on a crash...

Now I could try and disable the CLUSTER cron job and see if i get
problems, as last it was disabled with postgres 7.4, maybe something
changed in between... but I can tell for sure that last time I enabled
it it really fixed our load on the DB server...

Wouldn't be possible to do it like Simon (IIRC) suggested, and add a
parameter to enable/disable the current behavior, and use the MVCC
behavior as default ?

Cheers,
Csaba.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hannu Krosing 2007-03-09 12:25:35 Re: CLUSTER and MVCC
Previous Message Andrew Dunstan 2007-03-09 12:22:59 Re: who gets paid for this